[Dovecot] scalability and archive ideas
Hi,
I've been reading the past topics on archiving and scalability in Dovecot, and they are all very interesting. Here, I'm using two Dovecot proxies in front of five storage pairs, and we split each domain's accounts among those servers. That way we share the I/O load, and if one server goes down only a few of the domain's accounts stop working (not all of them).
But we began to have space problems, and the solution would be to add more and more storage servers. So I was searching for archive solutions (hard links at the OS level, or some Dovecot extension). A friend told me that he knows an ISP that splits even a single user's mailbox among many servers.
This is a very weird and (at the same time) very interesting approach. Instead of putting all messages into one maildir and that maildir on one server, this "maildir" (?) is split among many servers. So if one server fails the account is still accessible, and they move old/big messages to new "cheap" storage, archiving transparently.
Well, maybe it's a stupid idea, but couldn't the Dovecot IMAP/POP proxy do the same? I mean, imagine the following scenario:
1- The proxy does an SQL lookup for the user account. Today the result includes (among other data) the final server's IP, but it could instead return the list of storage servers for this account.
2- The proxy opens parallel connections (instead of today's single connection to the final server), retrieves the messages, and caches locally where each message is. If the cache is lost, no problem: it just connects again and re-caches them.
3- When a message is deleted, flagged, etc., the proxy has the cache, so it knows where to send the command. When a message is inserted (as in the Sent folder), the system can have a 'default storage' where messages are delivered and saved; the other storages are just for archiving (or even round-robin delivery!? Hard to control).
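To make the three steps above concrete, here is a minimal sketch, not anything Dovecot provides. `FanOutProxy` and `MemoryStorage` are hypothetical names, and the in-memory storage only stands in for a real connection to a storage server. It shows the parallel lookup, the rebuildable location cache, and the 'default storage' for new messages.

```python
from concurrent.futures import ThreadPoolExecutor


class MemoryStorage:
    """Hypothetical stand-in for one storage server."""

    def __init__(self, ids=()):
        self.ids = set(ids)

    def list_ids(self):
        return sorted(self.ids)

    def add(self, msg_id):
        self.ids.add(msg_id)

    def delete(self, msg_id):
        self.ids.remove(msg_id)


class FanOutProxy:
    """Hypothetical proxy-side view of one account spread over several storages."""

    def __init__(self, backends, default):
        self.backends = backends  # server name -> storage object (step 1's lookup result)
        self.default = default    # name of the 'default storage' for new messages
        self.location = {}        # cache: message id -> server name

    def refresh(self):
        """Step 2: ask every storage in parallel and rebuild the cache."""
        with ThreadPoolExecutor(max_workers=len(self.backends)) as pool:
            futures = {name: pool.submit(backend.list_ids)
                       for name, backend in self.backends.items()}
            fresh = {}
            for name, future in futures.items():
                for msg_id in future.result():
                    fresh[msg_id] = name
        self.location = fresh

    def delete(self, msg_id):
        """Step 3: route the command to the server that holds the message."""
        if msg_id not in self.location:
            self.refresh()  # cache lost or stale: just re-cache
        server = self.location.pop(msg_id)  # KeyError if no server has it
        self.backends[server].delete(msg_id)

    def append(self, msg_id):
        """Step 3: new messages always land on the default storage."""
        self.backends[self.default].add(msg_id)
        self.location[msg_id] = self.default
```

Note that `refresh` waits for every server before returning, so one slow storage delays the whole account; a real implementation would need per-server timeouts.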
Problems:
1- Keeping the cache at the proxy level.
2- Quota calculation (how can the delivery process check it?).
3- Mixing proxy connections with this new approach may not be easy.
4- The more servers, the more time spent waiting for their answers (imagine one of them answering slowly).
Advantages:
1- No local mounts at the storage level (such as NFS and other network-mounted partitions).
2- Server independence: if space reaches a critical value, add a pair of servers, move messages there, and add the servers to the SQL answer (server1, server2, server3, ... , serverN).
3- Scalability: the more servers, the more I/O load you can share.
Well, sorry for the long post... but I hope to contribute some ideas (even if they are stupid :( )
Regards, Fernando Bertasso Figaro
On Tue, 2009-08-25 at 11:00 -0300, fernando@dfcom.com.br wrote:
This is a very weird and (at the same time) very interesting approach. Instead of putting all messages into one maildir and that maildir on one server, this "maildir" (?) is split among many servers. So if one server fails the account is still accessible, and they move old/big messages to new "cheap" storage, archiving transparently.
Well, this is somewhat related to the filesystem abstraction that I'm planning. You'll just need to implement a filesystem that allows distributing a single user's mails to multiple servers. That's actually also what I was planning on doing by using some existing database for that (Cassandra?) And sure it would be possible to implement all of that on my own, but probably it's too much trouble..
Hi Timo,
Yes, it's related, but I don't understand '... You'll just need to implement a filesystem that allows distributing a single user's mails to multiple servers ...'.
My idea goes in the direction that we don't need to care about the filesystem and don't need any distributed filesystem... let it be whatever the user wants. At the proxy level, the end storage can be ext3 or ReiserFS, on Linux, FreeBSD, and so on. I think that the more elements we insert, the more complex and the harder to set up and debug the solution would be.
The administrator maintains storage pairs with any OS/filesystem he wants. His only work would be to create the accounts and folders on each storage server (if you create a folder, you create it on all three servers; the same goes for accounts) and to set up a database with the servers involved in the process. The account structure must be kept in sync, and messages will be stored wherever the users want.
I also like the idea of using some database to store the message index.
Fernando
On Tue, 2009-08-25 at 12:12 -0300, fernando@dfcom.com.br wrote:
Hi Timo,
Yes, it's related, but I don't understand '... You'll just need to implement a filesystem that allows distributing a single user's mails to multiple servers ...'.
My idea goes in the direction that we don't need to care about the filesystem and don't need any distributed filesystem... let it be whatever the user wants. At the proxy level, the end storage can be ext3 or ReiserFS, on Linux, FreeBSD, and so on. I think that the more elements we insert, the more complex and the harder to set up and debug the solution would be.
I mean Dovecot accesses filesystem through a simple abstraction layer. You wouldn't have to implement a real OS level filesystem, but you'd implement a "proxying Dovecot-filesystem backend" that sits on top of real filesystems. It's probably not much of a difference to what you're thinking about, except in my mind it's entirely isolated from actual Dovecot code. There's just a simple API that the backend needs to implement and it'll work with Dovecot.
BTW. Maybe http://www.xtreemfs.org/ or http://www.gluster.org/ already does what you're thinking about? I haven't looked at them closely enough..
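As an illustration of the "proxying Dovecot-filesystem backend" idea above, here is a rough sketch. It is not Dovecot's real interface: `MailFS`, `DictFS`, and `DistributingFS` are hypothetical names, and the hash-based placement is only one possible assumption about how a single user's mails could be spread over several servers.

```python
import hashlib
from abc import ABC, abstractmethod


class MailFS(ABC):
    """Hypothetical minimal storage API (NOT Dovecot's actual API)."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...


class DictFS(MailFS):
    """In-memory stand-in for a regular filesystem on one server."""

    def __init__(self):
        self.files = {}

    def read(self, path):
        return self.files[path]

    def write(self, path, data):
        self.files[path] = data

    def delete(self, path):
        del self.files[path]


class DistributingFS(MailFS):
    """Backend that sits on top of other backends and spreads paths over them."""

    def __init__(self, backends):
        self.backends = list(backends)

    def _pick(self, path):
        # Stable hash, so the same path always maps to the same backend.
        digest = hashlib.sha1(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.backends)
        return self.backends[index]

    def read(self, path):
        return self._pick(path).read(path)

    def write(self, path, data):
        self._pick(path).write(path, data)

    def delete(self, path):
        self._pick(path).delete(path)
```

The calling code only ever sees `read`, `write`, and `delete`, which is the isolation being described. One known weakness of this particular placement rule: adding a server changes the modulo and therefore where existing paths are looked up, so a real backend would need a lookup table or consistent hashing.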
I'm not sure I'm being quite clear about my solution (pardon my English)...
I'm not thinking of any filesystem abstraction layer; I'm afraid of this kind of solution. If some corruption happens we can lose data, and crash recovery is not easy.
I'm suggesting the opposite: not having an abstraction layer. The abstraction layer would be the proxy; Dovecot itself does the abstraction. When I say 'proxy' I mean something like Dovecot's proxy feature.
And in my head the final solution is very 'simple' (despite the hard programming work). Dovecot just accesses a set of servers 'in parallel', requesting their information (really, over IMAP connections), and caches the results (maybe in some database, as you suggested). It would be an extension of the proxy feature that already exists in Dovecot.
Fernando
On Tue, 2009-08-25 at 12:41 -0300, fernando@dfcom.com.br wrote:
I'm suggesting the opposite: not having an abstraction layer. The abstraction layer would be the proxy; Dovecot itself does the abstraction. When I say 'proxy' I mean something like Dovecot's proxy feature.
Dovecot needs to read/write data, so the part of Dovecot that does that is the filesystem abstraction layer. Or if you'd prefer it could be called I/O layer. On top of that you could build all kinds of I/O accesses (all running in Dovecot processes):
- regular filesystem
- database
- your proxy thingy
And in my head the final solution is very 'simple' (despite the hard programming work). Dovecot just accesses a set of servers 'in parallel', requesting their information (really, over IMAP connections), and caches the results (maybe in some database, as you suggested). It would be an extension of the proxy feature that already exists in Dovecot.
Well.. This is what I initially thought you might have meant, but.. You wanted to distribute one user's mails across multiple servers, right? So they would exist in multiple different servers and Dovecot would access them via IMAP and somehow gather them together and show to user? That would be horribly difficult to implement via IMAP.
I have thought about implementing a caching IMAP proxy though. But the main use of that would be if the primary server was somewhere with higher latency or lower bandwidth. And implementing even that is going to take a lot of work.
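One concrete reason that gathering a user's mails from several IMAP servers is hard (this is only one piece of the problem, and not necessarily the one meant above): each IMAP server assigns message UIDs independently per mailbox, so two servers can both have a UID 1 in INBOX. A merging proxy would have to hand out its own UIDs and remember the mapping. The sketch below uses a hypothetical `MergedMailbox` class to show that bookkeeping.

```python
class MergedMailbox:
    """Hypothetical UID map for one folder merged from several IMAP servers."""

    def __init__(self):
        self.next_uid = 1
        self.by_origin = {}  # (server, backend uid) -> merged uid
        self.by_uid = {}     # merged uid -> (server, backend uid)

    def sync(self, server, backend_uids):
        """Assign merged UIDs to messages not seen before on this server."""
        for uid in sorted(backend_uids):
            key = (server, uid)
            if key not in self.by_origin:
                self.by_origin[key] = self.next_uid
                self.by_uid[self.next_uid] = key
                self.next_uid += 1

    def resolve(self, merged_uid):
        """Find which server and backend UID a client-visible UID refers to."""
        return self.by_uid[merged_uid]
```

IMAP clients expect UIDs to stay the same across sessions unless the mailbox's UIDVALIDITY changes, so this map is not a throwaway cache: losing it would force every client to resynchronize the folder.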
Hi Timo,
Do you have any experience using Dovecot with GlusterFS? How mature is GlusterFS in a huge email environment?
Fernando
fernando@dfcom.com.br wrote:
I've been reading the past topics on archiving and scalability in Dovecot, and they are all very interesting. Here, I'm using two Dovecot proxies in front of five storage pairs, and we split each domain's accounts among those servers. That way we share the I/O load, and if one server goes down only a few of the domain's accounts stop working (not all of them).
But we began to have space problems, and the solution would be to add more and more storage servers. So I was searching for archive solutions (hard links at the OS level, or some Dovecot extension). A friend told me that he knows an ISP that splits even a single user's mailbox among many servers.
This is a very weird and (at the same time) very interesting approach. Instead of putting all messages into one maildir and that maildir on one server, this "maildir" (?) is split among many servers. So if one server fails the account is still accessible, and they move old/big messages to new "cheap" storage, archiving transparently.
Surely, other than the possibility of archiving a copy in a separate location at delivery time, everything else here is better done by a high-availability clustered SAN and *not* by an application?
Archival is a valuable thing to have. Being able to, on delivery, deposit a separate copy elsewhere (without necessarily indexing it etc.) allows for near-line back-up or storage where legal or corporate regulation require.
(I'm currently doing this using a cron job and a program I've written which checks to see if there are any new messages in everyone's inbox Maildirs and then hard-links them into a separate directory structure once a minute. Messages which disappear from the true inbox are then kept for a further 90 days. This allows users to recover messages that they may have accidentally deleted from their inbox.)
Oh, and with reference to the second paragraph... hard links only work on a single filesystem, not across multiple filesystems or servers.
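For readers who want to try the hard-link approach described above, here is a rough sketch of one pass of such a job. It is not the actual program mentioned; `sync_archive`, the `msgs` subdirectory, and the `gone.json` state file are hypothetical choices. As noted, the archive directory must live on the same filesystem as the Maildir, or `os.link` will fail.

```python
import json
import os
import time

RETENTION = 90 * 24 * 3600  # keep vanished messages "a further 90 days"


def sync_archive(maildir, archive, now=None):
    """One pass: hard-link new inbox messages, expire long-gone ones."""
    now = time.time() if now is None else now
    msgs = os.path.join(archive, "msgs")
    os.makedirs(msgs, exist_ok=True)

    # Hypothetical state file: message key -> time it was first seen missing.
    state_path = os.path.join(archive, "gone.json")
    gone = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            gone = json.load(fh)

    live = set()
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in os.listdir(folder):
            # Files in cur/ carry a ":2,<flags>" suffix; strip it so a
            # flag change does not look like a new message.
            key = name.split(":", 1)[0]
            live.add(key)
            target = os.path.join(msgs, key)
            if not os.path.exists(target):
                os.link(os.path.join(folder, name), target)

    for key in os.listdir(msgs):
        if key in live:
            gone.pop(key, None)  # message is (still or again) in the inbox
        elif now - gone.setdefault(key, now) >= RETENTION:
            os.unlink(os.path.join(msgs, key))
            del gone[key]

    with open(state_path, "w") as fh:
        json.dump(gone, fh)
```

Run from cron once a minute per user, this costs no extra disk space while the message is still in the inbox, because both names point at the same file.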
Steve
IT Systems Administrator, Department of Earth Sciences, University of Oxford, Parks Road, Oxford, UK.
E-Mail:- steve@earth.ox.ac.uk Tel:- +44 (0)1865 282110 Fax:- +44 (0)1865 272072
participants (3)
- fernando@dfcom.com.br
- Stephen Usher
- Timo Sirainen