[Dovecot] scalability and archive ideas

fernando at dfcom.com.br fernando at dfcom.com.br
Tue Aug 25 17:00:00 EEST 2009


Hi,

I'm reading the past threads about archiving and scalability in Dovecot;
they are all very interesting. Here, I'm using two Dovecot proxies in
front of five pairs of storage servers, and we split each domain's accounts
among those servers. That way we share the I/O load, and if one server goes
down only a few of the domain's accounts stop working (not all of them).

But we began to have space problems, and the solution would be to add
more and more storage servers. So I was searching for archive
solutions (hard links at the OS level, or some Dovecot extension). A friend
told me that he knows an ISP that spreads even a single user's mailbox
across many servers.

This is a very weird and (at the same time) very interesting approach. Instead
of putting all messages into one maildir and that maildir on one server,
this "maildir" (?) is split among many servers. So if one server
fails the account is still accessible, and they move old/big messages to a
new "cheap" storage, archiving transparently.

Well, maybe it's a stupid idea, but couldn't the Dovecot IMAP/POP proxy do
the same? I mean, imagine the following scenario:

1- The proxy does a SQL lookup for the user account. Today the result is
(among other data) the final server IP, but it could be the list of storage
servers for this account.
2- The proxy opens parallel connections (instead of today's single connection
to the final server), retrieves the messages, and caches locally where each
message is. If the cache is lost, no problem: it just connects again and
rebuilds it.
3- When a message is deleted, flagged, etc., the proxy has the cache (so it
knows where to send the command). When a message is inserted (as in the Sent
folder), the system can have a 'default storage' where messages are
delivered and saved; the other storages are just for archiving (or
even round-robin delivery!? hard to control).
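
A minimal sketch of the three steps above, just to make the idea concrete. It
is not Dovecot code: the storage servers are simulated with in-memory dicts,
and the names STORAGES, ACCOUNT_STORAGES, build_location_cache and
delete_message are all made up for illustration. A real proxy would talk
IMAP/POP3 to each backend instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the storage servers.
STORAGES = {
    "storage1": {"msg-1": "new mail", "msg-2": "another new mail"},
    "storage2": {"msg-3": "old archived mail"},
}

# Hypothetical result of the per-account SQL lookup (step 1): a list of
# storage servers instead of a single final server IP.
ACCOUNT_STORAGES = {"user@example.com": ["storage1", "storage2"]}


def list_messages(server):
    """Return the server name and the message ids it holds."""
    return server, sorted(STORAGES[server])


def build_location_cache(account):
    """Step 2: query all storages in parallel and record where each message is."""
    servers = ACCOUNT_STORAGES[account]
    cache = {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        for server, msg_ids in pool.map(list_messages, servers):
            for msg_id in msg_ids:
                cache[msg_id] = server
    return cache


def delete_message(cache, msg_id):
    """Step 3: use the cache to send the command to the right storage."""
    server = cache.pop(msg_id)
    del STORAGES[server][msg_id]
    return server


cache = build_location_cache("user@example.com")
print(cache)
print(delete_message(cache, "msg-3"))
```

If the cache is lost, calling build_location_cache() again rebuilds it from
the storages, which is why losing it should not be a problem.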

Problems:

1- Keeping the cache at the proxy level.
2- Quota calculation (how can the delivery process check this?)
3- Mixing today's proxy connections with this new approach may not be easy.
4- The more servers, the more time spent waiting for their answers (imagine
one of them answering slowly).
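
Problem 4 can be illustrated with a small simulation. This is only a sketch
under my own assumptions: the delays in DELAYS are invented, query() just
sleeps instead of contacting a server, and the 0.2 second deadline is
arbitrary. It shows that a parallel fan-out is only as fast as its slowest
backend unless the proxy stops waiting at some deadline.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical per-server response times in seconds; storage3 is the slow one.
DELAYS = {"storage1": 0.01, "storage2": 0.02, "storage3": 0.5}


def query(server):
    """Simulate one backend request by sleeping for the server's delay."""
    time.sleep(DELAYS[server])
    return server


def fan_out(servers, deadline):
    """Query all servers in parallel; report which answered before the deadline."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    futures = {pool.submit(query, s): s for s in servers}
    done, not_done = wait(futures, timeout=deadline)
    # Do not block on stragglers; their threads finish in the background.
    pool.shutdown(wait=False)
    answered = sorted(f.result() for f in done)
    late = sorted(futures[f] for f in not_done)
    return answered, late


answered, late = fan_out(list(DELAYS), deadline=0.2)
print("answered:", answered)
print("late:", late)
```

What the proxy should do about the late servers (retry, show a partial
mailbox, fail the command) is an open question that this sketch does not
answer.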

Advantages:

1- No network mounts at the storage level (such as NFS and other
network-mounted partitions).
2- Server independence: if space reaches a critical level, add a pair
of servers, move messages there, and add the servers to the SQL answer
(server1, server2, server3, ... , serverN).
3- Scalability: the more servers, the more I/O load you can share.
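
The procedure in advantage 2 could look roughly like the sketch below. Again,
everything here is hypothetical: the storages are plain dicts mapping message
ids to their received dates, server_list stands in for the SQL answer, and
the one-year age limit is just an example of an "old message" rule.

```python
from datetime import date, timedelta


def archive_old(storages, server_list, default, archive, today, max_age_days):
    """Move messages older than max_age_days from default to archive."""
    # Register the new server in the (simulated) SQL answer.
    if archive not in server_list:
        server_list.append(archive)
    storages.setdefault(archive, {})
    cutoff = today - timedelta(days=max_age_days)
    moved = [m for m, received in storages[default].items() if received < cutoff]
    for m in moved:
        storages[archive][m] = storages[default].pop(m)
    return sorted(moved)


storages = {"server1": {"msg-1": date(2009, 8, 20), "msg-2": date(2008, 1, 5)}}
server_list = ["server1"]

moved = archive_old(storages, server_list, "server1", "server2",
                    today=date(2009, 8, 25), max_age_days=365)
print(moved)
print(server_list)
```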

Well, sorry for the long post... but I hope to contribute some ideas
(even if they are stupid :( )

Regards,
Fernando Bertasso Figaro



