multi sync (>2 servers) + selective sync + trigger

b-dovecot.org at grmbl.net
Sun May 31 09:23:30 UTC 2015


Daniel,

On Sat, May 30, 2015 at 09:26:32AM -0300, Daniel van Ham Colchete wrote:
>    B,
>    I really like the idea of N-way replication. Pairs are ugly, they cost double! Even if you have 20 servers, when one goes down all that IO traffic goes to just one.
>    So, what I did here was a (kind of) DHT-based n-way replication, where the node for the second copy is independent of where the first copy is.
>    For that you will have to use the mail_replica inside your userdb. Here I'm using MySQL. The catch is that the mail_replica is not always the same for the same user. The first server needs a mail_replica pointing to the second while the second needs a mail_replica pointing to the first. In order to do that I have a table with two fields: hostA and hostB. Each dovecot will have to figure out which one to use. So, my dovecot-sql.conf.ext looks like:
>    user_query = select email, uid, gid, home, concat('tcp:', IF(hostA = '10.0.3.11', hostB, hostA), ':12345') as mail_replica from users where email='%Lu' and (hostA='10.0.3.11' or hostB='10.0.3.11')
>    password_query = SELECT email as user, password FROM users WHERE email = '%Lu'
>    iterate_query = SELECT email AS user FROM users WHERE hostA = '10.0.3.11' or hostB='10.0.3.11'
>    On each host you have to put the right IP in its dovecot-sql.conf.ext.
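
To spell out what the IF(...) in the user_query above does on each host: it returns whichever of hostA/hostB is not the local server. A minimal Python sketch of the same selection, for illustration only. The function name replica_for is made up here; the tcp: prefix and port 12345 are taken from the quoted query.

```python
from typing import Optional


def replica_for(local_ip: str, host_a: str, host_b: str,
                port: int = 12345) -> Optional[str]:
    """Return the mail_replica value this server would see for one user row.

    Mirrors the quoted SQL: the row only matches when local_ip is hostA or
    hostB, and the replica is whichever of the two is not the local server.
    """
    if local_ip not in (host_a, host_b):
        return None  # the WHERE clause filters this user out on this host
    partner = host_b if host_a == local_ip else host_a
    return f"tcp:{partner}:{port}"


# The two servers holding a user point at each other.
print(replica_for("10.0.3.11", "10.0.3.11", "10.0.3.12"))  # tcp:10.0.3.12:12345
print(replica_for("10.0.3.12", "10.0.3.11", "10.0.3.12"))  # tcp:10.0.3.11:12345
```
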
>    It works perfectly! And it is nice too! Imagine you have a 10 node cluster and the first 2 nodes fail. Instead of having 10% of your users down, only 2.2% (2 x 1/10 x 1/9) of  your users will be offline! Also if each node can handle 10k users and if you want to have N+1 redundancy you can have 90k users in the same servers, instead of 50k if you had pairs. So, availability is UP and costs are DOWN! Is there really a choice here? Hehe.
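
The 2.2% and 90k/50k figures above can be checked with a few lines of Python. This is only a sketch and assumes users are spread evenly over all ordered (hostA, hostB) combinations with hostA != hostB, and that a failed node's load is shared evenly by the survivors; the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def offline_fraction(nodes: int, failed: set) -> Fraction:
    """Fraction of users whose two copies both live on failed nodes."""
    pairs = list(permutations(range(nodes), 2))  # ordered (hostA, hostB)
    dead = [p for p in pairs if p[0] in failed and p[1] in failed]
    return Fraction(len(dead), len(pairs))


def max_users_n_plus_1(nodes: int, per_node_capacity: int, spread: bool) -> int:
    """Largest user count that still fits after losing one node.

    spread=True:  the failed node's users are shared by all survivors.
    spread=False: fixed pairs, so one partner has to absorb everything
                  and each node can only run at half capacity.
    """
    if spread:
        return (nodes - 1) * per_node_capacity
    return nodes * per_node_capacity // 2


print(offline_fraction(10, {0, 1}))           # 1/45, i.e. about 2.2%
print(max_users_n_plus_1(10, 10_000, True))   # 90000
print(max_users_n_plus_1(10, 10_000, False))  # 50000
```
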
>    Caveats:
>    - Dovecot Proxy doesn't understand the concept of having 2 (or more) active places for the same account. It seems like you would need an outside monitor that checks the network every x seconds and switches the proxy field in the database to hostA or hostB when a node goes down. I think this is madness, so I had to write my own POP3/IMAP/LMTP proxy (it's not that hard) that tries the primary first and, if that socket doesn't connect in 1 second, goes straight to the second without the user noticing it. It is really bad that Dovecot Proxy can't talk replication.
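
The "try the primary, fall back after 1 second" behaviour described above could look roughly like the following in Python. This is a sketch of the connect step only, not Daniel's proxy: the function name connect_with_fallback is made up, and a real proxy would still have to relay the POP3/IMAP/LMTP session afterwards.

```python
import socket
from typing import Iterable, Tuple


def connect_with_fallback(backends: Iterable[Tuple[str, int]],
                          timeout: float = 1.0) -> socket.socket:
    """Try each (host, port) in order and return the first socket that connects.

    A backend that refuses the connection or does not answer within
    `timeout` seconds is skipped and the next one is tried.
    """
    last_error = None
    for host, port in backends:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:  # includes refused connections and timeouts
            last_error = exc
    raise ConnectionError(f"no backend reachable: {last_error}")
```

The returned socket keeps the 1 second timeout, so a caller that relays long-lived IMAP sessions would probably want to reset it with settimeout(None) once connected.
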


This is very similar to how I would work around it: RR-DNS or service discovery for the other nodes in mail_replica, so that each node randomly syncs data to another node.
The big problem with this is that propagation is slow.
Hence my question whether any mail triggers a sync of all mailboxes with changes, or just of the mailbox being delivered to, which would speed that up.

For POP3/IMAP, what do you use? Perdition?
And what for LMTP?

>    - Your availability goes really UP (lots of nines) when you add a third copy while still having lower costs than with pairs, but you can't use Dovecot's notificator/aggregator/replication plugins for that to make everything run smoothly and I don't think that an outside/cron-based dsync is an option.

I guess for full syncs it would be, but that removes the "real-time" aspect of things.

>    - If you really use DHT-like algorithms, rebalancing your cluster will require minimum data movement as you add or remove nodes. It will require a small change to the SQL queries above so that the new node also recognizes the new user, but it is easy to do.
>    Let me know what you think!
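
To illustrate the minimum-data-movement point in the quoted text: Daniel does not say which algorithm he uses, so the sketch below swaps in rendezvous (highest-random-weight) hashing as one possible "DHT-like" placement. The function name pick_hosts and the example addresses are assumptions for illustration.

```python
import hashlib
from typing import List, Sequence


def pick_hosts(user: str, nodes: Sequence[str], copies: int = 2) -> List[str]:
    """Return the `copies` nodes with the highest hash score for this user.

    Each (node, user) combination gets an independent pseudo-random score,
    so a user's placement only changes when one of their own nodes is
    removed or a newly added node happens to outrank one of them.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{user}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:copies]


nodes = [f"10.0.3.{i}" for i in range(11, 21)]  # hypothetical 10-node cluster
host_a, host_b = pick_hosts("alice@example.com", nodes)
```

The resulting host_a/host_b would be what goes into the hostA and hostB columns. When a node is removed, only users who had a copy on that node get a new partner; everyone else keeps the same pair.
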

I wonder why Timo didn't extend mail_replica to accept a list of servers rather than just one.
That would sort out a lot of this already.
mail_replica = 'server1,server2, .. serverN'

Am I missing something?

Cheers!
B

