multi sync (>2 servers) + selective sync + trigger
Hi all,
I've been researching ways to replicate mail across multiple mailstores and have a few questions.
Syncing 2 mailstores (M1 & M2) via dsync works fine. I want to add a 3rd and 4th server (M3 & M4) to the sync as well.
Multi-sync (>2 servers):
- How do I tell M1 and/or M2 to also sync to M3 and/or M4?
Selective sync:
- How do I sync specific domains to specific servers and not to others?
Trigger:
- Does a normal sync check only the mailbox being changed, or does a mailbox change trigger a check of ALL indexes and a sync of every changed mailbox?
- Can I use an email to trigger a sync for any other mailboxes which might be out of sync?
TIA
B
B,
I really like the idea of N-way replication. Pairs are ugly, they cost double! Even if you have 20 servers, when one goes down all of its IO traffic goes to just one other server.
So, what I did here was a (kind of) DHT-based n-way replication, where the node for the second copy is independent of where the first copy is.
For that you will have to set mail_replica inside your userdb. Here I'm using MySQL. The catch is that mail_replica is not always the same for the same user: the first server needs a mail_replica pointing to the second, while the second needs a mail_replica pointing to the first. To do that I have a table with two fields, hostA and hostB, and each Dovecot has to figure out which one to use. So my dovecot-sql.conf.ext looks like:
user_query = select email, uid, gid, home, concat('tcp:', IF(hostA = '10.0.3.11', hostB, hostA), ':12345') as mail_replica from users where email='%Lu' and (hostA='10.0.3.11' or hostB='10.0.3.11')
password_query = SELECT email as user, password FROM users WHERE email = '%Lu'
iterate_query = SELECT email AS user FROM users WHERE hostA = '10.0.3.11' or hostB='10.0.3.11'
On each host you have to put that host's own IP in its dovecot-sql.conf.ext.
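For illustration, the host selection that the IF(...) in user_query performs can be sketched in Python. This is only a sketch: the function name mail_replica_for and the second address 10.0.3.12 are made up here, and the real logic lives in the SQL query, not in any Dovecot API.

```python
def mail_replica_for(local_ip: str, host_a: str, host_b: str, port: int = 12345) -> str:
    """Mirror concat('tcp:', IF(hostA = local_ip, hostB, hostA), ':12345')."""
    if local_ip not in (host_a, host_b):
        # The WHERE clause of user_query would not return such a row at all.
        raise LookupError(f"user is not hosted on {local_ip}")
    # Pick whichever of the two hosts is NOT the local one.
    other = host_b if host_a == local_ip else host_a
    return f"tcp:{other}:{port}"


# Hypothetical (hostA, hostB) row for a single user.
row = ("10.0.3.11", "10.0.3.12")
print(mail_replica_for("10.0.3.11", *row))  # what the first host would use
print(mail_replica_for("10.0.3.12", *row))  # what the second host would use
```

Each host ends up pointing at the other one, which is why the local IP has to differ per host.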
It works perfectly! And it is nice too! Imagine you have a 10 node cluster and the first 2 nodes fail. Instead of having 10% of your users down, only 2.2% (2 x 1/10 x 1/9) of your users will be offline! Also if each node can handle 10k users and if you want to have N+1 redundancy you can have 90k users in the same servers, instead of 50k if you had pairs. So, availability is UP and costs are DOWN! Is there really a choice here? Hehe.
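These figures can be checked with a short Python sketch. It assumes users are spread evenly over all ordered (first copy, second copy) node pairs, that N+1 means keeping one node's worth of capacity spare, and that with pairs each pair serves one node's worth of users. Those assumptions are one reading of the numbers above, not something spelled out in full.

```python
from fractions import Fraction
from itertools import permutations

NODES = 10
USERS_PER_NODE = 10_000

# Every possible (first copy, second copy) placement on distinct nodes.
placements = list(permutations(range(NODES), 2))

# A user is offline only if BOTH copies sit on the two failed nodes.
failed = {0, 1}
offline = sum(1 for a, b in placements if a in failed and b in failed)
fraction_offline = Fraction(offline, len(placements))  # 2 x 1/10 x 1/9

# Capacity with one spare node's worth of headroom vs. with mirrored pairs.
capacity_n_plus_1 = (NODES - 1) * USERS_PER_NODE
capacity_pairs = (NODES // 2) * USERS_PER_NODE

print(f"offline fraction: {float(fraction_offline):.1%}")
print(f"N+1 capacity: {capacity_n_plus_1}, paired capacity: {capacity_pairs}")
```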
Caveats:
- Dovecot Proxy doesn't understand the concept of having 2 (or more) active places for the same account. It seems you would need an outside monitor that checks the network every x seconds and, when a node goes down, changes the proxy field in the database to hostA or hostB. I think this is madness, so I had to write my own POP3/IMAP/LMTP proxy (it's not that hard) that tries the primary first and, if that socket doesn't connect within 1 second, goes straight to the second without the user noticing. It is really bad that Dovecot Proxy can't talk replication.
- Your availability goes really UP (lots of nines) when you add a third copy, while costs stay lower than with pairs, but you can't use Dovecot's notificator/aggregator/replication plugins for that to make everything run smoothly, and I don't think an outside/cron-based dsync is an option.
- If you really use DHT-like algorithms, rebalancing your cluster will require minimum data movement as you add or remove nodes. It will require a small change to the SQL queries above so that the new node also recognizes the new user, but it is easy to do.
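To make the "minimum data movement" point concrete, here is a small Python sketch of one DHT-like placement scheme, rendezvous (highest-random-weight) hashing. The message above does not say which algorithm is actually used, so the choice of scheme, the node addresses, and the example.com users are all assumptions for illustration.

```python
import hashlib


def score(user: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (user, node) combination."""
    digest = hashlib.sha256(f"{user}|{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(user: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Return the `copies` nodes with the highest weight for this user."""
    return sorted(nodes, key=lambda n: score(user, n), reverse=True)[:copies]


nodes = [f"10.0.3.{i}" for i in range(11, 21)]         # 10 hypothetical nodes
users = [f"user{i}@example.com" for i in range(1000)]  # hypothetical users

before = {u: place(u, nodes) for u in users}
after = {u: place(u, nodes + ["10.0.3.21"]) for u in users}  # add an 11th node

moved = sum(1 for u in users if before[u] != after[u])
print(f"{moved} of {len(users)} users changed placement")
```

With this scheme a user's placement changes only when the new node wins one of that user's two slots, so the hostA/hostB values of every other user stay as they were.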
Let me know what you think!
Best, Daniel Colchete
On Fri, May 29, 2015 at 4:25 PM, b-dovecot.org@grmbl.net wrote:
[...]
Daniel,
On Sat, May 30, 2015 at 09:26:32AM -0300, Daniel van Ham Colchete wrote:
[...]
Caveats:
- Dovecot Proxy doesn't understand the concept of having 2 (or more) active places for the same account. It seems you would need an outside monitor that checks the network every x seconds and, when a node goes down, changes the proxy field in the database to hostA or hostB. I think this is madness, so I had to write my own POP3/IMAP/LMTP proxy (it's not that hard) that tries the primary first and, if that socket doesn't connect within 1 second, goes straight to the second without the user noticing. It is really bad that Dovecot Proxy can't talk replication.
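The try-the-primary-then-the-second behaviour described in this caveat could be sketched roughly as follows in Python. This is not Daniel's proxy: the function name connect_with_failover and the addresses in the usage comment are hypothetical, and a real proxy would still have to relay the POP3/IMAP/LMTP traffic once connected.

```python
import socket


def connect_with_failover(backends, timeout=1.0):
    """Try each (host, port) in order and return the first socket that connects."""
    last_error = None
    for host, port in backends:
        try:
            # Gives up on this backend if the connect takes longer than `timeout`.
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:  # refused, unreachable, or timed out
            last_error = exc
    raise ConnectionError(f"no backend reachable: {last_error}")


# Hypothetical usage: primary first, then the replica, both on the IMAP port.
# conn = connect_with_failover([("10.0.3.11", 143), ("10.0.3.12", 143)])
# conn.settimeout(None)  # the 1 second limit was only meant for connecting
```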
This is very similar to how I would work around it: RR-DNS or service discovery for the other nodes in mail_replica, which would randomly sync data to another node. The big problem with this is that propagation is slow. Hence my question: does any incoming mail trigger a sync of all mailboxes with changes, or just of the mailbox being delivered to? The former would speed that up.
For POP3/IMAP, what do you use? perdition? And what for LMTP?
- Your availability goes really UP (lots of nines) when you add a third copy, while costs stay lower than with pairs, but you can't use Dovecot's notificator/aggregator/replication plugins for that to make everything run smoothly, and I don't think an outside/cron-based dsync is an option.
I guess it would be for full syncs, but that removes the "real-time" aspect of things.
- If you really use DHT-like algorithms, rebalancing your cluster will require minimum data movement as you add or remove nodes. It will require a small change to the SQL queries above so that the new node also recognizes the new user, but it is easy to do. Let me know what you think!
I wonder why Timo didn't expand mail_replica to be a list of servers rather than accepting just one. That would sort out a lot of this already: mail_replica = 'server1,server2, .. serverN'
Am I missing something?
Cheers! B
On 31/05/2015 7:23 PM, b-dovecot.org@grmbl.net wrote:
I wonder why Timo didn't expand mail_replica to be a list of servers rather than accepting just one. That would sort out a lot of this already: mail_replica = 'server1,server2, .. serverN'
Am I missing something?
Cheers! B
I thought space-separated values would work, so I have had this running for a while. But after a closer look just now I've realised that space-separated values are accepted without error but -don't- actually work.
In other words, the latest definition of mail_replica silently overwrites the previous ones. This is probably desirable, as it allows per-user settings to override global ones, but it means that yes, there appears to be no way to configure multiple mail_replica values and have them work...
Reuben
participants (3)
- b-dovecot.org@grmbl.net
- Daniel van Ham Colchete
- Reuben Farrelly