On 14.3.2012, at 18.45, Michescu Andrei wrote:
Nope, dsync was not running during the email delivery on that account. I simulated it in a controlled environment.
How? You mean simply deliver mail to server A and to server B and run dsync and it duplicates it? I can't reproduce it that way, only if I run dsync during a flood of new mails.
YES. Simply deliver mail to server A and then to server B (to the same user_1). After that, run dsync and you get exactly what you saw in my previous email. That's why I included the ls output for both servers, so that you can see the email files too. Each server duplicates only its own email (so it brings in the email from the other server and duplicates its own).
Consider that for incoming SMTP I can even restrict which server is the master (forcing all the others to redeliver to this one). BUT for a distributed IMAP cluster there is no way to restrict users to making changes on only one server. That would defeat the model and the purpose of a distributed cluster...
For IMAP it's not much of a problem, because a user typically still uses only one client actively, so clients aren't uploading mails to multiple servers at the same time.
hehe... one would think so, but when you have road warriors who roam, you cannot ensure that the server they connect to for IMAP (the closest one based on geo-IP) is the same as the server you have picked for inbound SMTP. So you already have 2 servers that mess with the user's mailbox.
The second case where you cannot control this is mobile devices that flip in and out of Wi-Fi (my iPhone is in Canada when it is on 3G and in Europe when it is on Wi-Fi due to VPN tunneling, and this can change every couple of minutes... :( )
One idea might be to make the IDs depend on the server where the messages first appear, so that they keep their ID once they get replicated. Here there are many options:
The messages have GUIDs that always stay the same, but IMAP UIDs are required to be ascending from the client's point of view, and several clients rely on that, so when a UID conflict happens the only safe thing to do is to assign new UIDs to all of the conflicting mails.
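To make the UID conflict rule above concrete, here is a minimal sketch in Python. It is not Dovecot's actual dsync code; the `Mail` type, the `merge_mailboxes` function and the in-memory lists are all hypothetical, and it assumes that a GUID which already exists on both servers carries the same UID on both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mail:
    guid: str  # stays the same on every server (hypothetical model)
    uid: int   # IMAP UID as assigned locally by one server


def merge_mailboxes(server_a, server_b):
    """Return a dict mapping GUID -> UID after merging two replicas.

    Assumption: a GUID present on both servers has the same UID on both.
    """
    # Group GUIDs by the UID they currently claim.
    by_uid = {}
    for mail in list(server_a) + list(server_b):
        by_uid.setdefault(mail.uid, set()).add(mail.guid)

    # New UIDs must be higher than anything a client may have seen.
    next_uid = max(by_uid, default=0) + 1

    merged = {}
    for uid in sorted(by_uid):
        guids = by_uid[uid]
        if len(guids) == 1:
            # Only one message claims this UID: keep it.
            (guid,) = guids
            merged[guid] = uid
        else:
            # Several different messages claim the same UID:
            # every one of them gets a fresh, higher UID.
            for guid in sorted(guids):
                merged[guid] = next_uid
                next_uid += 1
    return merged


if __name__ == "__main__":
    a = [Mail("g1", 1), Mail("g2", 2)]  # g2 delivered to server A
    b = [Mail("g1", 1), Mail("g3", 2)]  # g3 delivered to server B
    print(merge_mailboxes(a, b))
```

In this sketch both conflicting mails lose UID 2 and move past the highest known UID, so a client never sees a UID that is lower than one it has already been given.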
Well, I don't know much about the IMAP standard (you guys are the experts here :) ). If the GUID stays the same, then it can be used to prevent the duplication error.
Also, since you can detect whether an email is new or not (whether a client has already seen it): if no one has seen it, then it is safe to assign any UID that fits. If it has been seen on only one server, then you can give it that UID on all servers and reassign all the unseen ones. So the only messed-up case is when the message has been seen on both servers with different UIDs :(
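The proposal above can be written down as a small decision function. This is only a sketch of the idea as stated, not something dsync does; `pick_uid`, its parameters and the returned labels are hypothetical, and "seen" here means "some client has already observed the message under that UID", which is assumed to be known per server.

```python
def pick_uid(uid_a, seen_a, uid_b, seen_b, next_free_uid):
    """Pick the UID for one message (same GUID) that exists on two servers.

    Returns (uid, label), where label names the case that applied.
    next_free_uid is assumed to be higher than every UID on both servers.
    """
    if uid_a == uid_b:
        # Both servers already agree: nothing to do.
        return uid_a, "unchanged"
    if seen_a and seen_b:
        # Clients saw the message under two different UIDs:
        # the messed-up case, so fall back to a fresh UID.
        return next_free_uid, "conflict"
    if seen_a:
        # Only server A's UID was ever shown to a client: keep it.
        return uid_a, "keep_a"
    if seen_b:
        # Only server B's UID was ever shown to a client: keep it.
        return uid_b, "keep_b"
    # Nobody has seen the message yet: any fitting UID will do.
    return next_free_uid, "unseen"


if __name__ == "__main__":
    print(pick_uid(5, True, 7, False, 10))
    print(pick_uid(5, True, 7, True, 10))
```

The sketch does not check whether the kept UID collides with another message on the other server; that check would still be needed before relying on the "keep" cases.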
Thank you very much for your time and patience. I know that our setup is pretty atypical. Keep in mind that the model with only 2 servers that I'm showing you is just for simplicity; the real deployment has multiple geographically dispersed servers connected by slow intercontinental internet links... :)) Otherwise we'd use a distributed file system and have a single unified storage :P
Best regards. Andrei