On Thu, 17 May 2007, Timo Sirainen wrote:
Hello,
OpenLDAP uses another strategy, which is more robust, i.e. it needs less fragile interaction between the servers.
OpenLDAP writes each transaction to a replication log file after it has been processed locally. The replication log is read frequently by another daemon (slurpd) and forwarded to the slaves. If forwarding to one particular slave fails, the transaction is placed into a host-specific rejection log file. OpenLDAP exploits the fact that any modification (update, add, delete) can be expressed in "command" syntax; hence the slave speaks the same protocol as the master.
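As a minimal sketch of this log-and-reject scheme (Python, all names hypothetical; this is not how slurpd is actually implemented, just the idea described above):

```python
from typing import Callable, Dict, List

# Hypothetical types: a transaction is one "command" line, and a sender
# returns True if the slave accepted the transaction.
Transaction = str
Sender = Callable[[Transaction], bool]


def forward_log(repl_log: List[Transaction],
                slaves: Dict[str, Sender]) -> Dict[str, List[Transaction]]:
    """Replay every logged transaction to every slave.

    Transactions a slave did not accept (or could not receive) are
    collected in a per-host rejection log so they can be retried later.
    """
    rejected: Dict[str, List[Transaction]] = {host: [] for host in slaves}
    for txn in repl_log:
        for host, send in slaves.items():
            try:
                ok = send(txn)
            except OSError:
                # Network trouble counts as a failed forward.
                ok = False
            if not ok:
                rejected[host].append(txn)
    return rejected
```

The master never waits for a slave here: the transaction is already committed locally before it appears in repl_log.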
The biggest advantage is that the transaction has already succeeded on the master and is merely replayed to the slaves. So when pushing the message to the slave, you need not fiddle with decreasing UIDs, for instance, because you only perform a partial sync of a mailbox that is in a known-good state. And the transaction is saved in the replay log file in case the master process or host crashes.
I think that if the replication log is replayed quickly, e.g. by "tailing" the file, you can effectively separate out the problems of unresponsive slaves and of re-replaying for slaves that come up later, and still have quasi-immediate updates of the slaves. Also, because one replay agent per slave can be used, all interaction with a slave is sequential. You wrote something about avoiding files; what about making the replication log a socket? The frontend then deals with the IMAP client, forwards the request to the replayer and is therefore not affected by possible network problems on the way to the slaves.
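The "one replay agent per slave" idea could look roughly like this (a Python sketch using an in-process queue in place of the log file or socket; ReplayAgent and its methods are made-up names):

```python
import queue
import threading
from typing import Callable


class ReplayAgent(threading.Thread):
    """One agent per slave: drains its own queue strictly in order."""

    def __init__(self, send: Callable[[str], None]) -> None:
        super().__init__(daemon=True)
        self._send = send            # hypothetical function that talks to the slave
        self._queue = queue.Queue()  # stands in for the replication log / socket

    def submit(self, txn: str) -> None:
        # Called by the frontend; returns immediately and never touches
        # the network, so a slow slave cannot stall the IMAP client.
        self._queue.put(txn)

    def close(self) -> None:
        # Sentinel: finish after all pending transactions were sent.
        self._queue.put(None)

    def run(self) -> None:
        while True:
            txn = self._queue.get()
            if txn is None:
                break
            self._send(txn)
```

Usage: create one agent per slave, start() it, and have the frontend call submit() on each agent after the local commit.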
You cannot share OpenLDAP's advantage of using the same protocol (here IMAP) towards the slaves, because of some restrictions. You want to have a 100% replica, as I understand it, hence the UIDs etc. need to be equal. So you will probably need to extend the IMAP protocol with:
"APPEND/STORE with UID".
The message will be spooled with the same UID on the slave. As you wrote, it SHOULD NOT happen that the slave fails, but if the operation is impossible due to some wicked out-of-sync state, the slave reports back and requests a full resync. The replay agent would then drop any pending changes for that specific host and mailbox and sync the whole mailbox with the slave, probably using something like rsync?
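A small sketch of this "apply with the given UID, otherwise ask for a full resync" behaviour (Python; append_with_uid stands for the proposed, not yet existing, IMAP extension, and the conflict test "UID below the next expected UID" is an assumption of this sketch):

```python
from typing import Dict, Iterable, Tuple


class NeedFullResync(Exception):
    """Raised by the slave when it cannot apply a replayed change."""


class SlaveMailbox:
    def __init__(self) -> None:
        self.messages: Dict[int, bytes] = {}
        self.next_uid = 1

    def append_with_uid(self, uid: int, data: bytes) -> None:
        # Hypothetical "APPEND with UID": the master dictates the UID.
        # UIDs must only grow, so an already used or lower UID means
        # the two mailboxes have diverged.
        if uid < self.next_uid:
            raise NeedFullResync(f"UID {uid} is below next UID {self.next_uid}")
        self.messages[uid] = data
        self.next_uid = uid + 1

    def full_resync(self, master_messages: Dict[int, bytes]) -> None:
        # Stand-in for copying the whole mailbox (e.g. rsync-like).
        self.messages = dict(master_messages)
        self.next_uid = max(self.messages, default=0) + 1


def replay(slave: SlaveMailbox,
           master_messages: Dict[int, bytes],
           pending: Iterable[Tuple[int, bytes]]) -> bool:
    """Replay pending appends; return True if a full resync was needed."""
    for uid, data in pending:
        try:
            slave.append_with_uid(uid, data)
        except NeedFullResync:
            # Drop the remaining changes for this mailbox and copy it whole.
            slave.full_resync(master_messages)
            return True
    return False
```

An admin-triggered resync would simply call full_resync() directly.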
BTW: It would be good if resyncs could be initiated on admin request, too ;-)
For the dial-up situation you mentioned (laptop with its own server), the replay agent would store any changes until the slave comes up, probably with the slave contacting the master Dovecot process and issuing something like SMTP's ETRN.
When the probability is low that the same mailbox is accessed on different hosts at the same time (for shared folders, multiple accesses are likely), this method should even work well in multi-master situations. You'll have to run replay agents on all the servers then.
To get the UIDs right when one mailbox is in use on different hosts, you thought about locks. But is this necessary?
If only the UIDs are the problem, then with a method to "mark" a UID as taken across all masters, all masters will have the same UID level, not necessarily with the message data already associated, meaning:
Master A is about to APPEND a new message to mailbox M, so it sends all other masters the info: "want to take UID U". If the UID is already taken on another master B, B replies "UID taken"; then the mailboxes are out of sync and need a full resync. If master B receives a request for a UID U for which it has itself sent an election request, masters A and B are ranked, e.g. by IP address, and master B replies either "you may take it" or "I want to take it". In the first case, master B marks UID U as taken and re-issues its request for another UID U2. If master B has no competing request of its own, it simply marks UID U as taken in mailbox M.
Once master A has received the "OK for UID U", it finally allocates the UID, accepts the message from the IMAP/SMTP client and places the message into the replay log file.
When a master B then gets a transaction "STORE message as UID U" for a UID that is marked as taken but has no message yet, it accepts the transaction.
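To make the election concrete, here is a rough Python sketch for the two-master case. All names are hypothetical, "lower rank wins" is an arbitrary choice for the ranking, and a real implementation would also have to handle more than two masters and lost replies:

```python
from enum import Enum
from typing import List, Optional, Set


class Reply(Enum):
    OK = "you may take it"
    MINE = "I want to take it"
    TAKEN = "UID taken"


class OutOfSync(Exception):
    """A peer already has the UID: the mailboxes need a full resync."""


class Master:
    def __init__(self, rank: int) -> None:
        self.rank = rank                    # e.g. derived from the IP address
        self.taken: Set[int] = set()        # UIDs marked as taken in mailbox M
        self.wanted: Optional[int] = None   # UID this master is electing for

    def on_request(self, uid: int, sender_rank: int) -> Reply:
        if uid in self.taken:
            return Reply.TAKEN
        if self.wanted == uid:
            if self.rank < sender_rank:
                return Reply.MINE           # we outrank the sender, keep our claim
            # We lose: mark U as taken and retry with another UID (U2).
            self.taken.add(uid)
            self.wanted = uid + 1
            return Reply.OK
        self.taken.add(uid)                 # no competing claim of our own
        return Reply.OK


def claim(requester: Master, peers: List[Master], uid: int) -> bool:
    """Ask all peers for UID `uid`; True means the requester may use it."""
    requester.wanted = uid
    replies = [p.on_request(uid, requester.rank) for p in peers]
    if Reply.TAKEN in replies:
        raise OutOfSync(f"UID {uid} already taken elsewhere")
    if all(r is Reply.OK for r in replies):
        requester.taken.add(uid)
        requester.wanted = None
        return True
    return False                            # a higher-ranked peer wants it
```

No lock is held at any point; the only shared state is the per-mailbox set of UIDs marked as taken.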
> doesn't make sure that messages themselves aren't lost. If the master's hard disk failed completely and it can't be resynced anymore, the messages saved there are lost. This could be avoided by making the saving wait until the message is saved to slave:
> - save mail to disk, and at the same time also send it to slave
> - allocate UID(s) and tell to slave what they were, wait for "mail saved" reply from slave before committing the messages to mailbox permanently
Well, this assumes that everything works perfectly. Preserving a hard disk should not be Dovecot's job but that of the underlying storage, IMHO (e.g. RAID, SAN).
If you want each transaction to wait until all slaves have given their OK, you'll have problems with the "slave server on laptop" scenario. Then you'll need to perform a full sync each time.
BTW: There is something like DLM (Distributed Lock Manager); I don't know if this is what you are looking for.
Bye,
Steffen Kaiser