On 03/23/12 22:25, Timo Sirainen wrote:
In case anyone is interested in reading (and maybe helping!) with a dsync redesign that's intended to fix all of its current problems, here are some possibly incoherent ramblings about it:
http://dovecot.org/tmp/dsync-redesign.txt
and even if you don't understand that, here's another document disguising as an algorithm class problem :) If anyone has thoughts on how to solve it, would be great:
http://dovecot.org/tmp/dsync-redesign-problem.txt
It only deals with saving new messages, not expunges/flag changes/etc, but those should be much simpler.
Well, dsync is a very useful tool, but with continuous replication it tries to solve a problem which should be handled, at least partially, elsewhere. Storing stuff in plain file systems and duplicating it to another one just doesn't scale.
I personally think that Dovecot could gain much more if the amount of work going into fixing or improving dsync went into making Dovecot able to use a highly scalable, distributed storage backend. I know it's much harder, because there are several major differences compared to the "low latency" and consistency-problem-free local file systems, but its fruits are also sweeter in the long term. :)
It would bring Dovecot into the class of open source mail servers where there are currently no contenders.
BTW, on the earlier question in this thread (are there any NoSQL DBs supporting application-level conflict resolution?): there are such solutions (like CouchDB), but having some experience with it, I wouldn't recommend it for massive mail storage, at least not the plain CouchDB product. I guess you would be better off designing a schema which doesn't need conflict resolution in the first place. For example, messages are immutable, so you won't face this issue in that area.
And for metadata, maybe the solution is not to store "digested" snapshots of the current metadata (folders, flags, message links for folders, etc.), but to store the changes happening to the user's mailbox and occasionally aggregate them into a last known good and consistent state.
There are also other interesting ideas, maybe a real single instance store (splitting MIME parts? storing attachments in plain binary form?). This always brings up the question of whether the mail server should modify the mails at all, which can be pretty bad for encrypted/signed stuff.
And of course there is always the problem of designing a good, consistent method which is also efficient.
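To make the "store the changes, aggregate occasionally" idea for metadata a bit more concrete, here is a rough Python sketch. It is only an illustration, not anything Dovecot does today: the Change record, the operation names and the aggregate() function are all made up for this example, and it assumes every change carries a timestamp (e.g. a logical clock) plus a replica name, so that all replicas sort the same set of changes into the same order.

```python
from dataclasses import dataclass


# Hypothetical change record; the field order doubles as the sort order,
# so (timestamp, replica, ...) gives a total order all replicas agree on.
@dataclass(frozen=True, order=True)
class Change:
    timestamp: int   # assumed to be a logical clock value
    replica: str     # name of the replica that logged the change
    op: str          # "add_flag", "remove_flag" or "expunge"
    uid: int         # message the change applies to
    flag: str = ""   # unused for "expunge"


def aggregate(snapshot, changes):
    """Fold a set of changes into a snapshot {uid: set_of_flags}.

    Returns a new snapshot; the input is not modified. Duplicate changes
    (e.g. the same log entry received from two replicas) are applied once.
    """
    state = {uid: set(flags) for uid, flags in snapshot.items()}
    for change in sorted(set(changes)):
        if change.op == "expunge":
            state.pop(change.uid, None)
        elif change.uid in state:  # ignore changes to expunged messages
            if change.op == "add_flag":
                state[change.uid].add(change.flag)
            elif change.op == "remove_flag":
                state[change.uid].discard(change.flag)
    return state


# Two replicas log changes independently, starting from the same snapshot.
snapshot = {1: set(), 2: {"seen"}}
log_a = [Change(10, "a", "add_flag", 1, "seen"), Change(12, "a", "expunge", 2)]
log_b = [Change(11, "b", "add_flag", 2, "flagged"),
         Change(13, "b", "add_flag", 1, "answered")]

# Merging is just taking the union of the logs; the order they arrive in
# does not matter because aggregate() sorts them itself.
merged = aggregate(snapshot, log_a + log_b)
```

The point of the sketch is that replicas never have to resolve conflicting snapshots: they only exchange log entries, and any replica holding the same set of entries computes the same state. Whether a timestamp-based order is good enough in practice is exactly the kind of design question mentioned above.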