[Dovecot] Replication protocol design

Ed W lists at wildgooses.com
Thu May 1 16:26:50 EEST 2008


>
> Dovecot stores flag changes as "added flags" and "removed flags" in 
> transaction file, so it doesn't need to do any comparing to figure out 
> what had changed. This makes the flag changes also more reliable. For 
> example if a message originally had flags (\Flagged) and then two 
> servers changed them:
>
> S1: STORE 1 +FLAGS \Answered
> S2: STORE 1 +FLAGS \Seen
> S2: STORE 1 -FLAGS \Flagged
>
> If replication protocol sent the changes as +flags -flags, it would be 
> unambiguous what the final flags are: (\Answered \Seen).
>
> If replication protocol instead sent the flags as their currently 
> known flag states (as IMAP protocol does):
>
> S1: * 1 FLAGS (\Answered \Flagged)
> S2: * 1 FLAGS (\Seen)
>
> There aren't any good ways to figure out what the wanted final flags 
> are supposed to be.
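
To make the quoted example concrete, here is a minimal sketch in Python. It assumes flags can be modelled as plain sets and each change as an (added, removed) pair; the function and variable names are made up for illustration and are not Dovecot's actual transaction log format.

```python
# Hypothetical model: a flag change is an (added, removed) pair of sets.
def apply_deltas(flags, deltas):
    """Apply (added, removed) flag changes in order and return the result."""
    result = set(flags)
    for added, removed in deltas:
        result |= set(added)
        result -= set(removed)
    return result

original = {"\\Flagged"}

# S1: STORE 1 +FLAGS \Answered
s1_deltas = [({"\\Answered"}, set())]
# S2: STORE 1 +FLAGS \Seen, then STORE 1 -FLAGS \Flagged
s2_deltas = [({"\\Seen"}, set()), (set(), {"\\Flagged"})]

# Delta replication: both servers' changes combine into one result.
merged = apply_deltas(original, s1_deltas + s2_deltas)

# State replication: each server only reports what it currently sees,
# and the two snapshots disagree with no record of who changed what.
s1_state = apply_deltas(original, s1_deltas)
s2_state = apply_deltas(original, s2_deltas)
```

In this example the deltas touch different flags, so applying S2's changes before S1's gives the same merged set. The two snapshots, on the other hand, cannot be combined without guessing.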

Sounds like a good candidate for a slightly customised IMAP command to 
get that info?

>> I do like the idea of making this more generic and hence hackable 
>> than writing all the code into dovecot itself.  Perhaps we could 
>> start with an external proxy app at each end of the link which is 
>> external to the imap server, ie basically start with IMAP sync.
>
> That would work for the mailbox synchronization part, but I'm more 
> interested in the incremental synchronization part which replicates 
> all changes in all mailboxes immediately. That's not really possible 
> to base on an external proxy. Mostly because the IMAP protocol 
> supports seeing changes only in a single mailbox at a time, and trying 
> to change that would most likely make the protocol different enough 
> from IMAP that there's not much point in using IMAP as a base anymore.


I'm not sure.  Consider a design where we have two ways to sync servers.

1) Live instant replication.  Done by setting a given folder to be 
monitored for live changes.  All changes made to that folder cause a 
transaction log to be generated (actually probably two logs, one listing 
the operations and another possibly listing the data relating to the 
affected messages).  These log files could be a simple incremental bz2 
file with occasional flush points so that they can be truncated up to a 
flush point easily.  At any point it would be possible to simply take 
that file and use the transport mechanism of choice (usb stick, cd, 
internet, etc) to replay that log back on the other server.

2) We can guarantee that any such transactional sync will go wrong for 
lots of reasons, not least on-disk changes outside of the control of the 
server, eg backup/restore, corruption, etc.  Therefore there is a need 
for an online style sync where we simply compare the list of files in 
both folders and resolve the changes to bring both into sync (IMAPSync 
style).
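
A rough sketch of both modes, in Python, might look like the following. It assumes a "flush point" is simply the end of one complete bz2 stream, so the log is a concatenation of streams and can be cut at any stream boundary. The record layout (one JSON object per line) and all function names are invented for illustration; nothing here is an existing Dovecot format.

```python
import bz2
import json

def append_segment(log, ops):
    """Append ops as one complete bz2 stream; return the new flush point.

    `log` is a bytearray standing in for the log file. The returned
    offset is a position at which the log may safely be truncated.
    """
    payload = "".join(json.dumps(op) + "\n" for op in ops).encode("utf-8")
    log += bz2.compress(payload)
    return len(log)

def replay(log):
    """Decode every operation in a log made of concatenated bz2 streams."""
    # bz2.decompress handles multiple concatenated streams (Python 3.3+).
    data = bz2.decompress(log)
    return [json.loads(line) for line in data.decode("utf-8").splitlines()]

def reconcile(local_ids, remote_ids):
    """Fallback full comparison: return (ids to fetch, ids to send)."""
    return remote_ids - local_ids, local_ids - remote_ids

log = bytearray()
first = [{"op": "store", "uid": 1, "add": ["\\Seen"]}]
second = [{"op": "expunge", "uid": 2}]
flush1 = append_segment(log, first)
flush2 = append_segment(log, second)

everything = replay(log)            # both segments, in order
after_trunc = replay(log[flush1:])  # log truncated up to the first flush point
to_fetch, to_send = reconcile({1, 2}, {2, 3})
```

Note that the plain comparison in reconcile() cannot tell a deleted message from one that was never copied, which is one reason the log-based mode is still worth having.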

Now where I was going with this is that it's going to need a custom 
protocol to get at those log files in 1) above anyway, and we might want 
to turn it on and off per folder, so it could end up being a runtime 
parameter.  Hence, does it matter whether it lives inside the server code 
or outside?  However, I have lost my train of thought now so I will just 
quietly slink away...

Ed W

