Re-starting replication after a long disconnect

Timo Sirainen tss at iki.fi
Fri Jan 29 23:40:04 UTC 2016


Hi,

> On 28 Jan 2016, at 19:23, John Krug <jk at ucls.uchicago.edu> wrote:
> 
> Using Dovecot 2.2.15.8
> 
> We use dsync as shown at http://wiki.dovecot.org/Replication to have a second copy of our mail store. Users connect to one server and the other is considered a secondary, and is there “just in case”. If one-way backup/sync would make more sense, I would consider that, but at the time we set it up (October 2013), I recall there was some issue with one-way not being as reliable.
> 
> We had some problems with the secondary mail server and had to take it offline for a while. Now that the mail on that server is stale, what is the best way of starting over? When I simply turned it back on after a lengthy break (weeks), some users were reporting deleted messages showing up in their inboxes. I know dsync is master/master and uses index files, but I am not certain of the details and whether this behavior is completely unexpected or not.

Replication tracks deleted mails in the dovecot.index.log files, which can't grow indefinitely. So eventually the old information gets lost, and because dsync would rather not lose any information, it brings back older messages rather than wrongly deleting messages. It shouldn't actually be quite that simple, though. I think deleted messages could come back only if they were the newest message(s) in the folder; nothing deleted below the highest non-deleted message should be coming back. If that's happening, I think it's a bug.
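
The reasoning above can be sketched as a toy model in Python. This is not Dovecot's actual algorithm: the function name reconcile, its arguments and the data layout are all hypothetical, and the rule that a missing UID below the highest surviving UID counts as a deletion is an assumption drawn from the description above.

```python
def reconcile(stale_uids, live_uids, known_expunges):
    """Classify the UIDs that exist only on the stale replica.

    stale_uids: UIDs still present on the replica that was offline.
    live_uids: UIDs present on the replica that stayed in use.
    known_expunges: expunged UIDs still recorded in the bounded log.
    Returns (to_delete, to_restore) as sorted lists.
    """
    highest_live = max(live_uids, default=0)
    to_delete, to_restore = [], []
    for uid in sorted(stale_uids - live_uids):
        if uid in known_expunges:
            # The log still remembers this expunge, so deleting is safe.
            to_delete.append(uid)
        elif uid < highest_live:
            # Assumption: UIDs only grow, so a gap below a surviving
            # message can only have been caused by a deletion.
            to_delete.append(uid)
        else:
            # No expunge record and no newer surviving message: this looks
            # the same as a mail the live side never saw, so it is kept.
            to_restore.append(uid)
    return to_delete, to_restore


if __name__ == "__main__":
    # Expunge history lost: only the newest deleted mails (4, 5) come back.
    print(reconcile({1, 2, 3, 4, 5}, {1, 3}, set()))
    # Expunge history still available: nothing comes back.
    print(reconcile({1, 2, 3, 4, 5}, {1, 3}, {4, 5}))
```

In this model, losing the expunge history only resurrects messages that were deleted from the top of the folder, which matches the expectation stated above.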

> The laundry list:
> What are the f and R flags for sync? http://wiki2.dovecot.org/Tools/Doveadm/Sync

-f makes sure everything is synced by opening all the folders and fetching the email UIDs, instead of just looking at the number of messages and the UIDNEXT and HIGHESTMODSEQ values.

-R is relevant only to doveadm backup: it does the backup in the other direction (overwriting the local mailbox according to remote changes).
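
To illustrate the difference -f makes, here is a small Python sketch contrasting a summary-only comparison with a full UID comparison. It is a simplified model rather than dsync's real code: the FolderState class and both function names are hypothetical, and the example folder states are contrived.

```python
from dataclasses import dataclass, field


@dataclass
class FolderState:
    """Hypothetical snapshot of one folder on one replica."""
    uids: set = field(default_factory=set)
    uidnext: int = 1
    highestmodseq: int = 1

    def summary(self):
        # The cheap values: message count, UIDNEXT, HIGHESTMODSEQ.
        return (len(self.uids), self.uidnext, self.highestmodseq)


def looks_in_sync(a, b):
    """Quick check: compare only the summary values."""
    return a.summary() == b.summary()


def really_in_sync(a, b):
    """Full check: also compare the actual UID lists."""
    return looks_in_sync(a, b) and a.uids == b.uids


if __name__ == "__main__":
    primary = FolderState(uids={1, 2, 4}, uidnext=5, highestmodseq=10)
    secondary = FolderState(uids={1, 3, 4}, uidnext=5, highestmodseq=10)
    print(looks_in_sync(primary, secondary))   # summaries match
    print(really_in_sync(primary, secondary))  # UID lists differ
```

The quick check is cheap but can miss a divergence when the summary values happen to match; the full check costs more because every folder has to be opened.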

> Use doveadm backup to get the initial data over to the secondary?

That would work and be the most efficient way. You could run it for all the users, probably as easily as running this on the primary server:

doveadm backup -A -d -f

After that replication should be working again.

> Can the server be in use while I do backup?

Yes, but keep replication disabled (e.g. by disabling the replication plugin).

> Do I need to wipe the secondary and start over?

That's another possibility, but it of course has to transfer all the data all over again.


