[Dovecot] Slightly more intelligent way of handling issues in sdbox?

Timo Sirainen tss at iki.fi
Thu Feb 9 01:26:04 EET 2012


On 7.2.2012, at 14.08, Mark Zealey wrote:

>> http://hg.dovecot.org/dovecot-2.1/rev/a765e0a895a9 fixes this.
> 
> I've not actually tried this patch yet, but looking at it, it is perhaps useful for the situation I described below when the index is corrupt. In the case I am describing, however, the index is NOT corrupt - it is simply an older version (i.e. it only thinks there are the first 2 mails in the directory, not the 3rd). This could happen for example when mails are being stored on different storage than indexes; say for example you have 2 servers with remote NFS stored mails but local indexes that rsync between the servers every hour. You manually fail over one server to the other and you then have a copy of the correct indexes but only from an hour ago. The mails are all there on the shared storage but because the indexes are out of date, when a new message comes in it will automatically overwrite an existing mail.

I don't recommend using local indexes with dbox, since there is actual data loss if they're not up to date (flags, and with mdbox the user may have copied/moved the mail elsewhere). Still, better to catch this situation than not:
http://hg.dovecot.org/dovecot-2.1/rev/09db0f7aa6ce
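
To illustrate the kind of check involved, here is a rough sketch in Python. It is not the code from the changeset above. It assumes sdbox mail files are named u.<uid> inside the mailbox directory, and that the next UID the index would hand out is already known. The function names and the index_next_uid parameter are made up for the example.

```python
import os
import re

# Assumption: sdbox stores each mail as a file named "u.<uid>".
_MAIL_FILE = re.compile(r"^u\.(\d+)$")


def highest_uid_on_disk(mailbox_dir):
    """Return the highest UID among files named u.<uid>, or 0 if there are none."""
    highest = 0
    for name in os.listdir(mailbox_dir):
        match = _MAIL_FILE.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest


def index_is_stale(index_next_uid, mailbox_dir):
    """True if a mail file already exists at or beyond the UID the index would assign next.

    index_next_uid is a hypothetical value read from the index; saving a new
    mail at that UID would overwrite an existing file when this returns True.
    """
    return highest_uid_on_disk(mailbox_dir) >= index_next_uid
```

With files u.1, u.2 and u.3 on disk and an hour-old index whose next UID is 3, index_is_stale() returns True, so the save should be refused (or the index resynced) instead of overwriting u.3.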

>>> (speaking of which, it would be great if force-resync also rebuilt the cache files if there are valid cache files around, rather than just doing away with them)
>> Well, ideally there shouldn't be so much corruption that this matters..
> 
> That's true, but in our experience we usually get corruption in batches rather than a one-off occurrence. Our most common case is something like this: Say for example there's an issue with the NFS server (assuming we are storing indexes on there as well now) and so we have to killall -9 dovecot processes or similar. In that case you get a number of corrupted indexes on the server. Rebuilding the indexes generates an IO storm (say via lmtp or a pop3 access); then the clients log in via imap and we have to re-read all the messages to generate the cache files which is a second IO storm. If the caches were rebuilt at least semi-intelligently (ie you could extract from the cache files a list of things that had previously been cached) that would reduce the effects of rare storage level issues such as this.

Well, the decisions are now remembered: http://hg.dovecot.org/dovecot-2.1/rev/d8d214cc1936

That can't really be improved .. If nothing is deleted from cache, it might contain invalid data and doveadm force-resync wouldn't be doing its job right. If anything is added to cache, it would require reading and parsing the mail contents during rebuild, and that's not in any way better than letting the imap processes do it later when the mailbox isn't locked.
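
As a toy model of that trade-off (a sketch only, not Dovecot's cache format or API; the dict layout and field names are invented for illustration): a rebuild keeps the per-field caching decisions but throws away every cached value, so nothing invalid survives and no mail has to be parsed during the rebuild.

```python
def rebuild_cache(old_cache):
    """Return a fresh cache that keeps per-field decisions but no cached values.

    old_cache is a hypothetical dict with two keys:
      "decisions": field name -> caching decision (e.g. "no", "temp", "yes")
      "records":   uid -> {field name -> cached value}
    """
    return {
        # Copy the decisions so later changes don't touch the old cache.
        "decisions": dict(old_cache.get("decisions", {})),
        # Cached values may be invalid, so none are carried over; they get
        # filled in again later when a client actually asks for them.
        "records": {},
    }
```

For example, rebuilding {"decisions": {"hdr.subject": "yes"}, "records": {1: {"hdr.subject": "hello"}}} yields the same decisions with an empty "records" dict.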

