[Dovecot] Slightly more intelligent way of handling issues in sdbox?
Mark Zealey
mark.zealey at webfusion.com
Tue Feb 7 14:08:09 EET 2012
On 06-02-2012 22:47, Timo Sirainen wrote:
> On 3.2.2012, at 16.16, Mark Zealey wrote:
>
>> I was doing some testing on sdbox yesterday. Basically I did the following procedure:
>>
>> 1) Create new sdbox; deliver 2 messages into it (u.1, u.2)
>> 2) Create a copy of the index file (no cache file created yet)
>> 3) deliver another message to the mailbox (u.3)
>> 4) copy back index file from stage (2)
>> 5) deliver new mail
>>
>> Then the message delivered in stage 3 ie u.3 gets replaced with the message delivered in (5) also called u.3.
> http://hg.dovecot.org/dovecot-2.1/rev/a765e0a895a9 fixes this.
I've not actually tried this patch yet, but looking at it, it is perhaps
useful for the situation I described below, where the index is corrupt.
In the case I am describing here, however, the index is NOT corrupt; it
is simply an older version (ie it only knows about the first 2 mails in
the directory, not the 3rd). This could happen, for example, when mails
are stored on different storage than the indexes: say you have 2 servers
with mails on remote NFS storage but local indexes that are rsynced
between the servers every hour. You manually fail over from one server
to the other, and you then have a valid copy of the indexes, but from
up to an hour ago. The mails are all there on the shared storage, but
because the indexes are out of date, the next message delivered will
silently overwrite an existing one.
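To illustrate the kind of check I have in mind, here is a minimal sketch
in Python. It is not Dovecot code: the function names and the
index_next_uid parameter are made up for illustration, and the only
thing taken from the steps above is that sdbox message files are named
u.<uid>. The idea is simply to never trust the index's next UID if a
file with that UID or a higher one already exists on disk.

```python
import os
import re

# sdbox keeps each message in its own file named "u.<uid>" (u.1, u.2, ...).
_UID_FILE = re.compile(r"^u\.(\d+)$")


def highest_uid_on_disk(mailbox_dir):
    """Return the largest N among the u.N files in mailbox_dir, or 0 if none."""
    highest = 0
    for name in os.listdir(mailbox_dir):
        match = _UID_FILE.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest


def safe_next_uid(mailbox_dir, index_next_uid):
    """Pick a UID that cannot collide with a message file already on disk.

    index_next_uid is whatever the (possibly stale) index claims the next
    UID should be. It is a hypothetical parameter for this sketch, not a
    real Dovecot structure.
    """
    return max(index_next_uid, highest_uid_on_disk(mailbox_dir) + 1)
```

With u.1, u.2 and u.3 on disk and a stale index that still says the next
UID is 3, safe_next_uid() would hand out 4, so the existing u.3 is left
alone. The cost is one directory listing per check, which is why I would
only expect something like this when the index looks older than the
directory contents.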
>> (speaking of which, it would be great if force-resync also rebuilt the cache files if there are valid cache files around, rather than just doing away with them)
> Well, ideally there shouldn't be so much corruption that this matters..
That's true, but in our experience we usually get corruption in batches
rather than as one-off occurrences. Our most common case is something
like this: there's an issue with the NFS server (assuming we are now
storing indexes on there as well), so we have to killall -9 the dovecot
processes or similar. That leaves a number of corrupted indexes on the
server. Rebuilding the indexes (triggered by, say, an lmtp delivery or a
pop3 access) generates one IO storm; then the clients log in via imap
and all the messages have to be re-read to regenerate the cache files,
which is a second IO storm. If the caches were rebuilt at least
semi-intelligently (ie if you could extract from the old cache files a
list of the things that had previously been cached), that would reduce
the impact of rare storage-level issues such as this.
Mark