[Dovecot] Index corruption causes child process to die

Bill Cole dovecot-20061108 at billmail.scconsult.com
Sat Nov 11 03:10:44 UTC 2006

At 1:19 PM -0500 11/10/06, bofh list wrote:
>I am seeing the following errors with vanilla RC13 (two servers, 
>nfs, mmap_disable=yes, lock_method=fnctl, Maildir).  These only 
>manifested after the upgrade to 1.0RC13
>dovecot: Nov 09 15:44:08 Error: IMAP(user1): file ioloop.c: line 22 
>(io_add): assertion failed: (fd >= 0)
>dovecot: Nov 09 15:44:08 Error: child 2920 (imap) killed with signal 6
>dovecot: Nov 09 16:09:47 Error: IMAP(user2): file ioloop.c: line 22 
>(io_add): assertion failed: (fd >= 0)

Data point: I am seeing the same thing (ioloop.c line 22 assertion 
failure) with a very different system: MacOS X 10.4.8, 
mmap_disable=no (default), single machine, Maildir at ~/Maildir on 
the same internal disk as everything else, lock_method=fcntl.

Because this is a small informal test system I also have a little 
more info that may be relevant. Since rc7 (my first rc version) I 
have seen occasional incidents where a single account (the same one 
involved in my assertion failures today) gets one subdirectory locked 
with one or more temp.$PID.<random> files in it and a message in tmp. 
This account has multiple clients almost constantly logged in, one 
Eudora/Mac, one Outlook 2003/XP, one Versamail 3.5/PalmOS. Only 
Eudora moves anything anywhere; Outlook sits mostly idle; Versamail 
scans the Inbox every 15 minutes and syncs everything 
irregularly every few hours. Most mail arriving for this account is 
automatically moved to one of 3 IMAP folders, and about half of the 
rest is automatically stashed locally by the client and trashed on 
the server. In all 4 cases of the hang, a simple stop and restart of 
Dovecot and all clients failed to clear it; what worked was either a 
full system reboot or cleaning out both the Dovecot index files and 
the file in the tmp folder of the Maildir subdirectory that was hung. 
I believe this is connected because twice today, around the same 
times as the assertion failures in the logs, I had clients time out 
while trying to sync IMAP folders, which had me suspecting the same 
sort of failure, only to have the sync work on another connection 
within 10 minutes.
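
For anyone wanting to try the index-and-tmp cleanup described above, 
here is a rough sketch that only lists the candidate files for one 
Maildir folder and deletes nothing. It rests on assumptions not stated 
above: that the index files sit in the folder itself and are named 
dovecot.index* (the default when no separate index location is 
configured), and that the stuck message is a regular file directly 
under tmp/. The function name find_cleanup_candidates is made up for 
illustration.

```python
from pathlib import Path


def find_cleanup_candidates(folder):
    """List files to review in one Maildir folder; nothing is deleted.

    Assumptions (not confirmed for every setup): index files are named
    dovecot.index* and live in the folder itself, and the stuck message
    is a regular file directly under the folder's tmp/ subdirectory.
    """
    folder = Path(folder)
    # dovecot.index, dovecot.index.cache, dovecot.index.log and similar;
    # the pattern deliberately does not match dovecot-uidlist.
    index_files = sorted(
        p for p in folder.glob("dovecot.index*") if p.is_file()
    )
    tmp_dir = folder / "tmp"
    if tmp_dir.is_dir():
        tmp_files = sorted(p for p in tmp_dir.iterdir() if p.is_file())
    else:
        tmp_files = []
    return index_files, tmp_files
```

Calling find_cleanup_candidates on the hung folder returns two lists 
of paths to inspect by hand. It is probably safest to look at them, 
and to remove anything, only while Dovecot is stopped and no delivery 
is in progress, since a file in tmp may belong to a message that is 
still being written.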

Bill Cole                                  
bill at scconsult.com

