[Dovecot] dovecot-1.2.8 imap crash (with backtrace)
David Halik
dhalik at jla.rutgers.edu
Wed Dec 30 19:10:47 EET 2009
Ok, I think I've got some more info and a more accurate timeline for
you. I tried this on two different dumps from two different users. The
count was 4 in the first example and 0 in the second. I'm guessing
that's considered "small"? The links to my gdb sessions for both are
below and have some of the info you were looking for. The corresponding
logs are also there so you can see how each failed. I put everything on
pastebin so it's a little easier to see.
By the way, I also found that the stale NFS file handle message does
appear first in each instance; it was just farther back in the logs. The
"Lowering uid" message also appears immediately after every stale NFS
message, which in turn causes all of this some time later (sometimes 5
minutes, sometimes 20) when a user does a new action. The "message
reappeared" warning only occurs some of the time. Here's the chain of
events in every case so far that I can see:
1) fdatasync(/rci/nqu/rci/u8/user/dovecot/.INBOX/dovecot-uidlist)
   failed: Stale NFS file handle
2) /rci/nqu/rci/u8/user/dovecot/.INBOX/dovecot-uidlist: next_uid was
   lowered (n -> n-1, hdr=n-1)
...a few minutes later...
(may or may not be a "message reappeared" warning at this point)
3) /rci/nqu/rci/u8/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file
   entry at line 3:
   1261057547.M378185P17303V03E80002I0197FB4A_0.gehenna9.rutgers.edu,S=7174:2,RS
   (uid i -> n+1,2,3)
4) Panic: file maildir-uidlist.c: line 405
   (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL)
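For anyone who wants to check their own logs for the same sequence, here is a minimal Python sketch. It assumes each log message sits on a single line and matches only on substrings taken from the messages quoted above. The stage names and the function name `stages_reached` are hypothetical labels made up for this example, not anything Dovecot provides.

```python
import re
from typing import Iterable, List

# Patterns are substrings of the log messages quoted in this post.
# The stage names on the left are hypothetical labels for this sketch.
STAGES = [
    ("stale_nfs", re.compile(r"Stale NFS file handle")),
    ("uid_lowered", re.compile(r"next_uid was lowered")),
    ("duplicate_entry", re.compile(r"Duplicate file entry")),
    ("panic", re.compile(r"assertion failed: \(pos != NULL\)")),
]


def stages_reached(lines: Iterable[str]) -> List[str]:
    """Return the stages seen in the expected order.

    Lines that do not match the next expected stage are skipped, so
    unrelated messages (or an optional "message reappeared" warning)
    between two stages do not break the chain.
    """
    reached: List[str] = []
    next_index = 0
    for line in lines:
        if next_index == len(STAGES):
            break  # full chain already observed
        name, pattern = STAGES[next_index]
        if pattern.search(line):
            reached.append(name)
            next_index += 1
    return reached
```

Feeding it the lines for one user's uidlist path should return all four stage names if the whole chain occurred in that order, and a shorter list if the log stops partway through.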
One thing to note: after the "Expunged message reappeared, giving a new
UID" message, the imap process died quickly and on more than one server
simultaneously. The gdb output is from server gehenna11 of that log file.
The uid in *recs[0] is also the number that you can see in the logs being
lowered from 719 -> 718.
First user log: http://pastebin.com/m1718f07b
First user gdb: http://pastebin.com/m40088dc8
The second user also died on more than one server. The output is also
from gehenna11.
Second user log: http://pastebin.com/f3a1756f2
Second user gdb: http://pastebin.com/m59aacde4
On 12/29/2009 7:50 PM, Timo Sirainen wrote:
> On 29.12.2009, at 19.09, David Halik wrote:
>
>
>> I'll definitely get back to you on this. Right now we're closed until after New Years and I don't want to go updating the dovecot package on all of our servers until we're all back at work. I did do some quick poking around and the count is optimized out, so I'll have the package rebuilt without optimization and let you know what the values are at the beginning of next week. Thanks again.
>>
> well, you can probably also get the values in a bit more difficult way:
>
> p count = p uidlist.records.arr.buffer.used / uidlist.records.arr.element_size
>
> p recs[n] = p *(*uidlist.records.v)[n]
>
>