Re: [Dovecot] Corrupted transaction log file / record size too small
I've been working around the problem by deleting the contents of ~user/.imap for affected users, but I'd really prefer to fix it properly. :-) Any suggestions would be gratefully received.
I think this should get fixed by itself? Although the first login fails, the next one should succeed.
Yes, that's right. I didn't realize it at first, because we were also seeing other problems (eg assertion failure at line 39 of index-sync.c) which did require the index to be deleted.
I've since made the following changes:
commented out the assertion at line 39 of index-sync.c (as per http://www.nabble.com/dovecot-panic-after-upgrade-1.0.15-to-1.1.2-td19098506...)
recompiled dovecot without optimization (ie, gcc -O0 instead of -O2) (as per http://www.mail-archive.com/dovecot@dovecot.org/msg12936.html)
reverted to using flock for file locking instead of fcntl (we'd been using flock with dovecot 1.0.10; 1.1.3 used fcntl by default in a couple of places, so I changed it back to see what would happen)
After this, we've seen fewer problems, but there are still the occasional ones:
the original corrupted transaction log error which prompted this thread
we still see the occasional assertion failure at line 1091 of index-mail.c; this is the one that I'd hoped to solve by recompiling without optimization, but at least it seems to be happening less frequently than before
this doesn't happen often, but once in a while we see assertion failures at line 163 of src/pop3-login/client-authenticate.c or line 200 of src/imap-login/client-authenticate.c
Can you reproduce this by running imaptest (on a test account)? http://imapwiki.org/ImapTest
Actually this and your other suggestions may not be necessary; I think I just realized what's been going on.
I missed this originally, but on reflection I notice in the logs that every time the "record size too small" error appears, inevitably the user involved is over disk quota and out of grace period. For example,
Oct 3 17:52:21 alcor dovecot: imap[14973] XXXX, NNN.NNN.NN.NNN: Corrupted transaction log file /home/XXXX/mail/.imap/sent-mail/dovecot.index.log: record size too small (type=0xcd91, offset=8180, size=0) Oct 3 17:52:21 alcor dovecot: imap[14973] XXXX, NNN.NNN.NN.NNN: fscking index file (in-memory index) Oct 3 17:52:21 alcor dovecot: imap[14973] XXXX, NNN.NNN.NN.NNN: Transaction log got desynced for index (in-memory index) Oct 3 17:52:21 alcor dovecot: imap[14973] XXXX, NNN.NNN.NN.NNN: Disconnected: IMAP session state is inconsistent, please relogin. bytes=169/1848 Oct 3 17:52:22 alcor dovecot: imap-login: Login: [14998], XXXX, NNN.NNN.NN.NNN Oct 3 17:52:22 alcor dovecot: wrapper[15023]: imap, XXXX, NNN.NNN.NN.NNN Oct 3 17:52:22 alcor dovecot: imap[15023] XXXX, NNN.NNN.NN.NNN: open(/home/XXXX/mail/.imap/CFI-Howe/.temp.alcor.15023.e786c194b1dcc08c) failed: Disk quota exceeded
Having seen this, I went back to the saved log for September 30th (the one containing the excerpt I pasted into my original message in this thread) -- and interestingly, I found that the quota failure wasn't recorded there. I don't know what was different about that occurrence, but everything I've seen recently suggests that "record size too small" is due to interaction with the quota system.
...so this is where I admit that I have indeed read http://wiki.dovecot.org/Quota/FS; unfortunately, the only user-writable file system on this machine that doesn't have (or need to have!) disk quotas enabled is /var/tmp, which doesn't seem to be a suitable place for dovecot indexes.
Under the circumstances, are you still interested in seeing ImapTest results and a backtrace? If so, should I do that before or after upgrading to 1.1.4?
Thanks,
- Steven
Steven Winikoff | Concordia University | "The end of the world will occur at Montreal, QC, Canada | 3:00 p.m., this Friday, with symposium smw@alcor.concordia.ca | to follow." http://alcor.concordia.ca/~smw | - fortune(6)
On Oct 6, 2008, at 2:20 AM, Steven Winikoff wrote:
- recompiled dovecot without optimization (ie, gcc -O0 instead of
-O2) (as per http://www.mail-archive.com/dovecot@dovecot.org/msg12936.html)
The problem here turned out to be a bug in Sun CC (and gcc -O2
apparently worked after all, so maybe a problem with testing).
- reverted to using flock for file locking instead of fcntl (we'd
been using flock with dovecot 1.0.10; 1.1.3 used fcntl by default in a couple of places, so I changed it back to see what would happen)
This shouldn't really make a difference.
- we still see the occasional assertion failure at line 1091 of index-mail.c; this is the one that I'd hoped to solve by recompiling without optimization, but at least it seems to be happening less frequently than before
I had hoped v1.1.3 would have fixed this. But perhaps this is also
related to the disk quota issues. Are there any other errors that are
logged before the assert?
- this doesn't happen often, but once in a while we see assertion failures at line 163 of src/pop3-login/client-authenticate.c or line 200 of src/imap-login/client-authenticate.c
I think I've managed to get this also sometimes when killing dovecot-
auth or doing something else "wrong" (I don't remember exactly), but
not during normal operation. I should have noted down how to reproduce
this when I still knew how to reproduce it..
I missed this originally, but on reflection I notice in the logs that every time the "record size too small" error appears, inevitably the user involved is over disk quota and out of grace period. For
example,
Yes, this is most likely what's the problem. I remember testing v1.0
quite a lot in "out of disk space" conditions, but I haven't really
done any such testing with v1.1. I'll try that and see if I can get
these problems reproduced and fixed.
On Sun, 2008-10-05 at 19:20 -0400, Steven Winikoff wrote:
Oct 3 17:52:21 alcor dovecot: imap[14973] XXXX, NNN.NNN.NN.NNN: Corrupted transaction log file /home/XXXX/mail/.imap/sent-mail/dovecot.index.log: record size too small (type=0xcd91, offset=8180, size=0)
This and some other bugs are now fixed by this: http://hg.dovecot.org/dovecot-1.1/rev/262c52055f2a
participants (2)
-
Steven Winikoff
-
Timo Sirainen