On Fri, 2011-12-09 at 12:13 -0800, Brad Schuetz wrote:
I've been using dovecot for years and it has been working great. Recently, however, I've come across two issues.
Unfortunately I have little information on the first issue, and mail_debug hasn't provided anything useful either (in fact it looks like the login request that fails doesn't even get logged at all).
The *_debug settings aren't very helpful in debugging random failures.
During the morning rush of email (this server has around 11k mailboxes on it), it *appears* that one of the auth processes dies, resulting in "dovecot: imap-login: Error: read(imap) failed: Connection reset by peer" errors followed by "dovecot: imap-login: Internal login failure ..." in the logs.
If an auth process dies unexpectedly, master always logs an error, such as:
Dec 10 07:15:34 auth: Fatal: master: service(auth): child 27895 killed with signal 11 (core dumped)
Anyway, "read(imap) failed: Connection reset by peer" can happen if you reach the service imap { process_limit }. But then there should be a warning logged about it also:
Dec 10 07:17:39 master: Warning: service(imap): process_limit reached, client connections are being dropped
So, that error message alone shouldn't be happening.
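As a rough way to check whether either of the two master messages quoted above shows up around the time of the failures, something like the following sketch could be run over the log. The function name `scan` and both regular expressions are my own and are derived only from the two sample lines above, so they may need adjusting if your syslog format differs.

```python
import re

# Patterns derived from the two master log lines quoted above (assumed format).
AUTH_CRASH = re.compile(r"service\(auth\): child (\d+) killed with signal (\d+)")
LIMIT_HIT = re.compile(r"service\((\w[\w-]*)\): process_limit reached")


def scan(lines):
    """Return (auth_crashes, limit_warnings) found in an iterable of log lines.

    auth_crashes is a list of (pid, signal) tuples; limit_warnings is a list
    of service names whose process_limit was reported as reached.
    """
    crashes, limits = [], []
    for line in lines:
        m = AUTH_CRASH.search(line)
        if m:
            crashes.append((int(m.group(1)), int(m.group(2))))
            continue
        m = LIMIT_HIT.search(line)
        if m:
            limits.append(m.group(1))
    return crashes, limits
```

To use it, pass an open log file, e.g. `scan(open(path))`, where `path` is wherever your syslog writes Dovecot's messages. If both lists come back empty for the period when logins failed, the cause is probably neither an auth crash nor the process limit.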
The best I've gotten was a lucky strace once (at the time I didn't realize it was so lucky or I would have saved the output) that indicated the imap-login daemon was failing to connect to the auth process.
If that happens, there would also be an error message logged about it. One thing that v2.0 doesn't log is the auth socket getting disconnected, but that would result in different problems. v2.1 logs that too.
The second issue is that lmtp/lda (tried both) delivery to a mailbox that has filesystem quotas enabled and whose group quota is maxed out results in the panic and crash below.
11:21:07 [err] dovecot: lmtp(29691, admin@[redacted].com): Error: o_stream_send_istream(/email/d/r/[redacted]/[redacted]/admin/Maildir/tmp/1323458467.M245978P29691.fenrir.omnis.com) failed: Disk quota exceeded
11:21:07 [crit] dovecot: lmtp(29691, admin@[redacted].com): Panic: file ostream-zlib.c: line 144 (o_stream_zlib_send_flush): assertion failed: (zs->avail_in == 0)
I couldn't reproduce this with my tests, but http://hg.dovecot.org/dovecot-2.0/rev/75daa638281b should fix it.