A quick status report on how 1.0rc8 behaved in service for a few hours with several hundred simultaneous users, at a site very new to dovecot.
Oh, and a question at the end.
Summary: Reasonable for a first shot but one significant problem, requiring backing off.
Background:
We have a long-established UW-IMAP service for a user population of about 20,000 based on a few Linux (Redhat) machines running IMAP/POP and inbound delivery. We try to ensure, but cannot guarantee, that all activity for a given user takes place within one machine. Each machine mounts the INBOX area ("/var/spool/mail"; traditional UNIX mbox) via NFS with tight NFS arguments ("noac,actimeo=0", etc.) and similarly mounts the users' folder areas which are subdirs of their home directories. (We know that Mark Crispin recommends against NFS for UW-IMAP, but we seem to have been OK.)
There is also some processing: .forward->"| procmail"->folder-or-inbox
Each machine typically has several hundred simultaneous IMAP connections.
This has basically worked well, but the UW-IMAP loading has been heavy.
The plan:
In an ideal world, I would like to restructure the above. But our world is not ideal, so we have to stay with the structure. What we are looking for is a migration to dovecot that is transparent from the user's perspective.
The dovecot experience:
Yesterday, I quietly adjusted a DNS entry to redirect one of the live email hostnames to an additional machine in the "farm", running dovecot 1.0rc8, including deliver/LDA (and taking into account some post-rc7 dovecot changes in this area).
On an earlier, smaller-scale test, one problem had been some periods of "temporary authentication failures". Increasing "login_processes_count" and "login_max_processes_count" (each by a factor of 8) seems to have fixed this, and I'm not aware of any problems in that area yesterday.
It basically went well. But just over two hours later I had to back off because of a significant dovecot problem: dovecot crashed, almost silently. The only traces of this event in the log file seem to be:

  Oct 10 16:26:12 [...] dovecot: child 24525 (login) returned error 89
  Oct 10 16:26:14 [...] dovecot: Login process died too early - shutting down
Any thoughts? Any fixes? If the problem needs debugging (or additional data/log collection) how might that be attempted in this environment?
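As one possible starting point for the log collection, here is a minimal Python sketch that pulls the two kinds of log line quoted above out of a syslog-style file, so the crash times and child exit codes can be listed in one place. The function name `scan_log` and the event dictionary layout are hypothetical, and the patterns are assumed only from the two lines shown above; other dovecot messages are ignored.

```python
import re
from typing import Dict, Iterable, List

# Patterns assumed from the two log lines quoted in this message.
CHILD_ERROR = re.compile(
    r"dovecot: child (?P<pid>\d+) \((?P<kind>[\w-]+)\) "
    r"returned error (?P<code>\d+)"
)
SHUTDOWN = re.compile(r"dovecot: Login process died too early - shutting down")


def scan_log(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return one event per matching log line, in file order."""
    events: List[Dict[str, object]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = CHILD_ERROR.search(line)
        if match:
            events.append({
                "type": "child_error",
                "pid": int(match.group("pid")),
                "kind": match.group("kind"),
                "code": int(match.group("code")),
                "line": line,
            })
        elif SHUTDOWN.search(line):
            events.append({"type": "shutdown", "line": line})
    return events
```

It could be run over whichever file syslog writes the mail facility to (the path differs per site), for example `with open(path) as f: events = scan_log(f)`, and the surrounding lines of each event inspected by hand.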
--
:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           Durham University     :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham DH1 3LE        :
:  Phone: +44 191 334 2752                  U.K.                  :
On Wed, 2006-10-11 at 15:54 +0100, David Lee wrote:
> It basically went well. But just over two hours later I had to back off, because of a significant dovecot problem, namely that dovecot crashed, almost silently. The only traces of this event in the log file seem to be:
>
>   Oct 10 16:26:12 [...] dovecot: child 24525 (login) returned error 89
>   Oct 10 16:26:14 [...] dovecot: Login process died too early - shutting down
Looks like this is happening to some people now. Unfortunately I can't really do anything with this little information. There's a bug in logging, which I think is fixed by this patch:
http://dovecot.org/list/dovecot-cvs/2006-October/006473.html
Once I know exactly what the error is, I can debug it further.
On Wed, 11 Oct 2006 18:26:14 +0300, Timo Sirainen tss@iki.fi wrote:
> On Wed, 2006-10-11 at 15:54 +0100, David Lee wrote:
>> It basically went well. But just over two hours later I had to back off, because of a significant dovecot problem, namely that dovecot crashed, almost silently. The only traces of this event in the log file seem to be:
>>
>>   Oct 10 16:26:12 [...] dovecot: child 24525 (login) returned error 89
>>   Oct 10 16:26:14 [...] dovecot: Login process died too early - shutting down
>
> Looks like this is happening to some people now. Unfortunately I can't really do anything with this little information. There's a bug in logging, which I think is fixed by this patch:
> http://dovecot.org/list/dovecot-cvs/2006-October/006473.html
> After knowing what exactly the error is I could debug it further.
Hi, I have the same kind of problem, but with dovecot 1.0rc6 (I will upgrade to rc8 today). Once a week dovecot dies with only 'Login process died too early - shutting down' in the log. It only happens on my POP3/IMAP proxies. I use a Linux 2.6 kernel and MySQL for user/pass.
# dovecot --version
1.0.rc6
# dovecot --build-options
Build options: ioloop=poll notify=dnotify ipv6 openssl
SQL drivers: mysql
Passdb: checkpassword pam passwd passwd-file shadow sql
Userdb: checkpassword passwd prefetch passwd-file sql static
What should I enable or do to produce a usable debug trace?
Timo, if you need additional info, feel free to ask.
--
Laurent Papier - 03 88 75 80 50
System administrator - SdV Plurimedia - http://www.sdv.fr/