[Dovecot] pop3-login process leakage
Hi,
I've recently deployed dovecot on our servers, to replace courier-imapd. I'm delighted with its features and performance, but there's a major problem - it's leaking pop3-login processes.
We have about 60 client machines, each collecting mail once every two minutes. In this configuration, the number of pop3-login processes increases by about ten an hour - apparently without bound.
My workaround is to increase the open files rlimit, and the kernel overall limit on open files. This postpones disaster for long enough that killing and restarting dovecot nightly avoids problems.
A possibly related problem is that about 1% of attempted POP3 logins fail. At the server, all I see is a syslog entry like this:
Nov 3 22:17:21 greenwich pop3-login: Disconnected: Inactivity [10.76.30.246]
At the client, there is a long (about 60-second) timeout. Then the client automatically retries, and generally succeeds.
There are *not* exactly as many such disconnects as there are stray pop3-login processes - I counted roughly 230 inactivity disconnects in the time it took for 190 stray processes to accumulate.
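For anyone wanting to repeat the count of inactivity disconnects, here is a minimal sketch in Python. It assumes the log lines have the same shape as the syslog entry quoted above. The function name and the second sample line are made up for illustration.

```python
import re

# Pattern modelled on the single syslog line quoted in this message.
# Assumption: other inactivity disconnects are logged in the same format.
INACTIVITY = re.compile(
    r"pop3-login: Disconnected: Inactivity \[(?P<ip>[0-9.]+)\]"
)


def count_inactivity_disconnects(lines):
    """Return (total, per_ip) for inactivity disconnects found in lines."""
    per_ip = {}
    for line in lines:
        match = INACTIVITY.search(line)
        if match:
            ip = match.group("ip")
            per_ip[ip] = per_ip.get(ip, 0) + 1
    return sum(per_ip.values()), per_ip


if __name__ == "__main__":
    sample = [
        # Real line from the message above.
        "Nov 3 22:17:21 greenwich pop3-login: Disconnected: Inactivity [10.76.30.246]",
        # Hypothetical unrelated line, only there to show it is ignored.
        "Nov 3 22:18:00 greenwich some-other-daemon: unrelated message",
    ]
    total, per_ip = count_inactivity_disconnects(sample)
    print(total, per_ip)
```

Feeding it the relevant syslog file line by line gives a total that can be compared against the number of stray processes over the same period.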
We are running dovecot 0.99.10 on OpenBSD 3.2. The configuration is fairly normal, except that we are using a custom userdb and passdb (compiled as shared objects):
auth_userdb = passwd8
auth_passdb = smb
passwd8 is just userdb-passwd.c with a trivial tweak to truncate presented user names to eight characters.
smb is a shim that delegates authentication to smbclient, to check passwords against NT domain controllers.
I've placed the source code for these two modules in:
http://www.nsict.org/~clive/misc/dovecot-2003-11-04/
...along with my dovecot.conf, and a file that illustrates the growth in the number of pop3-login processes over time.
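A growth log of that kind could be produced with something along the following lines. This is only a sketch in Python, not the script actually used. The `ps` flags are an assumption and may need adjusting for the platform, and the sample `ps` output in the demo is hypothetical.

```python
import subprocess
import time


def count_named(ps_output, name="pop3-login"):
    """Count lines of ps output whose first field is `name` (path stripped)."""
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        if fields and fields[0].rsplit("/", 1)[-1] == name:
            count += 1
    return count


def sample_once(name="pop3-login"):
    """Return a timestamped count of running processes called `name`."""
    # Assumption: `ps -A -o args` lists every process, one command line per
    # row, after a header row. Check ps(1) on the target system first.
    result = subprocess.run(
        ["ps", "-A", "-o", "args"],
        capture_output=True, text=True, check=True,
    )
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return stamp, count_named(result.stdout, name)


if __name__ == "__main__":
    # Hypothetical ps output, used so the demo does not depend on ps itself.
    fake = (
        "COMMAND\n"
        "pop3-login\n"
        "pop3-login\n"
        "dovecot-auth\n"
        "/usr/local/libexec/dovecot/pop3-login\n"
    )
    print(count_named(fake))
```

Calling `sample_once()` from cron every few minutes and appending the result to a file gives a simple record of how quickly the processes accumulate.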
So far as I'm aware, those modules are loaded by the dovecot-auth process, not the pop3-login processes, and are therefore unlikely to be the problem?
The client machines are LAN-connected, and running a mixture of Outlook (2000/XP) and Outlook Express (5.5 and 6). There are also a couple of people using Mozilla. Most are collecting e-mail via POP3, a few via IMAP.
This issue didn't show up under beta-test loading before I deployed dovecot. Since it seems only to be exhibited (or, at least, noticeably exhibited) under live load, it's very hard for me to diagnose things further without disrupting service.
I'd be very grateful if anyone could suggest what might be wrong.
Regards,
--Clive.
On Tue, 2003-11-04 at 14:37, Clive Jones wrote:
I've recently deployed dovecot on our servers, to replace courier-imapd. I'm delighted with its features and performance, but there's a major problem - it's leaking pop3-login processes.
Do this to one of the processes:
gdb /usr/local/libexec/dovecot/pop3-login <pid of pop3-login process>
p clients->nodes_count
p main_refcount
p auth_reconnect
p auth_waiting_handshake_count
Timo,
Thanks for your message. Unfortunately, dovecot was running on a live server with debugging disabled in the kernel. It was a while before I could reboot it into a different kernel. (-8
Do this to one of the processes:
gdb /usr/local/libexec/dovecot/pop3-login <pid of pop3-login process>
Attaching to program `/usr/local/libexec/dovecot/pop3-login', process 23957
Reading symbols from /usr/libexec/ld.so...done.
Reading symbols from /usr/lib/libssl.so.6.0...done.
Reading symbols from /usr/lib/libcrypto.so.8.0...done.
Reading symbols from /usr/lib/libc.so.28.5...done.
0x40142733 in poll ()
p clients->nodes_count
$1 = 0
p main_refcount
$2 = 2
p auth_reconnect
$3 = 0
p auth_waiting_handshake_count
$4 = 0
What's the verdict?
Regards,
--Clive.
On Tue, 2003-11-18 at 13:24, Clive Jones wrote:
gdb /usr/local/libexec/dovecot/pop3-login <pid of pop3-login process>
p clients->nodes_count
$1 = 0
p main_refcount
$2 = 2
p auth_reconnect
$3 = 0
p auth_waiting_handshake_count
$4 = 0
What's the verdict?
Looks like this process hasn't even accepted a pop3 connection yet.. Do they all do this? Ask these too:
p closing_down
p process_per_connection
p ssl_proxies->nodes_count
p io_master
And have you changed login_processes_count, login_max_processes_count or login_max_logging_users settings?
On Tue, 2003-11-18 at 13:24, Clive Jones wrote:
Looks like this process hasn't even accepted a pop3 connection yet.. Do they all do this?
It seems there is a tiny handful of live, working instances (which I don't want to touch, since real clients are using them and I don't want to degrade their service) and many hundreds of old ones left lying around. Those all look essentially similar, though I must admit I've not checked every single one of them. (-8
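Checking every stray process by hand is tedious, so here is a rough Python sketch of how the same gdb questions could be asked non-interactively. It assumes the installed gdb accepts `-batch` and `-x <file>` and attaches when given a program and a pid. The helper names are made up. The variable list is taken from this thread, except that `ssl_proxies` is printed as a pointer rather than dereferenced, because an error stops the rest of a gdb command file.

```python
import os
import subprocess
import tempfile

BINARY = "/usr/local/libexec/dovecot/pop3-login"  # path used in this thread

# Plain variables first; the one dereference goes last, since a failing
# command aborts the remainder of a gdb command file.
VARIABLES = [
    "main_refcount",
    "auth_reconnect",
    "auth_waiting_handshake_count",
    "closing_down",
    "process_per_connection",
    "ssl_proxies",
    "io_master",
    "clients->nodes_count",
]


def gdb_script(variables=VARIABLES):
    """Return gdb command-file text that prints each variable, then detaches."""
    return "".join(f"p {name}\n" for name in variables) + "detach\n"


def gdb_argv(pid, script_path, binary=BINARY):
    """Build the gdb command line for attaching to `pid` in batch mode."""
    return ["gdb", "-batch", "-x", script_path, binary, str(pid)]


def inspect(pid):
    """Run gdb against one process and return whatever it printed."""
    with tempfile.NamedTemporaryFile("w", suffix=".gdb", delete=False) as handle:
        handle.write(gdb_script())
        path = handle.name
    try:
        done = subprocess.run(gdb_argv(pid, path), capture_output=True, text=True)
        return done.stdout + done.stderr
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(gdb_script(), end="")
```

Calling `inspect(pid)` for each stray pid and comparing the outputs would show whether any of them differ. Attaching briefly stops the process, so it is best kept away from the live instances.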
Ask these too:
p closing_down
$1 = 1
p process_per_connection
$2 = 1
p ssl_proxies->nodes_count
Error accessing memory address 0x10: Invalid argument.
(gdb) p ssl_proxies
$3 = (struct hash_table *) 0x0
p io_master
$4 = (struct io *) 0x151e0
(gdb) p *io_master
$5 = {next = 0x0, fd = 3, condition = 1, destroyed = 0,
callback = 0x4c30
And have you changed login_processes_count, login_max_processes_count or login_max_logging_users settings?
No - they've been left at the default.
Regards,
--Clive.
Hmm. I've now upgraded to Dovecot 0.99.10.4, and it doesn't appear to have helped much, if at all.
Between 18:28 yesterday and 10:58 today, the number of pop3-login processes has risen from 12 to 186.
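As a quick check of the rate, the arithmetic can be sketched in Python. The calendar dates below are placeholders, since only the elapsed time between the two clock readings matters. The function name is made up.

```python
from datetime import datetime


def leak_rate_per_hour(start, end, count_start, count_end):
    """Average growth in process count per hour between two samples."""
    hours = (end - start).total_seconds() / 3600
    return (count_end - count_start) / hours


if __name__ == "__main__":
    # Placeholder dates: 18:28 one day to 10:58 the next (16.5 hours).
    start = datetime(2000, 1, 1, 18, 28)
    end = datetime(2000, 1, 2, 10, 58)
    rate = leak_rate_per_hour(start, end, 12, 186)
    print(round(rate, 1))  # prints 10.5
```

That is 174 extra processes in 16.5 hours, so the leak is still running at roughly ten an hour, the same as before the upgrade.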
Just for reference, I've gdb-ed the earliest of those processes again:
(gdb) p clients->nodes_count
$1 = 0
(gdb) p main_refcount
$2 = 2
(gdb) p auth_reconnect
$3 = 0
(gdb) p auth_waiting_handshake_count
$4 = 0
(gdb) p closing_down
$5 = 1
(gdb) p process_per_connection
$6 = 1
(gdb) p ssl_proxies->nodes_count
Error accessing memory address 0x10: Invalid argument.
(gdb) p ssl_proxies
$7 = (struct hash_table *) 0x0
(gdb) p io_master
$8 = (struct io *) 0x151e0
(gdb) p *io_master
$9 = {next = 0x0, fd = 3, condition = 1, destroyed = 0,
callback = 0x4c50
This looks exactly as before, except that the master_input function has moved by 32 bytes.
There's clearly still something going wrong, somewhere. )-8
Regards,
--Clive.
participants (2): Clive Jones, Timo Sirainen