[Dovecot] imap-login crashes after upgrade to Dovecot 1.2.8 Linux 64-bit

Chris Wakelin c.d.wakelin at reading.ac.uk
Fri Dec 18 03:01:40 EET 2009


Hi all,

We moved our Dovecot installation in two steps: first we put all folders
and inboxes on NetApps via NFS (with indexes kept local), then we moved
from a physical Solaris 8 4-way UltraSparc server running Dovecot 1.2.5
(32-bit) to a virtualised Ubuntu 8.04 64-bit server with a single
virtual processor (in VMware ESX 3.5) running Dovecot 1.2.8 (64-bit).

Since the change, many of our users have experienced random
disconnections when proxied in imap-login (either using the internal SSL
proxy, or to an external IMAP server) and the error logs contain things
like:

> Dec 18 00:27:58 imap-login: Info: Disconnected: Connection queue full (auth failed, 1 attempts): user=<user0>, method=PLAIN, rip=<ip0>, lip=134.225.32.156, TLS
> Dec 18 00:27:58 imap-login: Panic: file ioloop.c: line 39 (io_add): assertion failed: (fd >= 0)
> Dec 18 00:27:58 IMAP 9486 <user1> <ip1> : Info: Connection closed bytes=224/1753
> Dec 18 00:27:58 IMAP 10116 <user2> <ip2> : Info: Connection closed bytes=82/707

...

> Dec 18 00:27:58 IMAP 10296 <user3> <ip3> : Info: Disconnected in IDLE bytes=44/435
> Dec 18 00:27:58 IMAP 10298 <user3> <ip3> : Info: Disconnected in IDLE bytes=263/1662
> Dec 18 00:27:58 dovecot: Error: child 5124 (login) killed with signal 6 (core dumped) (latest ip=<ip0>)
> Dec 18 00:27:58 imap-login: Info: Aborted login (no auth attempts): rip=<ip0>, lip=134.225.32.156, TLS: Disconnected

and a backtrace gives:

> (gdb) bt
> #0  0x00007ff1030ad095 in raise () from /lib/libc.so.6
> #1  0x00007ff1030aeaf0 in abort () from /lib/libc.so.6
> #2  0x00000000004112d5 in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:160
> #3  0x0000000000411333 in i_internal_fatal_handler (type=LOG_TYPE_PANIC, status=0, fmt=<value optimized out>, args=<value optimized out>) at failures.c:443
> #4  0x0000000000410996 in i_panic (format=0x6 <Address 0x6 out of bounds>) at failures.c:207
> #5  0x000000000041420e in io_add (fd=-1, condition=IO_READ, callback=0x404db0 <client_input>, context=0x6af000) at ioloop.c:39
> #6  0x00000000004059db in client_auth_failed (client=0x6af000, nodelay=true) at client-authenticate.c:103
> #7  0x0000000000405e07 in client_handle_args (client=0x6af000, args=<value optimized out>, success=true, nodelay_r=0x7fff571009af)
>     at client-authenticate.c:198
> #8  0x0000000000406214 in sasl_callback (_client=0x6af000, reply=SASL_SERVER_REPLY_SUCCESS, data=0x0, args=0x6361d0) at client-authenticate.c:277
> #9  0x000000000040eb13 in auth_client_input_ok (conn=0x75b6c8, args=<value optimized out>) at auth-server-request.c:196
> #10 0x000000000040dbf3 in auth_client_input (conn=0x75b6c8) at auth-server-connection.c:136
> #11 0x0000000000414aa8 in io_loop_handler_run (ioloop=<value optimized out>) at ioloop-epoll.c:208
> #12 0x0000000000413b9d in io_loop_run (ioloop=0x6552b0) at ioloop.c:335
> #13 0x0000000000408e81 in main (argc=2, argv=0x7fff57100c58, envp=0x7fff57100c70) at main.c:494

Up until this afternoon we had "login_process_per_connection = no", and
login_max_connections at the default (256).

This evening I tried with "login_process_per_connection = yes" with no
problems, except more CPU load and needing to increase
login_max_processes_count (and strictly speaking the max fds set with
ulimit -n would need to be increased, though I didn't do that as it
would have involved killing and restarting the master process - come
back Solaris plimit!)

Just after midnight, I switched back to "login_process_per_connection =
no" and set "login_max_connections = 32" and managed to get core dumps
enabled, hence the above!
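
For reference, the login settings in effect when the core dump above was
captured would correspond to these dovecot.conf lines (only the two
values mentioned here are shown; I'm assuming nothing else relevant
differs from the defaults):

> login_process_per_connection = no
> login_max_connections = 32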

Apart from this problem, the virtual machine has been doing extremely
well, so I don't think it's CPU or I/O load related.

I'm not sure whether the problem is caused by the switch from Dovecot
1.2.5 to 1.2.8, Solaris to Linux, physical to virtual, or 32-bit to 64-bit!

Perhaps we changed too much at once, but it did mean we kept the service
going while power was turned off in our main machine room last weekend ...

Best Wishes,
Chris

-- 
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin,                           c.d.wakelin at reading.ac.uk
IT Services Centre, The University of Reading,  Tel: +44 (0)118 378 8439
Whiteknights, Reading, RG6 2AF, UK              Fax: +44 (0)118 975 3094

