[Dovecot] Imap segfault on login with version 2.0.8 & 2.0.11
Hi everyone,
I'm trying to upgrade/test our configuration files from Dovecot 1.2 to 2.0.11. I used the Dovecot config convert command, created a RHEL 5 x86_64 RPM, and tested it. I can connect, but once I log in, the IMAP service segfaults. I've also reproduced this on version 2.0.8.
I installed the debug RPM (for the stripped symbols), took a look at the core dump in gdb, and just got the _start method in ld_preload and a broken pointer in the backtrace. I tried to strace everything but the segfault doesn't happen then :-). I'm not sure what to do to get better information.
I attached my client session, dovecot -n output, gdb trace, maillog, and the core dump.
I published the RPMs I made at:
http://c3205012.r12.cf0.rackcdn.com/dovecot-2.0.11-1_125.x86_64.rpm
http://c3205012.r12.cf0.rackcdn.com/dovecot-debuginfo-2.0.11-1_125.x86_64.rp...
My platform is RHEL 5 with kernel 2.6.18 on x86_64.
Is there anything else I can provide for more info?
Thanks, Nick
Nick VonHollen Rackspace Software Developer Desk: 540-443-2003 (internal 505-2003) Personal Cell: 757-710-7038
On Wed, 2011-04-06 at 18:53 -0400, Nicholas VonHollen wrote:
> I'm trying to upgrade/test our configuration files from Dovecot 1.2 to 2.0.11. I used the Dovecot config convert command, created a RHEL 5 x86_64 rpm, and tested it. I can connect, but once I login, the IMAP service segfaults. I've also reproduced this on version 2.0.8.
See what happens when you run imap directly:
/usr/lib/dovecot/imap -u user@domain
Does it still crash? If not, you could check whether running it via valgrind logs anything useful:
service imap {
  executable = /usr/bin/valgrind /usr/lib/dovecot/imap
}
Hi Timo,
Thanks for helping me out. It looks like no matter what I do, changing the executable line doesn't help. I even changed it to "executable = /bin/false" and it still reports a segfault. I verified the config with "dovecot -n", stopped using the "-c /etc/dovecot.conf", moved the conf file to the appropriate place, re-verified it, and I still can't affect it. I assume the crash is happening post-fork, pre-exec, if there is custom code for launching child processes. Since strace 'fixes' the problem, I'll try to figure it out under gdb, but to be honest, my C debugging skills are very rusty.
Any ideas?
Logs with "executable = /bin/false" (same as before):
...
Apr 7 17:29:04 localhost dovecot: auth: Debug: Module loaded: /usr/lib64/dovecot/auth/libmech_gssapi.so
Apr 7 17:29:04 localhost dovecot: auth: Debug: passwd-file /etc/dovecot-master.passwd: Read 1 users
Apr 7 17:29:04 localhost dovecot: auth: Debug: pam(foo,127.0.0.1): lookup service=system-auth
Apr 7 17:29:05 localhost dovecot: auth: Debug: pam(foo,127.0.0.1): #1/1 style=1 msg=Password:
Apr 7 17:29:05 localhost dovecot: auth: Debug: client out: OK 1 user=foo
Apr 7 17:29:05 localhost dovecot: master: Error: service(imap): child 9690 killed with signal 11 (core dumps disabled)
Apr 7 17:29:05 localhost dovecot: master: Error: service(imap): command startup failed, throttling
Apr 7 17:29:05 localhost dovecot: imap-login: Error: read(imap) failed: Connection reset by peer
Apr 7 17:29:05 localhost dovecot: imap-login: Internal login failure (pid=9680 id=1) (auth failed, 1 attempts): user=<foo>, method=PLAIN, rip=127.0.0.1, lip=127.0.0.1, secured
Apr 7 17:29:05 localhost dovecot: auth: Debug: client in: CANCEL 1
Thanks, Nick
On 8.4.2011, at 0.17, Nicholas VonHollen wrote:
> Thanks for helping me out. It looks like no matter what I do, changing the executable line doesn't help. I even changed it to "executable = /bin/false" and it still reports a segfault. I verified the config with "dovecot -n", stopped using the "-c /etc/dovecot.conf", moved the conf file to the appropriate place, re-verified it, and I still can't affect it. I assume the crash is happening post-fork, pre-exec, if there is custom code for launching child processes. Since strace 'fixes' the problem, I'll try to figure it out under gdb, but to be honest, my C debugging skills are very rusty.
Ah, yes:
> Apr 7 17:29:05 localhost dovecot: master: Error: service(imap): child 9690 killed with signal 11 (core dumps disabled)
It has the "master:" prefix, so it's pre-exec. But you got a core file earlier, right? Then you should be able to get a usable gdb backtrace with "gdb dovecot core".
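When scanning a longer maillog for this kind of failure, it can help to pull out which process reported the child's death, which service the child belonged to, and which signal killed it. The following is only a sketch: the regular expression is modeled on the single log line quoted above, other Dovecot versions may format the message differently, and the function name `parse_child_death` is made up for this example rather than taken from Dovecot.

```python
import re

# Pattern modeled on the one log line quoted in this thread (assumption:
# other Dovecot versions or syslog setups may format it differently).
CHILD_DEATH = re.compile(
    r"dovecot: (?P<reporter>[\w-]+): Error: "
    r"service\((?P<service>[\w-]+)\): "
    r"child (?P<pid>\d+) killed with signal (?P<signal>\d+)"
)


def parse_child_death(line):
    """Return details of a 'child killed' log line, or None if it is not one."""
    match = CHILD_DEATH.search(line)
    if match is None:
        return None
    return {
        "reporter": match.group("reporter"),  # process that logged the message
        "service": match.group("service"),    # service the dead child belonged to
        "pid": int(match.group("pid")),
        "signal": int(match.group("signal")),
    }


line = ("Apr 7 17:29:05 localhost dovecot: master: Error: service(imap): "
        "child 9690 killed with signal 11 (core dumps disabled)")
info = parse_child_death(line)
print(info)
```

For the line above, `reporter` is `master` and `signal` is 11 (SIGSEGV on Linux); lines that do not describe a killed child yield `None`.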
"Timo Sirainen" <tss@iki.fi> said:
> On 8.4.2011, at 0.17, Nicholas VonHollen wrote:
> > Thanks for helping me out. It looks like no matter what I do, changing the executable line doesn't help. I even changed it to "executable = /bin/false" and it still reports a segfault. I verified the config with "dovecot -n", stopped using the "-c /etc/dovecot.conf", moved the conf file to the appropriate place, re-verified it, and I still can't affect it. I assume the crash is happening post-fork, pre-exec, if there is custom code for launching child processes. Since strace 'fixes' the problem, I'll try to figure it out under gdb, but to be honest, my C debugging skills are very rusty.
> Ah, yes:
> > Apr 7 17:29:05 localhost dovecot: master: Error: service(imap): child 9690 killed with signal 11 (core dumps disabled)
> It has "master:" prefix so it's pre-exec. But you got a core file earlier, right? You should be able to get a usable gdb backtrace then with "gdb dovecot core".
GDB still complains "warning: core file may not match specified executable file." when using anything but /bin/false. I tried using "gdb dovecot corefile" but got the same trace with nothing but _start. Is it possible to crash post-exec and pre-main? I'm not sure that makes sense, lol. You can't really pass pointers to child processes, so maybe ld is somehow screwed up? I can try it on a different OS with a similar RPM.
Nick VonHollen Rackspace Software Developer Desk: 540-443-2003 (internal 505-2003) Personal Cell: 757-710-7038
I feel pretty dumb.
When I converted the config file, I somehow set vsz_limit to 1k. The child process tries to load its core libraries, runs out of virtual memory, and is killed.
We have a lot of limits set in the config file, and I've seen warnings for other limits, but not this one.
Hopefully, someone else sees this and doesn't do the same thing. Would it make sense to raise some warnings for the vsz_limit itself? At least for some insanely low number like <= 1M?
Thanks, Nick
On 9.4.2011, at 1.01, Nick VonHollen wrote:
> When I converted the config file, I somehow set vsz_limit to 1k. The child process tries to load its core libraries, runs out of virtual memory, and is killed.
> We have a lot of limits set in the config file, and I've seen warnings for other limits, but not this one.
> Hopefully, someone else sees this and doesn't do the same thing. Would it make sense to raise some warnings for the vsz_limit itself? At least for some insanely low number like <= 1M?
There is such a warning! Unfortunately it's for numbers under 1 kB :) Yeah, I guess it could be safely increased quite a lot.
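Until the built-in threshold is raised, a check like the one Nick suggests could be run outside Dovecot against a converted config before deploying it. The sketch below is an illustration under stated assumptions, not Dovecot's own validation: the `name = value` line format mirrors what "dovecot -n" prints, the accepted size suffixes (B, k, M, G, T as powers of 1024) and the 1 MB threshold are guesses based on this thread, and `parse_size` and `find_low_vsz_limits` are hypothetical helper names.

```python
import re

# Size suffixes assumed for this sketch (powers of 1024).
UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

# Matches lines such as "vsz_limit = 1k"; commented-out lines do not match.
SETTING = re.compile(r"^\s*(?P<name>\w*vsz_limit)\s*=\s*(?P<value>\S.*?)\s*$")
SIZE = re.compile(r"^(?P<number>\d+)\s*(?P<unit>[bkmgt]?)b?$", re.IGNORECASE)


def parse_size(text):
    """Convert a size such as '1k' or '256 M' to bytes; raise ValueError otherwise."""
    match = SIZE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    return int(match.group("number")) * UNITS[match.group("unit").lower()]


def find_low_vsz_limits(config_text, minimum=1024**2):
    """Return (line_number, setting_name, bytes) for each vsz_limit below `minimum`."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = SETTING.match(line)
        if match is None:
            continue
        size = parse_size(match.group("value"))
        if size < minimum:
            findings.append((number, match.group("name"), size))
    return findings


# Made-up sample config for illustration only.
sample = """\
service imap {
  vsz_limit = 1k
}
service pop3 {
  vsz_limit = 256 M
}
"""
for number, name, size in find_low_vsz_limits(sample):
    print("line %d: %s is only %d bytes" % (number, name, size))
```

In the made-up sample, only the `1k` setting on line 2 is reported; the 256 M limit passes. The threshold is a parameter, so a stricter or looser floor can be tried without touching the parsing.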
Hi Nicholas,
> I'm trying to upgrade/test our configuration files from Dovecot 1.2 to 2.0.11. I used the Dovecot config convert command, created a RHEL 5 x86_64 rpm, and tested it. I can connect, but once I login, the IMAP service segfaults. I've also reproduced this on version 2.0.8.
Don't know if this is related: I had a problem with dovecot-2.0.11 under AIX 5.3, where it segfaulted right after connect. The reason was that struct login_binary remained uninitialized in login-common/main.c. The problem disappeared after configuring with --disable-shared. Maybe this is worth a try.
Regards, Klaus
participants (4): Klaus Desinger, Nicholas VonHollen, Nick VonHollen, Timo Sirainen