When we upgraded our kernels from 2.6.32.2 to 3.2.16, something strange happened. Under high traffic, dovecot/auth appears to stop responding.
We found many lines like this in the log, and clients can no longer authenticate:
dovecot: pop3-login: Error: net_connect_unix(pop3) failed: Resource temporarily unavailable (...)
Other errors follow in its wake:
dovecot: pop3: Error: Raw backtrace: /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x7768a3ca] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x7768a43b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7766048b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7769893a] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7769757f] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a) [0x77683c2a] -> dovecot/pop3(main+0xfc) [0x804a90c] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x774c04d3] -> dovecot/pop3() [0x804aba9]
dovecot: pop3: Error: Raw backtrace: /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x7768a3ca] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x7768a43b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7766048b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7769893a] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7769757f] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a) [0x77683c2a] -> dovecot/pop3(main+0xfc) [0x804a90c] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x774c04d3] -> dovecot/pop3() [0x804aba9]
dovecot: master: Error: service(pop3): child 18756 killed with signal 6 (core dumped)
dovecot: master: Error: service(pop3): child 18756 killed with signal 6 (core dumped)
dovecot: master: Error: service(pop3): command startup failed, throttling
dovecot: master: Error: service(pop3): command startup failed, throttling
dovecot: pop3-login: Error: Raw backtrace: /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x776b73ca] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x776b743b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7768d48b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x776c593a] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x776c457f] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a) [0x776b0c2a] -> /opt/dovecot2/lib/dovecot/libdovecot-login.so.0(main+0x143) [0x77705383] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x774ed4d3] -> dovecot/pop3-login() [0x8049471]
dovecot: pop3-login: Error: Raw backtrace: /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x776fd3ca] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x776fd43b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x776d348b] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7770b93a] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7770a57f] -> /opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a) [0x776f6c2a] -> /opt/dovecot2/lib/dovecot/libdovecot-login.so.0(main+0x143) [0x7774b383] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x775334d3] -> dovecot/pop3-login() [0x8049471]
An example stack trace (from pop3; pop3-login produces almost the same one):
#0 0x776f6424 in __kernel_vsyscall ()
No symbol table info available.
#1 0x7744d1ef in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
resultvar = <optimized out>
resultvar = <optimized out>
pid = 2002518004
selftid = 25476
#2 0x77450835 in __GI_abort () at abort.c:91
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x9bce4a8,
sa_sigaction = 0x9bce4a8}, sa_mask = {__val = {163409408, 2002781570,
163374248, 603, 163374280,
604, 163374280, 2001703379, 0, 2002790760, 2140717252,
163374280, 0, 2003786736, 2002596704, 0, 2002618953, 2003087348,
2140717196, 0, 163409408,
2001286473, 163374248, 10, 2000834616, 2002534400, 604,
2003087348, 604, 2002791863, 4294967295, 10}}, sa_flags = 2140717316,
sa_restorer = 0x7764bd84
We did some investigation. A restart helped for a moment, but the problem kept returning. When we went back to the older kernel, everything returned to normal.
We compiled Dovecot with poll instead of epoll (--with-ioloop=poll), and this works for us.
Is there a known problem with epoll on 3.2.x kernels? Or is this a Dovecot problem? Maybe the problem is not caused by epoll itself, but epoll interacts with it somehow.
The problem occurs on both Dovecot 2.0.11 and 2.0.20.
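
One possible way to help separate a kernel-level epoll problem from a Dovecot one is a small smoke test outside Dovecot. The following is only a sketch, assuming Python 3 is available on the affected Linux host; check_readiness is a hypothetical helper name, not part of Dovecot. It registers one end of a UNIX socket pair with either epoll or poll and checks that the kernel reports it readable. A single idle socket pair does not reproduce high-traffic conditions, so a pass here would not rule out a kernel problem.

```python
import select
import socket


def check_readiness(use_epoll: bool) -> bool:
    """Return True if a readable UNIX socket is reported as ready.

    Hypothetical helper for illustration only; Linux-specific when
    use_epoll is True, because select.epoll exists only on Linux.
    """
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        b.sendall(b"x")  # make `a` readable
        if use_epoll:
            poller = select.epoll()
            try:
                # An OSError here would mean the registration itself failed.
                poller.register(a.fileno(), select.EPOLLIN)
                events = poller.poll(1.0)  # timeout in seconds
            finally:
                poller.close()
            return any(fd == a.fileno() and bool(ev & select.EPOLLIN)
                       for fd, ev in events)
        poller = select.poll()
        poller.register(a.fileno(), select.POLLIN)
        events = poller.poll(1000)  # timeout in milliseconds
        return any(fd == a.fileno() and bool(ev & select.POLLIN)
                   for fd, ev in events)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print("epoll:", check_readiness(True))
    print("poll: ", check_readiness(False))
```

Running it on both the 2.6.32.2 and the 3.2.16 kernel and comparing the two results would at least show whether basic epoll registration behaves differently between them.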
-- Len7hir