There is a known Problem with epoll, at least on Red Hat / CentOS, this bugzilla may give you additional info (comments of Timo inside)
https://bugzilla.redhat.com/show_bug.cgi?id=681578
Regards
Javier
El 30/05/2012 17:45, Konrad . escribió:
When we upgrade our kernels from 2.6.32.2 to 3.2.16 something strange has happened. On high traffic dovecot/auth looks like not responding.
We found a lot of this lines at the log: dovecot: pop3-login: Error: net_connect_unix(pop3) failed: Resource temporarily unavailable
(...) and clients stop authorizing
Some other errors follow in the
wake of:
dovecot: pop3: Error: Raw backtrace:
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x7768a3ca] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x7768a43b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7766048b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7769893a] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7769757f] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a)
[0x77683c2a] -> dovecot/pop3(main+0xfc) [0x804a90c] ->
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x774c04d3] ->
dovecot/pop3() [0x804aba9]
dovecot: pop3: Error: Raw backtrace:
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x7768a3ca] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x7768a43b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7766048b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7769893a] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7769757f] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a)
[0x77683c2a] -> dovecot/pop3(main+0xfc) [0x804a90c] ->
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x774c04d3] ->
dovecot/pop3() [0x804aba9]
dovecot: master: Error: service(pop3): child 18756 killed with signal 6 (core dumped) dovecot: master: Error: service(pop3): child 18756 killed with signal 6 (core dumped)
dovecot: master: Error: service(pop3): command startup failed, throttling
dovecot: master: Error: service(pop3): command startup failed, throttling dovecot: pop3-login: Error: Raw backtrace:
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x776b73ca] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x776b743b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7768d48b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x776c593a] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x776c457f] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a)
[0x776b0c2a] ->
/opt/dovecot2/lib/dovecot/libdovecot-login.so.0(main+0x143)
[0x77705383] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)
[0x774ed4d3] -> dovecot/pop3-login() [0x8049471]
dovecot: pop3-login: Error: Raw backtrace:
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x373ca) [0x776fd3ca] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3743b) [0x776fd43b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x776d348b] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x4593a) [0x7770b93a] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(io_add+0xaf) [0x7770a57f] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(master_service_init_finish+0x19a)
[0x776f6c2a] ->
/opt/dovecot2/lib/dovecot/libdovecot-login.so.0(main+0x143)
[0x7774b383] -> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)
[0x775334d3] -> dovecot/pop3-login() [0x8049471]
And example stack
trace (from pop3, pop3-login throws almost the same):
#0 0x776f6424 in __kernel_vsyscall () No symbol table info available. #1 0x7744d1ef in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
resultvar =
resultvar = pid = 2002518004 selftid = 25476 #2 0x77450835 in __GI_abort () at abort.c:91 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x9bce4a8, sa_sigaction = 0x9bce4a8}, sa_mask = {__val = {163409408, 2002781570, 163374248, 603, 163374280, 604, 163374280, 2001703379, 0, 2002790760, 2140717252,
163374280, 0, 2003786736, 2002596704, 0, 2002618953, 2003087348,
2140717196, 0, 163409408,
2001286473, 163374248, 10, 2000834616, 2002534400, 604, 2003087348, 604, 2002791863, 4294967295, 10}}, sa_flags = 2140717316, sa_restorer = 0x7764bd84 } sigs = {__val = {32, 0 }} #3 0x77603390 in default_fatal_finish (type=, status=) at failures.c:187 backtrace = 0x9bce098
"/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x3837a) [0x7760337a] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(+0x383eb) [0x776033eb] ->
/opt/dovecot2/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x775d877a"...
#4 0x776033eb in i_internal_fatal_handler (ctx=0x7f98c174,
format=0x9bd6db0 "(25476) epoll_ctl(%s, %d) failed: %m",
args=0x7f98c194 "\323bw06")
at failures.c:688 status = 0 #5 0x775d877a in i_panic (format=0x7762d364 "epoll_ctl(%s, %d) failed: %m") at failures.c:263 ctx = {type = LOG_TYPE_PANIC, exit_status = 0, timestamp = 0x0} args = 0x7f98c194 "\323bw06" f = #6 0x77611d8a in io_loop_handle_add (io=0x9bd6d78) at ioloop-epoll.c:119 ---Type to continue, or q to quit--- ctx = 0x9bd64d8 list = event = {events = 24, data = {ptr = 0x9bd6da0, fd = 163409312, u32 = 163409312, u64 = 163409312}} op = 1 first = true #7 0x776109af in io_add (fd=6, condition=IO_ERROR, callback=0x775fd030 , context=0x9bd6300) at
ioloop.c:54
io = __FUNCTION__ = "io_add" #8 0x775fca9a in master_service_init_finish (service=0x9bd6300) at
master-service.c:381
sigint_flags = st = {st_dev = 8, __pad1 = 0, __st_ino = 1286567453, st_mode = 4480, st_nlink = 1, st_uid = 0, st_gid = 0, st_rdev = 0, __pad2 = 0, st_size = 0, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1338389522, tv_nsec = 269436805}, st_mtim = {tv_sec = 1338389522, tv_nsec = 269436805}, st_ctim = {
tv_sec = 1338389522, tv_nsec = 269436805}, st_ino = 1286567453}
value
count = 1 __FUNCTION__ = "master_service_init_finish" #9 0x0804a954 in main (argc=1, argv=0x7f98c414) at main.c:232 set_roots = {0x804ee40, 0x0} service_flags = storage_service_flags = MAIL_STORAGE_SERVICE_FLAG_DISALLOW_ROOT postlogin_socket_path = 0x0
username = 0x0
- everything back to normal.
c =
We do some investigation. Restart helped for a moment, but problem was returning. When we back to older kernel
We compile Dovecot with poll instead
this works for us.
Is any
of epoll (--with-ioloop=poll) and problem with epoll on 3.2.x kernels? Or maybe this is Dovecot problem?
Maybe this is not connected with epoll, but epoll is interferes with this.
Problem on Dovecot 2.0.11 and 2.0.20.