-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
See the end of the message for a possible delete of a closed fd from epoll() problem.
Every now and then I get a Panic after a successfull LMTP delivery:
2014-11-12 13:25:31 lmtp(20890): Info: Connect from local 2014-11-12 13:25:32 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:25:34 lmtp(20890): Info: Connect from local 2014-11-12 13:25:35 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:26:11 lmtp(20890): Info: Connect from local 2014-11-12 13:26:12 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:26:12 lmtp(20890): Info: Connect from local 2014-11-12 13:26:13 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:26:13 lmtp(20890): Info: Connect from local 2014-11-12 13:26:14 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:26:15 lmtp(20890): Info: Connect from local 2014-11-12 13:26:15 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:27:05 lmtp(20890): Info: Connect from local 2014-11-12 13:27:06 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:27:07 lmtp(20890): Info: Connect from local 2014-11-12 13:27:07 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:27:08 lmtp(20890): Info: Connect from local 2014-11-12 13:27:08 lmtp(20890): Info: Disconnect from local: Connection closed (in reset) 2014-11-12 13:27:18 lmtp(20890): Info: Connect from local 2014-11-12 13:27:18 lmtp(20890): Info: Disconnect from local: Successful quit 2014-11-12 13:27:25 lmtp(20890): Info: Connect from local 2014-11-12 13:27:25 lmtp(20890): Info: Disconnect from local: Successful quit lmtp(18385): Panic: epoll_ctl(del, 11) failed: Bad file descriptor Error: Raw backtrace: /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(+0x934c6) [0x7fafdc2c04c6] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(+0x949d0) [0x7fafdc2c19d0] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7fafdc2c0879] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_loop_handle_remove+0x129) [0x7fafdc2dcc85] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(+0xacf16) [0x7fafdc2d9f16] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_remove+0x1d) [0x7fafdc2d9f5d] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(+0xbcd5c) [0x7fafdc2e9d5c] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(+0xbce13) [0x7fafdc2e9e13] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_stream_unref+0x7d) [0x7fafdc2cb30d] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(o_stream_unref+0x82) [0x7fafdc2e86ae] -> /usr/local/dovecot-2.2.15/lib/dovecot/libssl_iostream_openssl.so(+0x5b93) [0! x7fafdb69cb93] -> /usr/local/dovecot-2.2.15/lib/dovecot/libssl_iostream_openssl.so(+0x5ce9) [0x7fafdb69cce9] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(ssl_iostream_unref+0x36) [0x7fafdc2b8bb9] -> /usr/local/dovecot-2.2.15/lib/dovecot/libssl_iostream_openssl.so(+0x911a) [0x7fafdb6a011a] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_stream_unref+0x8e) [0x7fafdc2cb31e] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(o_stream_unref+0x82) [0x7fafdc2e86ae] -> dovecot2.2/lmtp idling [0x40560d] -> dovecot2.2/lmtp idling [0x405637] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_loop_call_io+0xcd) [0x7fafdc2dae5b] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x1e4) [0x7fafdc2dcec0] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x18) [0x7fafdc2daff6] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(io_loop_run+0xaf) [0x7fafd! c2daf4f] -> /usr/local/dovecot-2.2.15/lib/dovecot/libdovecot.so.0(master_service_run+0x2e) [0x7fafdc25b1ad] -> dovecot2.2/lmtp idling [0x4050d6] -> /lib/libc.so.6(__libc_start_main+0xfd) [0x7fafdbee9c8d] -> dovecot2.2/lmtp idling [0x404db9] 2014-11-12 12:46:34 lmtp(18385): Fatal: master: service(lmtp): child 18385 killed with signal 6 (core dumped)
Core was generated by `dovecot2.2/lmtp'.
Program terminated with signal 6, Aborted.
#0 0x00007fafdbefd1b5 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007fafdbefd1b5 in raise () from /lib/libc.so.6
#1 0x00007fafdbefffc0 in abort () from /lib/libc.so.6
#2 0x00007fafdc2c0515 in default_fatal_finish (type=LOG_TYPE_PANIC, status=0) at failures.c:202
#3 0x00007fafdc2c19d0 in i_internal_fatal_handler (ctx=0x7fff64e1b960, format=0x7fafdc3109fc "%s", args=0x7fff64e1b940) at failures.c:666
#4 0x00007fafdc2c0879 in i_panic (format=0x7fafdc3109fc "%s") at failures.c:276
#5 0x00007fafdc2dcc85 in io_loop_handle_remove (io=0x2093920, closed=false) at ioloop-epoll.c:150
#6 0x00007fafdc2d9f16 in io_remove_full (_io=0x205f078, closed=false) at ioloop.c:143
#7 0x00007fafdc2d9f5d in io_remove (io=0x205f078) at ioloop.c:157
#8 0x00007fafdc2e9d5c in stream_closed (fstream=0x205ef90) at ostream-file.c:57
#9 0x00007fafdc2e9e13 in o_stream_file_close (stream=0x205ef90, close_parent=false) at ostream-file.c:78
#10 0x00007fafdc2cb30d in io_stream_unref (stream=0x205ef90) at iostream.c:41
#11 0x00007fafdc2e86ae in o_stream_unref (_stream=0x2079308) at ostream.c:88
#12 0x00007fafdb69cb93 in openssl_iostream_free (ssl_io=0x20792e0) at iostream-openssl.c:273
#13 0x00007fafdb69cce9 in openssl_iostream_unref (ssl_io=0x20792e0) at iostream-openssl.c:289
#14 0x00007fafdc2b8bb9 in ssl_iostream_unref (_ssl_io=0x2079670) at iostream-ssl.c:126
#15 0x00007fafdb6a011a in o_stream_ssl_destroy (stream=0x2079590) at ostream-openssl.c:28
#16 0x00007fafdc2cb31e in io_stream_unref (stream=0x2079590) at iostream.c:42
#17 0x00007fafdc2e86ae in o_stream_unref (_stream=0x7fff64e1bcd8) at ostream.c:88
#18 0x000000000040560d in client_input_handle (client=0x205e160) at client.c:139
#19 0x0000000000405637 in client_input (client=0x205e160) at client.c:146
#20 0x00007fafdc2dae5b in io_loop_call_io (io=0x20c3c50) at ioloop.c:501
#21 0x00007fafdc2dcec0 in io_loop_handler_run_internal (ioloop=0x2039720) at ioloop-epoll.c:220
#22 0x00007fafdc2daff6 in io_loop_handler_run (ioloop=0x2039720) at ioloop.c:548
#23 0x00007fafdc2daf4f in io_loop_run (ioloop=0x2039720) at ioloop.c:525
#24 0x00007fafdc25b1ad in master_service_run (service=0x20395b0, callback=0x404e74
strace shows this:
read(11, "\27\3\1\0 q\r\252\3551\21\237l\33\330\33\303\340\306l\334k\0360p\303)HF\331\234g"..., 1261) = 37 read(11, 0xf09908, 1224) = -1 EAGAIN (Resource temporarily unavailable) epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 write(11, "\27\3\1\0 \204\375\373!\253\263n\204\274duj/\n\202\236\373\342\303[\22\17\264\10\23\346\225"..., 90) = 90 epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 write(2, "\1\01028256 prefix=lmtp(28256): \n", 29) = 29 write(2, "\1\00228256 Disconnect from local: S"..., 47) = 47 epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 epoll_ctl(10, EPOLL_CTL_MOD, 11, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 epoll_ctl(10, EPOLL_CTL_DEL, 11, {0, {u32=15754432, u64=15754432}}) = 0 write(11, "\25\3\1\0 \232h\200\203\3\350\34\265\1\305\6\302\276\272BD \236\353\267 \1\24\16\327\16I"..., 37) = 37 epoll_ctl(10, EPOLL_CTL_ADD, 11, {EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=15754432, u64=15754432}}) = 0 close(11) = 0 write(5, "`n\0\0M\0\0\0\1\0\0\0", 12) = 12 epoll_ctl(10, EPOLL_CTL_DEL, 11, {0, {u32=15754432, u64=15754432}}) = -1 EBADF (Bad file descriptor) write(2, "\1\00628256 epoll_ctl(del, 11) faile"..., 55) = 55
lsof shows for fd's 10 and 11:
lmtp 28256 root 10u 0000 0,9 0 548 anon_inode lmtp 28256 root 11u unix 0xffff880039336100 0t0 633297 /var/run/dovecot2.2/lmtp
my man epoll says:
Q6 Will closing a file descriptor cause it to be removed from all epoll sets automatically?
A6 Yes, but be aware of the following point. A file descriptor is a reference to an open file description (see open(2)). Whenever a descriptor is duplicated via
dup(2), dup2(2), fcntl(2) F_DUPFD, or fork(2), a new file descriptor referring to the same open file description is created. An open file description continues to
exist until all file descriptors referring to it have been closed. A file descriptor is removed from an epoll set only after all the file descriptors referring to
the underlying open file description have been closed (or before if the descriptor is explicitly removed using epoll_ctl() EPOLL_CTL_DEL). This means that even after
a file descriptor that is part of an epoll set has been closed, events may be reported for that file descriptor if other file descriptors referring to the same under‐
lying file description remain open.
So the close(11) should be enough to remove the fd 11 from epoll(10)
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux)
iQEVAwUBVGNvlHz1H7kL/d9rAQKYcAf/dJKQGFqi8wZBwG1GevEc+DYJeZYGSUxB YOtMV+szUrVyl+vD0GZNyXKotO6/xn0gipeV1tLTZ1xIg7NIsVcJ2/DC1WENenrE YePmuuX/TQNpj7nOcQiATFtOtenqV9itwm5zrhTN1vzjyVbEnPwhVjsaOAnaUk9u HN/BdxTazDd7QTEQmJNMKJBFyg/Im0ktJpo/eAKuysv/mcZ9HOorTQhtocK7Goro YqOWacuEZVA0TUtD1hqDn9OfvEymp0oPoM8gEYON7lK4OQ9d6+hWMpkH5BG5x+7Z ZPmq9xUJW17tcbe2imKkXw+RBRNofgf5GyhMwNVoLhbOKa29tYWqXw== =yIiy -----END PGP SIGNATURE-----