[Dovecot] Panic in 1.1.6 file ioloop.c: line 206
Hi all,
I just had a panic with dovecot 1.1.6, two days after replacing 1.0.15 with it in production.
Here are the logs immediately preceding the panic, I think they're related. It seems the user connected simultaneously from two different clients, from his own computer and a webmail running on the server, then manipulated mail folders on one side, and logged out. The panic occurred 3 seconds after the logout.
Is there a setting I might have missed that could avoid that?
TIA,
Laurent
Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): stat() failed with mbox file /home/user/r&AOk-union-DA: No such file or directory Nov 27 19:25:59 balif last message repeated 1 time Nov 27 19:25:59 balif dovecot: [ID 107833 local2.info] IMAP(user): Disconnected: Logged out bytes=436/4810193 Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): file_dotlock_open() failed with file /var/dovecot/index/user/.imap/R&AOk-union-DAFC-DCBE-DMCT/dovecot.index.log: No such file or directory Nov 27 19:25:59 balif last message repeated 1 time Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): stat() failed with mbox file /home/user/R&AOk-union-DAFC-DCBE-DMCT: No such file or directory Nov 27 19:25:59 balif dovecot: [ID 107833 local2.info] IMAP(user): Disconnected: Logged out bytes=494/314696 Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): file_dotlock_open() failed with file /var/dovecot/index/user/.imap/r&AOk-union-directeurs/dovecot.index.log: No such file or directory Nov 27 19:25:59 balif last message repeated 1 time Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): stat() failed with mbox file /home/user/r&AOk-union-directeurs: No such file or directory Nov 27 19:25:59 balif dovecot: [ID 107833 local2.info] IMAP(user): Disconnected: Logged out bytes=470/1138733 Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): file_dotlock_open() failed with file /var/dovecot/index/user/.imap/NACRE/dovecot.index.log: No such file or directory Nov 27 19:25:59 balif last message repeated 1 time Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): stat() failed with mbox file /home/user/NACRE: No such file or directory Nov 27 19:25:59 balif last message repeated 1 time Nov 27 19:25:59 balif dovecot: [ID 107833 local2.info] IMAP(user): Disconnected: Logged out bytes=2429/174842484 Nov 27 19:25:59 balif dovecot: [ID 107833 local2.info] IMAP(user): Disconnected: Logged out bytes=467/10178 Nov 27 19:26:02 balif dovecot: [ID 107833 local2.crit] Panic: file ioloop.c: line 206: assertion failed: (tv_r->tv_sec >= 0 && tv_r->tv_usec >= 0) Nov 27 19:26:02 balif dovecot: [ID 398108 local2.error] Raw backtrace: 0x24d90 -> 0x2851c -> 0x286ac -> 0x28728 -> 0x28de0 -> 0x2877c -> 0x1f56c -> 0x15c40
-- / Leader de Projet & Communauté | I'm working, but not speaking for \ G11N http://fr.opensolaris.org | Bull Services http://www.bull.com / FOSUG http://guses.org |
On Fri, 2008-11-28 at 09:24 +0100, Laurent Blume wrote:
Here are the logs immediately preceding the panic, I think they're related. It seems the user connected simultaneously from two different clients, from his own computer and a webmail running on the server, then manipulated mail folders on one side, and logged out. The panic occurred 3 seconds after the logout.
Is there a setting I might have missed that could avoid that?
No.
Nov 27 19:25:59 balif dovecot: [ID 107833 local2.error] IMAP(user): stat() failed with mbox file /home/user/r&AOk-union-DA: No such file or directory
These errors shouldn't be logged. This should fix it: http://hg.dovecot.org/dovecot-1.1/rev/e24a2528ce21
Nov 27 19:26:02 balif dovecot: [ID 107833 local2.crit] Panic: file ioloop.c: line 206: assertion failed: (tv_r->tv_sec >= 0 && tv_r->tv_usec >= 0)
I can't really think of why this would have happened. You wouldn't happen to have a core dump available from this? That would be very helpful in figuring out why this happened.
But anyway I did some changes to that code, maybe it'll give a different assert-crash the next time: http://hg.dovecot.org/dovecot-1.1/rev/75f36619846f
Timo Sirainen a écrit :
I can't really think of why this would have happened. You wouldn't happen to have a core dump available from this? That would be very helpful in figuring out why this happened.
I have, here are the pstacks:
$ pstack core.dovecot.7911 core 'core.dovecot.7911' of 7911: /usr/local/dovecot/sbin/dovecot -c /usr/local/dovecot/etc/dovecot.conf ff1c1adc _lwp_kill (6, 0, 0, ff1a4d94, ffffffff, 6) + 8 ff140218 abort (384bc, 1, 4a584, aa1a0, ff1ed2d8, 0) + 110 00024c28 default_fatal_finish (4, 0, 38be0, ffbffb88, 38400, 55e78) + 68 00024d90 i_panic (38be0, 4c8f8, ce, 4c904, 20, 1e7a0) + 2c 0002851c timeout_get_wait_time (7c, ffbffc00, 0, 3e8, 38800, fffffe82)
- 15c 000286ac io_loop_handle_timeouts_real (591b0, 5cc70, 17714, 4dc00, 38c00, 4b878) + 120 00028728 io_loop_handle_timeouts (591b0, ef, 3d0, 10624c00, e8b250, 3a2c9400) + 10 00028de0 io_loop_handler_run (591b0, 1, 0, 38c00, 0, 5cbd0) + 60 0002877c io_loop_run (591b0, 80000000, 80, 0, 80000000, 80000000) + 3c 0001f56c main (36000, 4d000, 0, 4dc00, 0, 0) + 4f0 00015c40 _start (0, 0, 0, 0, 0, 0) + 108
$ pstack core.imap.7944 core 'core.imap.7944' of 7944: imap ff1c1adc _lwp_kill (6, 0, 0, ff1a4d94, ffffffff, 6) + 8 ff140218 abort (e0df0, 1, f3f88, aa1a0, ff1ed2d8, 0) + 110 000b4784 default_fatal_finish (4, 0, e1d58, ffbff818, e0c00, f5cbc) + 68 000b48ec i_panic (e1d58, 10c000, ce, 10c00c, a0, b4f00) + 2c 000bbfe0 timeout_get_wait_time (26c, ffbff890, 0, 1388, e1c00, ffffff73) + 15c 000bc170 io_loop_handle_timeouts_real (11b318, 115f80, 6f6d4, 10d000, e1c00, f5cfc) + 120 000bc1ec io_loop_handle_timeouts (11b318, 3, 1388, 10624c00, 0, 0) + 10 000bc8a4 io_loop_handler_run (11b318, 1, 0, e1c00, 80000000, 115aa0) + 60 000bc240 io_loop_run (11b318, 80000000, 80, 0, 80000000, 80000000) + 3c 00038a10 main (1, ffbffa94, ffbffa9c, f4400, 10d000, 10d000) + a0 000215a8 _start (0, 0, 0, 0, 0, 0) + 108
Does it help?
But anyway I did some changes to that code, maybe it'll give a different assert-crash the next time: http://hg.dovecot.org/dovecot-1.1/rev/75f36619846f
Thanks, I'll try to reproduce the problem and apply those fixes,
Laurent
/ Leader de Projet & Communauté | I'm working, but not speaking for \ G11N http://fr.opensolaris.org | Bull Services http://www.bull.com / FOSUG http://guses.org |
On Mon, 2008-12-01 at 14:15 +0100, Laurent Blume wrote:
Timo Sirainen a écrit :
I can't really think of why this would have happened. You wouldn't happen to have a core dump available from this? That would be very helpful in figuring out why this happened.
I have, here are the pstacks:
I was thinking more about looking at what the variables' values were. Like with gdb:
fr 5 p *timeout p *tv_r p *tv_now
participants (2)
-
Laurent Blume
-
Timo Sirainen