[Dovecot] 1.2.9 imap crash with backtrace
From the log:
Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): copy: uid=3526, dest=Deleted Messages, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): delete: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de> Jan 3 20:26:11 postamt dovecot: IMAP(nixxxxx): expunge: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): copy: uid=415, box=Deleted Messages, dest=INBOX, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): /home/n/i/nixxxxx/Maildir/dovecot-uidlist: Duplicate file entry at line 1: 1262534293.M822904P8880.postamt.charite.de,S=3604,W=3690:2,RSaf (uid 3528 -> 3531) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Raw backtrace: imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5] -> imap(imap_sync_init+0x54) [0x8070434] -> imap [0x8062aa2] -> imap [0x80a619c] -> imap(io_loop_handle_timeouts+0xe9) [0x80f9959] -> imap(io_loop_handler_run+0x82) [0x80fa222] -> imap(io_loop_run+0x20) [0x80f96f0] -> imap(main+0x5d2) [0x8070f52] -> /lib/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb75bcb55] -> imap [0x80602d1] Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): delete: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com> Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): expunge: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=63917/47203 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=360/168380 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=670290/672304
GNU gdb (GDB) 7.0-debian Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/local/libexec/dovecot/imap...done. Reading symbols from /lib/i686/cmov/libdl.so.2...Reading symbols from /usr/lib/debug/lib/i686/cmov/libdl-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libdl.so.2 Reading symbols from /lib/i686/cmov/librt.so.1...Reading symbols from /usr/lib/debug/lib/i686/cmov/librt-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/librt.so.1 Reading symbols from /lib/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/i686/cmov/libc-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libc.so.6 Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /lib/i686/cmov/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/i686/cmov/libpthread-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libpthread.so.0 Reading symbols from /usr/local/lib/dovecot/imap/lib10_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib10_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_trash_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_trash_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_fts_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_fts_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libbz2.so.1.0...(no debugging symbols found)...done. Loaded symbols for /lib/libbz2.so.1.0 Reading symbols from /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libgcc_s.so.1 Core was generated by `imap [nixxxxx 87.162.56.10]'. Program terminated with signal 6, Aborted. #0 0xb7708424 in __kernel_vsyscall () #0 0xb7708424 in __kernel_vsyscall () No symbol table info available. #1 0xb75d08e0 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar = <value optimized out> pid = -1217490956 selftid = 25023 #2 0xb75d3e15 in *__GI_abort () at abort.c:88 act = {__sigaction_handler = {sa_handler = 0x9505708, sa_sigaction = 0x9505708}, sa_mask = {__val = {3219827960, 135197034, 3077474016, 0, 3219828080, 3219828068, 128, 0, 0, 3219828144, 3077727856, 134566866, 549, 0, 0, 0, 1, 135270350, 128, 1, 3219828068, 156260072, 69, 3219828196, 3219828040, 135198630, 156260653, 3219828068, 3219828072, 550, 156260104, 3076940867}}, sa_flags = 550, sa_restorer = 0x8106dd9 <write_full+57>} sigs = {__val = {32, 0 <repeats 31 times>}} #3 0x080f1645 in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:160 backtrace = 0x95052d8 "imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5]"... #4 0x080f16a2 in i_internal_fatal_handler (type=LOG_TYPE_PANIC, status=0, fmt=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)", args=0xbfeaae44 ":\355\020\b\223\001") at failures.c:443 No locals. #5 0x080f1019 in i_panic ( format=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)") at failures.c:207 No locals. #6 0x08088fa6 in maildir_uidlist_records_array_delete (uidlist=0x951b0b0, rec=0x9534500) at maildir-uidlist.c:403 recs = 0x9536fa0 __PRETTY_FUNCTION__ = "maildir_uidlist_records_array_delete" #7 0x08089fc9 in maildir_uidlist_next (uidlist=0x951b0b0) at maildir-uidlist.c:544 rec = 0x95346b8 uid = 3531 #8 maildir_uidlist_update_read (uidlist=0x951b0b0) at maildir-uidlist.c:741 storage = 0x950f4b8 orig_uid_validity = 1186140978 fd = 14 orig_next_uid = 3531 input = 0x95347a0 st = {st_dev = 65024, __pad1 = 0, __st_ino = 5228663, st_mode = 33152, st_nlink = 1, st_uid = 50357, st_gid = 100, st_rdev = 0, __pad2 = 0, st_size = 763, st_blksize = 4096, st_blocks = 8, st_atim = { tv_sec = 1262546767, tv_nsec = 269770575}, st_mtim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ctim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ino = 5228663} last_read_offset = 693 #9 maildir_uidlist_refresh (uidlist=0x951b0b0) at maildir-uidlist.c:889 i = 0 retry = <value optimized out> ret = <value optimized out> #10 0x08086704 in have_recent_messages (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:659 hdr = <value optimized out> #11 maildir_sync_get_changes (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:679 flags = <value optimized out> #12 maildir_sync_context (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:714 sync_flags = <value optimized out> flags = 135325664 new_changed = <value optimized out> cur_changed = <value optimized out> lock_failure = <value optimized out> fname = 0x9515030 "/home/n/i/nixxxxx/Maildir" ret = <value optimized out> __PRETTY_FUNCTION__ = "maildir_sync_context" #13 0x08086be5 in maildir_storage_sync_init (box=0x9514e98, flags=0) at maildir-sync.c:924 _data_stack_cur_id = 4 ctx = 0x9505210 lost_files = <value optimized out> ret = 0 __PRETTY_FUNCTION__ = "maildir_storage_sync_init" #14 0x08070434 in imap_sync_init (client=0x950fba0, box=0x9514e98, imap_flags=0, flags=0) at imap-sync.c:146 ctx = 0x95191a0 __PRETTY_FUNCTION__ = "imap_sync_init" #15 0x08062aa2 in idle_sync_now (box=<value optimized out>, ctx=0x950feb8) at cmd-idle.c:127 __PRETTY_FUNCTION__ = "idle_sync_now" #16 0x080a619c in notify_delay_callback (ibox=0x9514e98) at index-mailbox-check.c:49 No locals. #17 0x080f9959 in io_loop_handle_timeouts_real (ioloop=0x950d9b0) at ioloop.c:313 item = 0x9563c78 tv = {tv_sec = 0, tv_usec = 0} tv_call = {tv_sec = 1262546802, tv_usec = 336801} t_id = 3 #18 io_loop_handle_timeouts (ioloop=0x950d9b0) at ioloop.c:324 _data_stack_cur_id = 2 #19 0x080fa222 in io_loop_handler_run (ioloop=0x950d9b0) at ioloop-epoll.c:180 ctx = 0x950dab8 event = <value optimized out> list = <value optimized out> io = <value optimized out> tv = {tv_sec = 0, tv_usec = 108829} t_id = <value optimized out> msecs = 109 ret = 0 i = <value optimized out> j = <value optimized out> call = <value optimized out> #20 0x080f96f0 in io_loop_run (ioloop=0x950d9b0) at ioloop.c:335 No locals. #21 0x08070f52 in main (argc=) at main.c:327 No locals. $1 = {prev = 0x0, next = 0x0, client = 0x950fba0, pool = 0x950fe18, tag = 0x950fea8 "104.6", name = 0x950feb0 "IDLE", args = 0x0, cmd_flags = 10, func = 0x80628b0 <cmd_idle_continue>, context = 0x950feb8, module_contexts = {arr = {buffer = 0x950fe70, element_size = 4}, v = 0x950fe70, v_modifiable = 0x950fe70}, parser = 0x9512620, state = CLIENT_COMMAND_STATE_WAIT_INPUT, sync = 0x0, uid = 0, cancel = 0, param_error = 0, search_save_result = 0, temp_executed = 0} (gdb) quit
----- End forwarded message -----
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
Ralf, you're seeing the same panic I've been going back and forth with Timo on for the last two weeks. Do you think you could run a couple of quick gdb commands? Timo asked me for these last week, but I thought it might be useful if they came from you as well for comparison. It also looks like you have optimization turned off which is great for debugging.
From Timo...
Could you do in gdb something like:
fr 6 (or whatever frame gives usable results) p *rec p count p *recs[0] p *recs[1] p *recs[..up until count-1]
If count is large, the main things I want to know are:
Is the array sorted (recs[n]->uid < recs[n+1]->uid always)?
is rec->uid anywhere in the recs array?
On 01/04/2010 05:09 AM, Ralf Hildebrandt wrote:
From the log:
Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): copy: uid=3526, dest=Deleted Messages, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): delete: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de> Jan 3 20:26:11 postamt dovecot: IMAP(nixxxxx): expunge: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): copy: uid=415, box=Deleted Messages, dest=INBOX, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): /home/n/i/nixxxxx/Maildir/dovecot-uidlist: Duplicate file entry at line 1: 1262534293.M822904P8880.postamt.charite.de,S=3604,W=3690:2,RSaf (uid 3528 -> 3531) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Raw backtrace: imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5] -> imap(imap_sync_init+0x54) [0x8070434] -> imap [0x8062aa2] -> imap [0x80a619c] -> imap(io_loop_handle_timeouts+0xe9) [0x80f9959] -> imap(io_loop_handler_run+0x82) [0x80fa222] -> imap(io_loop_run+0x20) [0x80f96f0] -> imap(main+0x5d2) [0x8070f52] -> /lib/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb75bcb55] -> imap [0x80602d1] Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): delete: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com> Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): expunge: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=63917/47203 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=360/168380 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=670290/672304
GNU gdb (GDB) 7.0-debian Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later<http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/local/libexec/dovecot/imap...done. Reading symbols from /lib/i686/cmov/libdl.so.2...Reading symbols from /usr/lib/debug/lib/i686/cmov/libdl-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libdl.so.2 Reading symbols from /lib/i686/cmov/librt.so.1...Reading symbols from /usr/lib/debug/lib/i686/cmov/librt-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/librt.so.1 Reading symbols from /lib/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/i686/cmov/libc-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libc.so.6 Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /lib/i686/cmov/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/i686/cmov/libpthread-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libpthread.so.0 Reading symbols from /usr/local/lib/dovecot/imap/lib10_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib10_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_trash_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_trash_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_fts_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_fts_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libbz2.so.1.0...(no debugging symbols found)...done. Loaded symbols for /lib/libbz2.so.1.0 Reading symbols from /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libgcc_s.so.1 Core was generated by `imap [nixxxxx 87.162.56.10]'. Program terminated with signal 6, Aborted. #0 0xb7708424 in __kernel_vsyscall () #0 0xb7708424 in __kernel_vsyscall () No symbol table info available. #1 0xb75d08e0 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar =<value optimized out> pid = -1217490956 selftid = 25023 #2 0xb75d3e15 in *__GI_abort () at abort.c:88 act = {__sigaction_handler = {sa_handler = 0x9505708, sa_sigaction = 0x9505708}, sa_mask = {__val = {3219827960, 135197034, 3077474016, 0, 3219828080, 3219828068, 128, 0, 0, 3219828144, 3077727856, 134566866, 549, 0, 0, 0, 1, 135270350, 128, 1, 3219828068, 156260072, 69, 3219828196, 3219828040, 135198630, 156260653, 3219828068, 3219828072, 550, 156260104, 3076940867}}, sa_flags = 550, sa_restorer = 0x8106dd9<write_full+57>} sigs = {__val = {32, 0<repeats 31 times>}} #3 0x080f1645 in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:160 backtrace = 0x95052d8 "imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5]"... #4 0x080f16a2 in i_internal_fatal_handler (type=LOG_TYPE_PANIC, status=0, fmt=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)", args=0xbfeaae44 ":\355\020\b\223\001") at failures.c:443 No locals. #5 0x080f1019 in i_panic ( format=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)") at failures.c:207 No locals. #6 0x08088fa6 in maildir_uidlist_records_array_delete (uidlist=0x951b0b0, rec=0x9534500) at maildir-uidlist.c:403 recs = 0x9536fa0 __PRETTY_FUNCTION__ = "maildir_uidlist_records_array_delete" #7 0x08089fc9 in maildir_uidlist_next (uidlist=0x951b0b0) at maildir-uidlist.c:544 rec = 0x95346b8 uid = 3531 #8 maildir_uidlist_update_read (uidlist=0x951b0b0) at maildir-uidlist.c:741 storage = 0x950f4b8 orig_uid_validity = 1186140978 fd = 14 orig_next_uid = 3531 input = 0x95347a0 st = {st_dev = 65024, __pad1 = 0, __st_ino = 5228663, st_mode = 33152, st_nlink = 1, st_uid = 50357, st_gid = 100, st_rdev = 0, __pad2 = 0, st_size = 763, st_blksize = 4096, st_blocks = 8, st_atim = { tv_sec = 1262546767, tv_nsec = 269770575}, st_mtim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ctim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ino = 5228663} last_read_offset = 693 #9 maildir_uidlist_refresh (uidlist=0x951b0b0) at maildir-uidlist.c:889 i = 0 retry =<value optimized out> ret =<value optimized out> #10 0x08086704 in have_recent_messages (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:659 hdr =<value optimized out> #11 maildir_sync_get_changes (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:679 flags =<value optimized out> #12 maildir_sync_context (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:714 sync_flags =<value optimized out> flags = 135325664 new_changed =<value optimized out> cur_changed =<value optimized out> lock_failure =<value optimized out> fname = 0x9515030 "/home/n/i/nixxxxx/Maildir" ret =<value optimized out> __PRETTY_FUNCTION__ = "maildir_sync_context" #13 0x08086be5 in maildir_storage_sync_init (box=0x9514e98, flags=0) at maildir-sync.c:924 _data_stack_cur_id = 4 ctx = 0x9505210 lost_files =<value optimized out> ret = 0 __PRETTY_FUNCTION__ = "maildir_storage_sync_init" #14 0x08070434 in imap_sync_init (client=0x950fba0, box=0x9514e98, imap_flags=0, flags=0) at imap-sync.c:146 ctx = 0x95191a0 __PRETTY_FUNCTION__ = "imap_sync_init" #15 0x08062aa2 in idle_sync_now (box=<value optimized out>, ctx=0x950feb8) at cmd-idle.c:127 __PRETTY_FUNCTION__ = "idle_sync_now" #16 0x080a619c in notify_delay_callback (ibox=0x9514e98) at index-mailbox-check.c:49 No locals. #17 0x080f9959 in io_loop_handle_timeouts_real (ioloop=0x950d9b0) at ioloop.c:313 item = 0x9563c78 tv = {tv_sec = 0, tv_usec = 0} tv_call = {tv_sec = 1262546802, tv_usec = 336801} t_id = 3 #18 io_loop_handle_timeouts (ioloop=0x950d9b0) at ioloop.c:324 _data_stack_cur_id = 2 #19 0x080fa222 in io_loop_handler_run (ioloop=0x950d9b0) at ioloop-epoll.c:180 ctx = 0x950dab8 event =<value optimized out> list =<value optimized out> io =<value optimized out> tv = {tv_sec = 0, tv_usec = 108829} t_id =<value optimized out> msecs = 109 ret = 0 i =<value optimized out> j =<value optimized out> call =<value optimized out> #20 0x080f96f0 in io_loop_run (ioloop=0x950d9b0) at ioloop.c:335 No locals. #21 0x08070f52 in main (argc=) at main.c:327 No locals. $1 = {prev = 0x0, next = 0x0, client = 0x950fba0, pool = 0x950fe18, tag = 0x950fea8 "104.6", name = 0x950feb0 "IDLE", args = 0x0, cmd_flags = 10, func = 0x80628b0<cmd_idle_continue>, context = 0x950feb8, module_contexts = {arr = {buffer = 0x950fe70, element_size = 4}, v = 0x950fe70, v_modifiable = 0x950fe70}, parser = 0x9512620, state = CLIENT_COMMAND_STATE_WAIT_INPUT, sync = 0x0, uid = 0, cancel = 0, param_error = 0, search_save_result = 0, temp_executed = 0} (gdb) quit
----- End forwarded message -----
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
- David Halik <dhalik@jla.rutgers.edu>:
Ralf, you're seeing the same panic I've been going back and forth with Timo on for the last two weeks. Do you think you could run a couple of quick gdb commands? Timo asked me for these last week, but I thought it might be useful if they came from you as well for comparison. It also looks like you have optimization turned off which is great for debugging.
From Timo...
Could you do in gdb something like:
fr 6 (or whatever frame gives usable results)
Yes, it's frame 6
p *rec
(gdb) p *rec $2 = {uid = 1721, flags = 8, filename = 0x9c4e2b0 "1247223070.M198961P26186.postamt.charite.de,S=4080,W=4182:2,Sa", extensions = 0x0}
p count
(gdb) p count $3 = <value optimized out>
Sorry :(
p *recs[0]
(gdb) p *recs[0] $4 = {uid = 1962, flags = 8, filename = 0x9c531a8 "1262613886.M868332P16003.postamt.charite.de,W=39324,S=38791:2,Sa", extensions = 0x0}
p *recs[1] p *recs[..up until count-1]
Since count is optimized out, how should I know? OK, trial and error:
(gdb) p *recs[0] $16 = {uid = 1962, flags = 8, filename = 0x9c531a8 "1262613886.M868332P16003.postamt.charite.de,W=39324,S=38791:2,Sa", extensions = 0x0} (gdb) p *recs[1] $17 = {uid = 1963, flags = 8, filename = 0x9c53200 "1262613886.M868333P16003.postamt.charite.de,W=3061,S=2987:2,Sa", extensions = 0x0} (gdb) p *recs[2] $18 = {uid = 1964, flags = 8, filename = 0x9c53250 "1262613886.M868334P16003.postamt.charite.de,W=407687,S=402422:2,Sa", extensions = 0x0} (gdb) p *recs[3] $19 = {uid = 1965, flags = 8, filename = 0x9c532a8 "1262613886.M868335P16003.postamt.charite.de,W=3576,S=3496:2,RSa", extensions = 0x0} (gdb) p *recs[4] $20 = {uid = 1966, flags = 8, filename = 0x9c532f8 "1262613886.M868336P16003.postamt.charite.de,W=20045,S=19472:2,Sa", extensions = 0x0} (gdb) p *recs[5] $21 = {uid = 1967, flags = 8, filename = 0x9c53350 "1262613886.M868337P16003.postamt.charite.de,W=99371,S=97957:2,Sa", extensions = 0x0} (gdb) p *recs[6] $22 = {uid = 1968, flags = 8, filename = 0x9c533a8 "1262613886.M868338P16003.postamt.charite.de,W=4552,S=4453:2,Sa", extensions = 0x0}
...
(gdb) p *recs[50] $23 = {uid = 2012, flags = 8, filename = 0x9c54260 "1262613886.M868382P16003.postamt.charite.de,W=29656,S=28920:2,Sa", extensions = 0x0} (gdb) p *recs[51] $24 = {uid = 2013, flags = 8, filename = 0x9c542b8 "1262613886.M868383P16003.postamt.charite.de,W=83093,S=81992:2,Sa", extensions = 0x0} (gdb) p *recs[52] $25 = {uid = 2014, flags = 8, filename = 0x9c54310 "1262613886.M868384P16003.postamt.charite.de,W=2104892,S=2077799:2,Sb", extensions = 0x0} (gdb) p *recs[53] $26 = {uid = 2015, flags = 8, filename = 0x9c54368 "1262613886.M868385P16003.postamt.charite.de,W=89389,S=88208:2,Sa", extensions = 0x0} (gdb) p *recs[54] $27 = {uid = 2016, flags = 8, filename = 0x9c543c0 "1262613886.M868386P16003.postamt.charite.de,W=12822,S=12457:2,Sa", extensions = 0x0} (gdb) p *recs[55] $28 = {uid = 2017, flags = 8, filename = 0x9c54428 "1262613886.M868387P16003.postamt.charite.de,W=14356,S=14096:2,Sa", extensions = 0x0} (gdb) p *recs[56] $29 = {uid = 2018, flags = 8, filename = 0x9c54480 "1262613886.M868388P16003.postamt.charite.de,W=15126,S=14718:2,Sa", extensions = 0x0} (gdb) p *recs[57] Cannot access memory at address 0x0
If count is large, the main things I want to know are:
- Is the array sorted (recs[n]->uid < recs[n+1]->uid always)?
Yes, it's sorted!
- is rec->uid anywhere in the recs array?
I don't understand this question. All the entries recs[0] ... recs[56] are containing entries like the ones above.
On 01/04/2010 05:09 AM, Ralf Hildebrandt wrote:
From the log:
Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): copy: uid=3526, dest=Deleted Messages, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:07 postamt dovecot: IMAP(nixxxxx): delete: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de> Jan 3 20:26:11 postamt dovecot: IMAP(nixxxxx): expunge: uid=3526, msgid=<DC8A4BE5-41A0-4240-8F07-090E95414F7A@charite.de>, size=2726 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): copy: uid=415, box=Deleted Messages, dest=INBOX, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): /home/n/i/nixxxxx/Maildir/dovecot-uidlist: Duplicate file entry at line 1: 1262534293.M822904P8880.postamt.charite.de,S=3604,W=3690:2,RSaf (uid 3528 -> 3531) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete): assertion failed: (pos != NULL) Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): Raw backtrace: imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5] -> imap(imap_sync_init+0x54) [0x8070434] -> imap [0x8062aa2] -> imap [0x80a619c] -> imap(io_loop_handle_timeouts+0xe9) [0x80f9959] -> imap(io_loop_handler_run+0x82) [0x80fa222] -> imap(io_loop_run+0x20) [0x80f96f0] -> imap(main+0x5d2) [0x8070f52] -> /lib/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb75bcb55] -> imap [0x80602d1] Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): delete: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com> Jan 3 20:26:42 postamt dovecot: IMAP(nixxxxx): expunge: uid=415, box=Deleted Messages, msgid=<d539ffeb1001030758n1acc33c1m1fc67e0f1d227e92@mail.gmail.com>, size=3604 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=63917/47203 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=360/168380 Jan 3 20:26:45 postamt dovecot: IMAP(nixxxxx): Disconnected: Logged out bytes=670290/672304
GNU gdb (GDB) 7.0-debian Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later<http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/local/libexec/dovecot/imap...done. Reading symbols from /lib/i686/cmov/libdl.so.2...Reading symbols from /usr/lib/debug/lib/i686/cmov/libdl-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libdl.so.2 Reading symbols from /lib/i686/cmov/librt.so.1...Reading symbols from /usr/lib/debug/lib/i686/cmov/librt-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/librt.so.1 Reading symbols from /lib/i686/cmov/libc.so.6...Reading symbols from /usr/lib/debug/lib/i686/cmov/libc-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libc.so.6 Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /lib/i686/cmov/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/i686/cmov/libpthread-2.10.2.so...done. (no debugging symbols found)...done. Loaded symbols for /lib/i686/cmov/libpthread.so.0 Reading symbols from /usr/local/lib/dovecot/imap/lib10_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib10_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_imap_quota_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib11_trash_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib11_trash_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_autocreate_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_fts_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_fts_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_mail_log_plugin.so Reading symbols from /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib20_zlib_plugin.so Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libbz2.so.1.0...(no debugging symbols found)...done. Loaded symbols for /lib/libbz2.so.1.0 Reading symbols from /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so...done. Loaded symbols for /usr/local/lib/dovecot/imap/lib21_fts_squat_plugin.so Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done. Loaded symbols for /lib/libgcc_s.so.1 Core was generated by `imap [nixxxxx 87.162.56.10]'. Program terminated with signal 6, Aborted. #0 0xb7708424 in __kernel_vsyscall () #0 0xb7708424 in __kernel_vsyscall () No symbol table info available. #1 0xb75d08e0 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar =<value optimized out> pid = -1217490956 selftid = 25023 #2 0xb75d3e15 in *__GI_abort () at abort.c:88 act = {__sigaction_handler = {sa_handler = 0x9505708, sa_sigaction = 0x9505708}, sa_mask = {__val = {3219827960, 135197034, 3077474016, 0, 3219828080, 3219828068, 128, 0, 0, 3219828144, 3077727856, 134566866, 549, 0, 0, 0, 1, 135270350, 128, 1, 3219828068, 156260072, 69, 3219828196, 3219828040, 135198630, 156260653, 3219828068, 3219828072, 550, 156260104, 3076940867}}, sa_flags = 550, sa_restorer = 0x8106dd9<write_full+57>} sigs = {__val = {32, 0<repeats 31 times>}} #3 0x080f1645 in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:160 backtrace = 0x95052d8 "imap [0x80f1631] -> imap [0x80f16a2] -> imap [0x80f1019] -> imap [0x8088fa6] -> imap(maildir_uidlist_refresh+0x9e9) [0x8089fc9] -> imap [0x8086704] -> imap(maildir_storage_sync_init+0x125) [0x8086be5]"... #4 0x080f16a2 in i_internal_fatal_handler (type=LOG_TYPE_PANIC, status=0, fmt=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)", args=0xbfeaae44 ":\355\020\b\223\001") at failures.c:443 No locals. #5 0x080f1019 in i_panic ( format=0x8109e2c "file %s: line %d (%s): assertion failed: (%s)") at failures.c:207 No locals. #6 0x08088fa6 in maildir_uidlist_records_array_delete (uidlist=0x951b0b0, rec=0x9534500) at maildir-uidlist.c:403 recs = 0x9536fa0 __PRETTY_FUNCTION__ = "maildir_uidlist_records_array_delete" #7 0x08089fc9 in maildir_uidlist_next (uidlist=0x951b0b0) at maildir-uidlist.c:544 rec = 0x95346b8 uid = 3531 #8 maildir_uidlist_update_read (uidlist=0x951b0b0) at maildir-uidlist.c:741 storage = 0x950f4b8 orig_uid_validity = 1186140978 fd = 14 orig_next_uid = 3531 input = 0x95347a0 st = {st_dev = 65024, __pad1 = 0, __st_ino = 5228663, st_mode = 33152, st_nlink = 1, st_uid = 50357, st_gid = 100, st_rdev = 0, __pad2 = 0, st_size = 763, st_blksize = 4096, st_blocks = 8, st_atim = { tv_sec = 1262546767, tv_nsec = 269770575}, st_mtim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ctim = { tv_sec = 1262546802, tv_nsec = 221806856}, st_ino = 5228663} last_read_offset = 693 #9 maildir_uidlist_refresh (uidlist=0x951b0b0) at maildir-uidlist.c:889 i = 0 retry =<value optimized out> ret =<value optimized out> #10 0x08086704 in have_recent_messages (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:659 hdr =<value optimized out> #11 maildir_sync_get_changes (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:679 flags =<value optimized out> #12 maildir_sync_context (ctx=0x9505210, forced=false, find_uid=0x0, lost_files_r=0xbfeaaffb) at maildir-sync.c:714 sync_flags =<value optimized out> flags = 135325664 new_changed =<value optimized out> cur_changed =<value optimized out> lock_failure =<value optimized out> fname = 0x9515030 "/home/n/i/nixxxxx/Maildir" ret =<value optimized out> __PRETTY_FUNCTION__ = "maildir_sync_context" #13 0x08086be5 in maildir_storage_sync_init (box=0x9514e98, flags=0) at maildir-sync.c:924 _data_stack_cur_id = 4 ctx = 0x9505210 lost_files =<value optimized out> ret = 0 __PRETTY_FUNCTION__ = "maildir_storage_sync_init" #14 0x08070434 in imap_sync_init (client=0x950fba0, box=0x9514e98, imap_flags=0, flags=0) at imap-sync.c:146 ctx = 0x95191a0 __PRETTY_FUNCTION__ = "imap_sync_init" #15 0x08062aa2 in idle_sync_now (box=<value optimized out>, ctx=0x950feb8) at cmd-idle.c:127 __PRETTY_FUNCTION__ = "idle_sync_now" #16 0x080a619c in notify_delay_callback (ibox=0x9514e98) at index-mailbox-check.c:49 No locals. #17 0x080f9959 in io_loop_handle_timeouts_real (ioloop=0x950d9b0) at ioloop.c:313 item = 0x9563c78 tv = {tv_sec = 0, tv_usec = 0} tv_call = {tv_sec = 1262546802, tv_usec = 336801} t_id = 3 #18 io_loop_handle_timeouts (ioloop=0x950d9b0) at ioloop.c:324 _data_stack_cur_id = 2 #19 0x080fa222 in io_loop_handler_run (ioloop=0x950d9b0) at ioloop-epoll.c:180 ctx = 0x950dab8 event =<value optimized out> list =<value optimized out> io =<value optimized out> tv = {tv_sec = 0, tv_usec = 108829} t_id =<value optimized out> msecs = 109 ret = 0 i =<value optimized out> j =<value optimized out> call =<value optimized out> #20 0x080f96f0 in io_loop_run (ioloop=0x950d9b0) at ioloop.c:335 No locals. #21 0x08070f52 in main (argc=) at main.c:327 No locals. $1 = {prev = 0x0, next = 0x0, client = 0x950fba0, pool = 0x950fe18, tag = 0x950fea8 "104.6", name = 0x950feb0 "IDLE", args = 0x0, cmd_flags = 10, func = 0x80628b0<cmd_idle_continue>, context = 0x950feb8, module_contexts = {arr = {buffer = 0x950fe70, element_size = 4}, v = 0x950fe70, v_modifiable = 0x950fe70}, parser = 0x9512620, state = CLIENT_COMMAND_STATE_WAIT_INPUT, sync = 0x0, uid = 0, cancel = 0, param_error = 0, search_save_result = 0, temp_executed = 0} (gdb) quit
----- End forwarded message -----
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
Does that output help?
- Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>:
- David Halik <dhalik@jla.rutgers.edu>:
Ralf, you're seeing the same panic I've been going back and forth with Timo on for the last two weeks. Do you think you could run a couple of quick gdb commands? Timo asked me for these last week, but I thought it might be useful if they came from you as well for comparison. It also looks like you have optimization turned off which is great for debugging.
From Timo...
Could you do in gdb something like:
fr 6 (or whatever frame gives usable results)
Yes, it's frame 6
p *rec
(gdb) p *rec $2 = {uid = 1721, flags = 8, filename = 0x9c4e2b0 "1247223070.M198961P26186.postamt.charite.de,S=4080,W=4182:2,Sa", extensions = 0x0} ...
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
On 8.1.2010, at 10.34, Ralf Hildebrandt wrote:
p *rec
(gdb) p *rec $2 = {uid = 1721, flags = 8, filename = 0x9c4e2b0 "1247223070.M198961P26186.postamt.charite.de,S=4080,W=4182:2,Sa", extensions = 0x0}
rec->uid is 1721.
(gdb) p *recs[0] $4 = {uid = 1962, flags = 8, filename = 0x9c531a8 "1262613886.M868332P16003.postamt.charite.de,W=39324,S=38791:2,Sa", extensions = 0x0} ..
- is rec->uid anywhere in the recs array?
I don't understand this question. All the entries recs[0] ... recs[56] are containing entries like the ones above.
1721 is not in the recs[] list, since it's sorted and the first one is 1962.
So there's something weird going on why it's in the filename hash table, but not in the array. I'll try to figure it out later..
Timo,
-----Original Message----- From: Timo Sirainen
1721 is not in the recs[] list, since it's sorted and the first one is
So there's something weird going on why it's in the filename hash
table, but
not in the array. I'll try to figure it out later..
I hope your move is going well, and you get settled in and your internet hooked up soon. It's got to be a rough process!
Just for the record, we continue to see this crash fairly frequently with a small subset of our users, enough so that they have started to complain to the helpdesk staff about their mail 'disappearing and then reappearing.' One user in particular has a mail client left open from three hosts and has hit it 23 times in the last week, and 10 times today.
If there's any more information I can collect or anything I can do to help get this resolved, please let me know!
-Brad
Same here. I laughed because our help desk started sending us the exact same complaints and then today I got a little bit of a red nose when a director's mail "disappeared" in a meeting. ;) Whoops.
It looks like users who end up with the off by 1 uid list rebuild and crash experience and empty inbox until the list is rebuilt and refreshed. I saw one user who experience the crash and then spewed about 15K lines of "Duplicate message, uid -> uid". Since that takes awhile they probably couldn't see anything in the meantime.
Anyway, since we're hearing the complaints I went and remounted our IMAP servers with the "noac" NFS option today. So far it seems to have swept the problem under the rug, but our NFS server's Ops went from an average of 3-5K to 10K-20K, and the cpu went from 10% to 50% of constant load... so this is *definitely* only a temp fix. Hopefully Timo will have time to analyze the problem once his move is all finished.
-Dave
I hope your move is going well, and you get settled in and your internet hooked up soon. It's got to be a rough process!
Just for the record, we continue to see this crash fairly frequently with a small subset of our users, enough so that they have started to complain to the helpdesk staff about their mail 'disappearing and then reappearing.' One user in particular has a mail client left open from three hosts and has hit it 23 times in the last week, and 10 times today.
If there's any more information I can collect or anything I can do to help get this resolved, please let me know!
-Brad
FYI, we backed out of the "noac" change today. When our 20K accounts started coming to work the NetApp NFS server was pushing 70% CPU usage and 25K NFS Ops/s, which resulted in all kinds of other havoc as normal services started becoming slow. This server usally runs around 25% and 5K, so such a large increase of load was too much to handle.
During the 12 hour window I didn't see a single uid error as expected, but the fix was worse than the problem.
On 01/13/2010 07:41 PM, David Halik wrote:
Same here. I laughed because our help desk started sending us the exact same complaints and then today I got a little bit of a red nose when a director's mail "disappeared" in a meeting. ;) Whoops.
It looks like users who end up with the off by 1 uid list rebuild and crash experience and empty inbox until the list is rebuilt and refreshed. I saw one user who experience the crash and then spewed about 15K lines of "Duplicate message, uid -> uid". Since that takes awhile they probably couldn't see anything in the meantime.
Anyway, since we're hearing the complaints I went and remounted our IMAP servers with the "noac" NFS option today. So far it seems to have swept the problem under the rug, but our NFS server's Ops went from an average of 3-5K to 10K-20K, and the cpu went from 10% to 50% of constant load... so this is *definitely* only a temp fix. Hopefully Timo will have time to analyze the problem once his move is all finished.
-Dave
I hope your move is going well, and you get settled in and your internet hooked up soon. It's got to be a rough process!
Just for the record, we continue to see this crash fairly frequently with a small subset of our users, enough so that they have started to complain to the helpdesk staff about their mail 'disappearing and then reappearing.' One user in particular has a mail client left open from three hosts and has hit it 23 times in the last week, and 10 times today.
If there's any more information I can collect or anything I can do to help get this resolved, please let me know!
-Brad
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Hi David,
On 1/14/10 3:13 PM, "David Halik" <dhalik@jla.rutgers.edu> wrote:
FYI, we backed out of the "noac" change today. When our 20K accounts started coming to work the NetApp NFS server was pushing 70% CPU usage and 25K NFS Ops/s, which resulted in all kinds of other havoc as normal services started becoming slow. This server usally runs around 25% and 5K, so such a large increase of load was too much to handle.
During the 12 hour window I didn't see a single uid error as expected, but the fix was worse than the problem.
We're pretty loathe to go back to noac as well. We will probably disable process log throttling (mail_log_max_lines_per_sec = 0) to increase the reindex speed until Timo comes up with a fix for the crash. This should at least help the users "get their mail back" in a more reasonable timeframe.
-Brad
We're seeing this same coredump on a few dozen customers as well. Also Netapp/NFS.
cor
On 12.1.2010, at 16.47, Timo Sirainen wrote:
So there's something weird going on why it's in the filename hash table, but not in the array. I'll try to figure it out later..
This should fix it (and hopefully doesn't expose another bug): http://hg.dovecot.org/dovecot-1.2/rev/a1177c6cf8c7
On 20.1.2010, at 11.45, Timo Sirainen wrote:
On 12.1.2010, at 16.47, Timo Sirainen wrote:
So there's something weird going on why it's in the filename hash table, but not in the array. I'll try to figure it out later..
This should fix it (and hopefully doesn't expose another bug): http://hg.dovecot.org/dovecot-1.2/rev/a1177c6cf8c7
I think that fixes all of the "duplicate entry in uidlist" errors that have been reported recently. So please test soon and report how it goes over a few days, and I'll release 1.2.10.
On 01/20/2010 04:49 AM, Timo Sirainen wrote:
On 20.1.2010, at 11.45, Timo Sirainen wrote:
On 12.1.2010, at 16.47, Timo Sirainen wrote:
So there's something weird going on why it's in the filename hash table, but not in the array. I'll try to figure it out later..
This should fix it (and hopefully doesn't expose another bug): http://hg.dovecot.org/dovecot-1.2/rev/a1177c6cf8c7
I think that fixes all of the "duplicate entry in uidlist" errors that have been reported recently. So please test soon and report how it goes over a few days, and I'll release 1.2.10.
That seems to have done the trick. I've been running it for the past two hours in production and have not seen a single dump. The behavior looks like it handles it correctly now rather than dumping. I'll report back in a day or two to confirm that it has been stable, but so far so good. Thank you very much Timo.
Here's an example of stale nfs -> duplicate files instance with the patch...
Jan 20 13:06:17 gehenna14.rutgers.edu dovecot: IMAP(user): fdatasync(/rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist) failed: Stale NFS file handle Jan 20 13:06:17 gehenna14.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: next_uid was lowered (9531 -> 8975, hdr=8975) Jan 20 13:13:59 gehenna13.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file entry at line 3: 1259771459.M806388P24066V04240001I002E8FC6_0.gehenna7.rutgers.edu,S=3927:2,S (uid 8420 -> 9532) Jan 20 13:13:59 gehenna13.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file entry at line 4: 1259771993.M755737P14220V045C0002I002E8FC7_0.gehenna8.rutgers.edu,S=2768:2,S (uid 8421 -> 9533) ##### SNIP ##### Jan 20 13:13:59 gehenna13.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file entry at line 330: 1261199619.M995718P21199V03E80002I00B1AC1D_0.gehenna9.rutgers.edu,S=18949:2,Sa (uid 8747 -> 9859)
Notice it's dump free at the end! ;)
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On 20.1.2010, at 20.24, David Halik wrote:
Jan 20 13:13:59 gehenna13.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file entry at line 3: 1259771459.M806388P24066V04240001I002E8FC6_0.gehenna7.rutgers.edu,S=3927:2,S (uid 8420 -> 9532) .. Notice it's dump free at the end! ;)
Well, that's good, although I had also hoped that the "Duplicate file entry" errors would have gone away.
Good news and bad news.
The good news is that I've been core dump free for about 24 hours. The duplicate message issue is still there, so if you wanted that fix then I guess the patch is not complete yet, otherwise it looks good.
The bad news is that I'm now convinced that another problem I've been experiencing is real and not related to any of this since it's still persists.
Randomly for a few releases now myself and my coworkers have been seeing a weird bug where old mail marked as "read" suddenly becomes marked as "new" again when going into a folder. Sometimes it's as bad as the entire folder deciding it is all unread when in fact is was all read, other times it's randomly a few pieces of mail that constantly revert to unread status. I've watched this on my personal server as well as heard complaints from our user population.
I don't believe this is an NFS issue because my personal account is a standalone server with both a local index and maildir running 1.2.9. Also, I'm only ever connected from one place at a time. Just this morning when I got to work I logged in with Thunderbird and my "dovecot" mailing list folder suddenly had 3,000 new messages in it, all of which had been previous read. I marked the folder as read, then a little while later I went into the older and they all reverted to new again. I've also seen it where if I try deleting one of these messages marked as new, it disappears for a second, then magically reappears in my folder as new.
Unfortunately, I'm not seeing a single log message related to this. As you can tell, I watch the logs like a hawk, yet nothing ever appears to throw an error or any message whatsoever. All my test cases have been with Thunderbird 3.0, I'm not sure if other clients are seeing this.
Any idea?
On 01/20/2010 01:45 PM, Timo Sirainen wrote:
On 20.1.2010, at 20.24, David Halik wrote:
Jan 20 13:13:59 gehenna13.rutgers.edu dovecot: IMAP(user): /rci/nqu/rci/u2/user/dovecot/.INBOX/dovecot-uidlist: Duplicate file entry at line 3: 1259771459.M806388P24066V04240001I002E8FC6_0.gehenna7.rutgers.edu,S=3927:2,S (uid 8420 -> 9532)
..
Notice it's dump free at the end! ;)
Well, that's good, although I had also hoped that the "Duplicate file entry" errors would have gone away.
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On 21/01/2010 17:53, David Halik wrote:
Good news and bad news.
The good news is that I've been core dump free for about 24 hours. The duplicate message issue is still there, so if you wanted that fix then I guess the patch is not complete yet, otherwise it looks good.
The bad news is that I'm now convinced that another problem I've been experiencing is real and not related to any of this since it's still persists.
Randomly for a few releases now myself and my coworkers have been seeing a weird bug where old mail marked as "read" suddenly becomes marked as "new" again when going into a folder. Sometimes it's as bad as the entire folder deciding it is all unread when in fact is was all read, other times it's randomly a few pieces of mail that constantly revert to unread status. I've watched this on my personal server as well as heard complaints from our user population.
I don't believe this is an NFS issue because my personal account is a standalone server with both a local index and maildir running 1.2.9. Also, I'm only ever connected from one place at a time. Just this morning when I got to work I logged in with Thunderbird and my "dovecot" mailing list folder suddenly had 3,000 new messages in it, all of which had been previous read. I marked the folder as read, then a little while later I went into the older and they all reverted to new again. I've also seen it where if I try deleting one of these messages marked as new, it disappears for a second, then magically reappears in my folder as new.
Unfortunately, I'm not seeing a single log message related to this. As you can tell, I watch the logs like a hawk, yet nothing ever appears to throw an error or any message whatsoever. All my test cases have been with Thunderbird 3.0, I'm not sure if other clients a re seeing this.
Hello, same problem here. I use Thunderbird 3.0.1. The strange thing is that with the same version of TB on my PC at home I don't have this issue. The same TB on my work PC has this issue and only on one account (that has Dovecot as imap server) but I have other account (some with dovecot, other with courier-imap) where this does not happen, so I think is TB related.
Also here no NFS, ext3.
On 01/21/2010 12:00 PM, mailing@securitylabs.it wrote:
Hello, same problem here. I use Thunderbird 3.0.1. The strange thing is that with the same version of TB on my PC at home I don't have this issue. The same TB on my work PC has this issue and only on one account (that has Dovecot as imap server) but I have other account (some with dovecot, other with courier-imap) where this does not happen, so I think is TB related.
Also here no NFS, ext3.
Single Solaris 9 box here, without NFS and I'm seeing it. Users having problems with it on my linux dovecat farm as well, so it can't be NFS related.
This seems to have started since TB 3.0 came out. Especially, with the new indexing feature they added. I'm not seeing it with Pine or SQMail accessing the same servers. The problem is all my users are convinced it's my mail servers. ;) However, it does seem to only happen with my dovecot backends. I haven't seen this with my non-dovecot accounts.
Is it possible the TB 3.0 has some issues with dovecot specifically that we're not aware of?
Has anyone seen any Thunderbird mail out there related to this? Google hasn't been helpful yet.
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
On 21.1.2010, at 19.10, David Halik wrote:
This seems to have started since TB 3.0 came out. Especially, with the new indexing feature they added. I'm not seeing it with Pine or SQMail accessing the same servers. The problem is all my users are convinced it's my mail servers. ;) However, it does seem to only happen with my dovecot backends. I haven't seen this with my non-dovecot accounts.
Is it possible the TB 3.0 has some issues with dovecot specifically that we're not aware of?
QRESYNC maybe?
On 21.1.2010, at 19.13, Timo Sirainen wrote:
On 21.1.2010, at 19.10, David Halik wrote:
This seems to have started since TB 3.0 came out. Especially, with the new indexing feature they added. I'm not seeing it with Pine or SQMail accessing the same servers. The problem is all my users are convinced it's my mail servers. ;) However, it does seem to only happen with my dovecot backends. I haven't seen this with my non-dovecot accounts.
Is it possible the TB 3.0 has some issues with dovecot specifically that we're not aware of?
QRESYNC maybe?
CONDSTORE I mean. Looks like others have had the same problem: http://getsatisfaction.com/mozilla_messaging/topics/thunderbird_3_rc1_seems_...
On 01/21/2010 12:14 PM, Timo Sirainen wrote:
On 21.1.2010, at 19.13, Timo Sirainen wrote:
On 21.1.2010, at 19.10, David Halik wrote:
This seems to have started since TB 3.0 came out. Especially, with the new indexing feature they added. I'm not seeing it with Pine or SQMail accessing the same servers. The problem is all my users are convinced it's my mail servers. ;) However, it does seem to only happen with my dovecot backends. I haven't seen this with my non-dovecot accounts.
Is it possible the TB 3.0 has some issues with dovecot specifically that we're not aware of?
QRESYNC maybe?
CONDSTORE I mean. Looks like others have had the same problem: http://getsatisfaction.com/mozilla_messaging/topics/thunderbird_3_rc1_seems_...
Yep, this is definitely the same problem.
https://bugzilla.mozilla.org/show_bug.cgi?id=517461
Apparently, the fix was supposed to happen in 3.0.1. but people are still reporting the problem. At least they're aware of it and working on it.
Thanks Timo.
--
David Halik System Administrator OIT-CSS Rutgers University dhalik@jla.rutgers.edu
Am 21.01.2010 18:14, schrieb Timo Sirainen:
On 21.1.2010, at 19.13, Timo Sirainen wrote:
On 21.1.2010, at 19.10, David Halik wrote:
This seems to have started since TB 3.0 came out. Especially, with the new indexing feature they added. I'm not seeing it with Pine or SQMail accessing the same servers. The problem is all my users are convinced it's my mail servers. ;) However, it does seem to only happen with my dovecot backends. I haven't seen this with my non-dovecot accounts.
Is it possible the TB 3.0 has some issues with dovecot specifically that we're not aware of?
QRESYNC maybe?
CONDSTORE I mean. Looks like others have had the same problem: http://getsatisfaction.com/mozilla_messaging/topics/thunderbird_3_rc1_seems_...
There is a bug report for Thunderbird 3.0 and CONDSTORE:
https://bugzilla.mozilla.org/show_bug.cgi?id=524902
Setting
mail.server.default.use_condstore = false
in Thunderbird disables CONDSTORE support and gets rid of this issue...
Hi! I'm pretty unsure this is only a TB3 problem. Saw the same result with TB 2.0, but I'm unsure if this was after changing something on the server. After 2 days of mailproblems we downgraded back to Dovecot 1.1 yesterday and no one reported this bug since this downgrade.
Greetings,
Thorsten
- Timo Sirainen <tss@iki.fi>:
This should fix it (and hopefully doesn't expose another bug): http://hg.dovecot.org/dovecot-1.2/rev/a1177c6cf8c7
I think that fixes all of the "duplicate entry in uidlist" errors that have been reported recently. So please test soon and report how it goes over a few days, and I'll release 1.2.10.
I applied it today. Let's see how it goes.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
participants (8)
-
Brandon Davidson
-
Cor Bosman
-
David Halik
-
e-frog
-
mailing@securitylabs.it
-
Ralf Hildebrandt
-
Thorsten Habich
-
Timo Sirainen