[Dovecot] Asserion Failure in Current CVS
Just installed the version from CVS as of Oct. 27.
I noticed three problems quite quickly:
Still seeing "(imap) killed with signal 14"
My INBOX closed with "access error" after reading it for a bit. There's nothing in the logs or anything, but this hasn't happened for quite a while now. There where quite a lot of incoming messages at the time and I was marking messages as deleted using pine. After reverting to the last version of dovecot that I was using and restarting pine, it opened quickly (no reindexing) and the deleted emails where marked as deleted. I don't know if this helps much ;-)
The rest is a backtrace, etc. of an assertion failure:
dovecot: Oct 27 16:58:12 Error: 13481 imap(username): file mbox-sync-rewrite.c: line 393 (mbox_sync_read_and_move): assertion failed: (need_space == (uoff_t)-mails[idx].space)
dovecot: Oct 27 16:58:29 Error: child 13481 (imap) killed with signal 6
(gdb) bt full #0 0xffffe410 in ?? () No symbol table info available. #1 0xbfffeec8 in ?? () No symbol table info available. #2 0x00000006 in ?? () No symbol table info available. #3 0x000034a9 in ?? () No symbol table info available. #4 0xb7ef1d41 in raise () from /lib/tls/libc.so.6 No symbol table info available. #5 0xb7ef3529 in abort () from /lib/tls/libc.so.6 No symbol table info available. #6 0x0809b35a in i_internal_panic_handler ( fmt=0x6 <Address 0x6 out of bounds>, args=0x34a9 <Address 0x34a9 out of bounds>) at failures.c:375 No locals. #7 0x0809b784 in i_panic ( format=0x80a88ec "file %s: line %d (%s): assertion failed: (%s)") at failures.c:173 No locals. #8 0x08073c33 in mbox_sync_rewrite (sync_ctx=0xbffff7d0, mail_ctx=0xbffff740, end_offset=Unhandled dwarf expression opcode 0x93 ) at mbox-sync-rewrite.c:447 mails = Variable "mails" is not available.
(gdb) bt #0 0xffffe410 in ?? () #1 0xbfffeec8 in ?? () #2 0x00000006 in ?? () #3 0x000034a9 in ?? () #4 0xb7ef1d41 in raise () from /lib/tls/libc.so.6 #5 0xb7ef3529 in abort () from /lib/tls/libc.so.6 #6 0x0809b35a in i_internal_panic_handler ( fmt=0x6 <Address 0x6 out of bounds>, args=0x34a9 <Address 0x34a9 out of bounds>) at failures.c:375 #7 0x0809b784 in i_panic ( format=0x80a88ec "file %s: line %d (%s): assertion failed: (%s)") at failures.c:173 #8 0x08073c33 in mbox_sync_rewrite (sync_ctx=0xbffff7d0, mail_ctx=0xbffff740, end_offset=Unhandled dwarf expression opcode 0x93 ) at mbox-sync-rewrite.c:447 #9 0x0806f4c2 in mbox_sync_loop (sync_ctx=0xbffff7d0, mail_ctx=0xbffff740, partial=Variable "partial" is not available. ) at mbox-sync.c:827 #10 0x080706ef in mbox_sync (mbox=0x80c6c08, flags=Variable "flags" is not available. ) at mbox-sync.c:1447 #11 0x08069ca6 in mbox_storage_close (box=0x80c6c08) at mbox-storage.c:973 #12 0x0808f7cd in mailbox_close (box=0x0) at mail-storage.c:303 #13 0x08057c19 in cmd_logout (cmd=0x80bfbf8) at cmd-logout.c:18 #14 0x08059a8b in _client_input (context=0x80bfbb8) at client.c:338 #15 0x080a0bba in io_loop_handler_run (ioloop=0x80be9b0) at ioloop-poll.c:190 #16 0x080a0139 in io_loop_run (ioloop=0x80be9b0) at ioloop.c:230 #17 0x080609d6 in main (argc=1, argv=0xbffffa44, envp=0xbffffa4c) at main.c:232
(gdb) frame #8 0x08073c33 in mbox_sync_rewrite (sync_ctx=0xbffff7d0, mail_ctx=0xbffff740, end_offset=Unhandled dwarf expression opcode 0x93 ) at mbox-sync-rewrite.c:447 447 i_assert(extra_space < OFF_T_MAX); (gdb) p extra_space Unhandled dwarf expression opcode 0x93
On Thu, 2005-10-27 at 17:58 -0400, Todd Burroughs wrote:
Just installed the version from CVS as of Oct. 27.
I noticed three problems quite quickly:
Still seeing "(imap) killed with signal 14"
Well, this happens just because I've forgotten to ignore SIGALRM signals :) So I guess it happens if it ever has to wait more than a second for a fcntl lock.
My INBOX closed with "access error" after reading it for a bit. There's nothing in the logs or anything, but this hasn't happened for quite a while now. There where quite a lot of incoming messages at the time and I was marking messages as deleted using pine. After reverting to the last version of dovecot that I was using and restarting pine, it opened quickly (no reindexing) and the deleted emails where marked as deleted. I don't know if this helps much ;-)
Well, I don't know what "access error" means.. And do you mean you didn't try again with the new Dovecot, just reverted to old one?
dovecot: Oct 27 16:58:12 Error: 13481 imap(username): file mbox-sync-rewrite.c: line 393 (mbox_sync_read_and_move): assertion failed: (need_space == (uoff_t)-mails[idx].space)
Is this from older Dovecot too? I hoped I had fixed this yesterday.
On Fri, 28 Oct 2005, Timo Sirainen wrote:
On Thu, 2005-10-27 at 17:58 -0400, Todd Burroughs wrote:
Just installed the version from CVS as of Oct. 27.
I noticed three problems quite quickly:
My INBOX closed with "access error" after reading it for a bit. There's nothing in the logs or anything, but this hasn't happened for quite a while now. There where quite a lot of incoming messages at the time and I was marking messages as deleted using pine. After reverting to the last version of dovecot that I was using and restarting pine, it opened quickly (no reindexing) and the deleted emails where marked as deleted. I don't know if this helps much ;-)
Well, I don't know what "access error" means.. And do you mean you didn't try again with the new Dovecot, just reverted to old one?
I reverted back to the old one, as the new one was generating a lot of errors. As far as I can tell, "access error" means that the imap server dropped connection. Usually I see a crash when this happens. I'm running today's CVS version now and haven't seen any errors, although it is late Friday... This may have just been a coincidence.
dovecot: Oct 27 16:58:12 Error: 13481 imap(username): file mbox-sync-rewrite.c: line 393 (mbox_sync_read_and_move): assertion failed: (need_space == (uoff_t)-mails[idx].space)
Is this from older Dovecot too? I hoped I had fixed this yesterday.
This is from the newer one. The last Changelog entry was this:
2005-10-27 15:44 Timo Sirainen <tss@iki.fi>
* configure.in: typofix
This user hasn't logged into the newest version yet. I see a couple entries in the new Changelog regarding mbox-sync-rewrite.c, so maybe it is fixed.
Todd
On Fri, 28 Oct 2005, Timo Sirainen wrote:
On Thu, 2005-10-27 at 17:58 -0400, Todd Burroughs wrote:
dovecot: Oct 27 16:58:12 Error: 13481 imap(username): file mbox-sync-rewrite.c: line 393 (mbox_sync_read_and_move): assertion failed: (need_space == (uoff_t)-mails[idx].space)
Is this from older Dovecot too? I hoped I had fixed this yesterday.
I'm now running CVS version as of Oct 28. A couple users have gotten a very similar assertion failure. Here's the log entry and backtrace:
dovecot: Nov 01 05:07:18 Error: 8434 imap(username): file mbox-sync-rewrite.c: line 383 (mbox_sync_read_and_move): assertion failed: (seq == mail_ctx->seq)
dovecot: Nov 01 05:07:20 Error: child 8434 (imap) killed with signal 6
(gdb) bt full #0 0xffffe410 in ?? () No symbol table info available. #1 0xbfffee48 in ?? () No symbol table info available. #2 0x00000006 in ?? () No symbol table info available. #3 0x000020f2 in ?? () No symbol table info available. #4 0xb7ef1d41 in raise () from /lib/tls/libc.so.6 No symbol table info available. #5 0xb7ef3529 in abort () from /lib/tls/libc.so.6 No symbol table info available. #6 0x0809b4fa in i_internal_panic_handler ( fmt=0x6 <Address 0x6 out of bounds>, args=0x20f2 <Address 0x20f2 out of bounds>) at failures.c:375 No locals. #7 0x0809b924 in i_panic ( format=0x80a8a8c "file %s: line %d (%s): assertion failed: (%s)") at failures.c:173 No locals. #8 0x08073cb3 in mbox_sync_rewrite (sync_ctx=0xbffff760, mail_ctx=0xbffff6d0, end_offset=Unhandled dwarf expression opcode 0x93 ) at mbox-sync-rewrite.c:454 mails = Variable "mails" is not available.
Todd Burroughs wrote:
I'm now running CVS version as of Oct 28. A couple users have gotten a very similar assertion failure. Here's the log entry and backtrace:
dovecot: Nov 01 05:07:18 Error: 8434 imap(username): file mbox-sync-rewrite.c: line 383 (mbox_sync_read_and_move): assertion failed: (seq == mail_ctx->seq)
I have reported this one before (running CVS from Oct 31) and sent backtraces as well. With this version of dovecot I've also seen assertion failures:
file buffer.c: line 230 (buffer_copy): assertion failed: (src_pos <= src->used)
file mbox-sync-rewrite.c: line 400 (mbox_sync_read_and_move): assertion failed: (need_space == (uoff_t)-mails[idx].space)
file mbox-sync.c: line 1297 (mbox_sync_handle_eof_updates): assertion failed: (file_size >= sync_ctx->expunged_space + trailer_size)
Have given backtraces of all these failures to the dovecot-list. Hope to see fixes in near future. ;-)
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Anders Lund wrote:
Have given backtraces of all these failures to the dovecot-list. Hope to see fixes in near future. ;-)
If you are not yet using Dovecot LDA could you please try it out and see if problems persist?
I am quite baffled about number of asserts you are seeing, even I have not lot of real users using alpha series and mbox I'm still hitting it pretty hard in lab environment but I am not able to produce any asserts with current code. Then again, this may just tell that I have to improve my testing methods ..
-- Tomi Hakala
Tomi Hakala wrote:
Anders Lund wrote:
Have given backtraces of all these failures to the dovecot-list. Hope to see fixes in near future. ;-)
If you are not yet using Dovecot LDA could you please try it out and see if problems persist?
I will consider looking more into Dovecot LDA, but I'm a bit reluctant to "try it out" on my server to see if the problems disappear. In the first place I wanted to keep my delivery separate from Dovecot, and just use Dovecot for IMAP. I have thought of looking more into Dovecot LDA to see if SIEVE would give me better support for a simple way for my users to add vacation messages, filters, etc., but not until later. I still hope to see fixes for the assertion failures in Dovecot CVS, but if everything would be solved by running Dovecot LDA I guess I will have to look more at this solution. Anyhow, I don't understand what is so special about my setup, and why I should get these failures.
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Tomi Hakala wrote:
Anders Lund wrote:
Anyhow, I don't understand what is so special about my setup, and why I should get these failures.
All I can think of is that you have some locking issue which is causing mbox corruption.
When it comes to locking I have this:
Postfix: mailbox_delivery_lock = fcntl, dotlock
Dovecot: lock_method = fcntl mbox_read_locks = fcntl dotlock mbox_write_locks = fcntl dotlock
"procmail -v" gives me: Locking strategies: dotlocking, fcntl()
I thought this should work fine, but if you spot any possible problems give me a hint.
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Anders Lund wrote:
Tomi Hakala wrote:
Anders Lund wrote:
Anyhow, I don't understand what is so special about my setup, and why I should get these failures.
All I can think of is that you have some locking issue which is causing mbox corruption.
When it comes to locking I have this:
Postfix: mailbox_delivery_lock = fcntl, dotlock
Dovecot: lock_method = fcntl mbox_read_locks = fcntl dotlock mbox_write_locks = fcntl dotlock
"procmail -v" gives me: Locking strategies: dotlocking, fcntl()
I thought this should work fine, but if you spot any possible problems give me a hint.
The documentation for dovecot and other mail software says, not only the locking methods but also the order of the methods used should match for all software accessing an mbox file.
For Postfix and Dovecot, the methods are the same for your configuration, but procmail uses them the other way round. So if you use procmail for mail delivery, this *could* cause problems, I guess.
Can you try to set locking for postfix and dovecot mbox to "dotlock fcntl" instead of "fcntl dotlock" and see if that improves your situation.
Best Regards, Michael Paesold
PS. This is my first post to this list, so Hello! to everyone.
Michael Paesold wrote:
For Postfix and Dovecot, the methods are the same for your configuration, but procmail uses them the other way round. So if you use procmail for mail delivery, this *could* cause problems, I guess.
Can you try to set locking for postfix and dovecot mbox to "dotlock fcntl" instead of "fcntl dotlock" and see if that improves your situation.
I've implemented this change now.
Anyone know it it's possible to change the way Procmail does locking without recompiling?
My setup is now:
Postfix: mailbox_delivery_lock = dotlock, fcntl
Dovecot: lock_method = dotlock mbox_read_locks = dotlock fcntl mbox_write_locks = dotlock fcntl
"procmail -v" gives me: Locking strategies: dotlocking, fcntl()
Had to set
mmap_disable=yes
for Dovecot for this to work, otherwise I get:
lock_method=dotlock and mmap_disable=no combination isn't supported. You don't _really_ want it anyway.
So, cross my fingers and hope for improvements. :)
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Anders Lund wrote:
Dovecot: lock_method = dotlock
You should leave lock_method as fcntl. It is only used for index access, which only dovecot uses, AFAIK.
mbox_read_locks = dotlock fcntl mbox_write_locks = dotlock fcntl
"procmail -v" gives me: Locking strategies: dotlocking, fcntl()
Had to set
mmap_disable=yes
for Dovecot for this to work, otherwise I get:
lock_method=dotlock and mmap_disable=no combination isn't supported. You don't _really_ want it anyway.
Just change back lock_method to fcntl. Then you can set mmap_disable to no.
Best Regards, Michael Paesold
On Thu, 3 Nov 2005, Anders Lund wrote:
Todd Burroughs wrote:
I'm now running CVS version as of Oct 28. A couple users have gotten a very similar assertion failure. Here's the log entry and backtrace: (seq == mail_ctx->seq)
I have reported this one before (running CVS from Oct 31) and sent backtraces as well. With this version of dovecot I've also seen assertion failures:
Anders,
Try running the current CVS version. Timo fixed a few things since Oct. 31 and I haven't seen an assert or any failure in 5 days now.
Todd
Todd Burroughs wrote:
Anders,
Try running the current CVS version. Timo fixed a few things since Oct. 31 and I haven't seen an assert or any failure in 5 days now.
Just upgraded to nightly from 2005-11-09 now. Tailing my log for assertion failures now and crossing my fingers again. ;-)
Correcting my settings for locking in Postfix and Dovecot (to have the same priorities as Procmail) might have given some good results as well.
And for one user I discovered another problem: this at least made the mboxes readable for pine, which had previous
- running "file" on mboxes showed: "ISO-8859 mail text, with CRLF, LF line terminators" which I fixed with "dos2unix" on the mboxes (still some of these have "ISO-8859 mail text, with CR, LF line terminators")
- in addition some messages had wrong date stamp, which I fixed with "sed -i 's/Jan 1 /Jan 01 /' *"
problems. It also seemed to remove "assertion failed" with dovecot for this user.
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Anders Lund wrote:
Todd Burroughs wrote:
Anders,
Try running the current CVS version. Timo fixed a few things since Oct. 31 and I haven't seen an assert or any failure in 5 days now.
Just upgraded to nightly from 2005-11-09 now. Tailing my log for assertion failures now and crossing my fingers again. ;-)
Correcting my settings for locking in Postfix and Dovecot (to have the same priorities as Procmail) might have given some good results as well.
And for one user I discovered another problem: this at least made the mboxes readable for pine, which had previous
- running "file" on mboxes showed: "ISO-8859 mail text, with CRLF, LF line terminators" which I fixed with "dos2unix" on the mboxes (still some of these have "ISO-8859 mail text, with CR, LF line terminators")
- in addition some messages had wrong date stamp, which I fixed with "sed -i 's/Jan 1 /Jan 01 /' *"
problems. It also seemed to remove "assertion failed" with dovecot for this user.
Just a short update: Since upgrade to the nightly build from 2005-11-09 I've had no assertion failed. It seems a lot more stable on my system now, so the combination of:
- setting correct locking priority (dotlock before fcntl in Dovecot/Postfix, to comply with Procmail)
- correcting mboxes (typically this is mboxes moved from local Thunderbird folders on laptops running Windows to the IMAP server)
- upgrading to the latest nightly from 2005-11-09 seemed to be a good solution for me. Thanks for the suggestions I've got!
One final issue I've seen: Thunderbird 1.0.7 sometimes seems to go into a state of being unable to contact the IMAP server (when accessing a new folder). When this has happened I've quit Thunderbird, but can still see imap process running on the server. If I restart Thunderbird I get a new imap process, and the old one is still running. This is perhaps a client issue? Have others seen this?
- Anders
--
Anders Lund <Anders.Lund@uninett.no> .~.
UNINETT, N-7465 Trondheim, Norway / V
Phone: +47 73 55 79 08 | Fax: +47 73 55 79 01 /( )
^ ^
Anders Lund wrote:
Anders Lund wrote:
Todd Burroughs wrote:
Anders,
Try running the current CVS version. Timo fixed a few things since Oct. 31 and I haven't seen an assert or any failure in 5 days now.
Just upgraded to nightly from 2005-11-09 now. Tailing my log for assertion failures now and crossing my fingers again. ;-)
New assertion failed for "imap" with nightly from 2005-11-09.
mbox format for the user.
file mbox-sync-rewrite.c: line 106 (mbox_sync_headers_add_space): assertion failed: (start_pos < data_size)
gdb 'bt" show:
#0 0xb7ed483b in raise () from /lib/tls/libc.so.6 #1 0xb7ed5fa2 in abort () from /lib/tls/libc.so.6 #2 0x080a29e5 in i_internal_panic_handler (fmt=0x0, args=0x0) at failures.c:375 #3 0x080a2479 in i_panic (format=0x0) at failures.c:173 #4 0x08076747 in mbox_sync_headers_add_space (ctx=0xbffff210, size=17788) at mbox-sync-rewrite.c:111 #5 0x08076ac4 in mbox_sync_try_rewrite (ctx=0xbffff210, move_diff=0) at mbox-sync-rewrite.c:252 #6 0x08072227 in mbox_sync_handle_header (mail_ctx=0xbffff210) at mbox-sync.c:736 #7 0x08072a94 in mbox_sync_loop (sync_ctx=0xbffff310, mail_ctx=0xbffff210, partial=1) at mbox-sync.c:1107 #8 0x08073872 in mbox_sync_do (sync_ctx=0xbffff310, flags=24) at mbox-sync.c:1448 #9 0x08073f2c in mbox_sync (mbox=0x80d09d0, flags=24) at mbox-sync.c:1695 #10 0x080743a4 in mbox_storage_sync_init (box=0x80d09d0, flags=3) at mbox-sync.c:1773 #11 0x08095cf6 in mailbox_sync_init (box=0x0, flags=6) at mail-storage.c:336 #12 0x0805fa1a in imap_sync_init (client=0x80ca9b0, box=0x80d09d0, flags=0) at imap-sync.c:44 #13 0x0805ff3a in cmd_sync (cmd=0x80ca9f0, flags=3, tagline=0x80b0c79 "OK Check completed.") at imap-sync.c:225 #14 0x08055b40 in cmd_check (cmd=0x80ca9f0) at cmd-check.c:11 #15 0x08059b0a in client_handle_input (cmd=0x80ca9f0) at client.c:338 #16 0x08059c2d in _client_input (context=0x80ca9b0) at client.c:390 #17 0x080a8968 in io_loop_handler_run (ioloop=0x80c99b0) at ioloop-poll.c:190 #18 0x080a7da8 in io_loop_run (ioloop=0x80c99b0) at ioloop.c:230 #19 0x08061bb9 in main (argc=1, argv=0x0, envp=0x0) at main.c:233
- Anders
participants (5)
-
Anders Lund
-
Michael Paesold
-
Timo Sirainen
-
Todd Burroughs
-
Tomi Hakala