[Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 'FrFrom'

Dimos Alevizos dalevizo at otenet.gr
Thu Feb 13 13:52:30 UTC 2014


At last, I was finally able to find time and run a backtrace, I've 
attached the file.
If you don't remember (and hey, you can blame you after all this time) 
the problem is only with a couple (~10) of users among the hundreds of 
thousands currently using the service.

Please let me know if I should provide any other info, logs or whatever.

Dimos Alevizos

-------- Original Message --------
Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 
'FrFrom'
From: Timo Sirainen <tss at iki.fi>
To: Dimos Alevizos <dalevizo at otenet.gr>
CC: Dimitris Paouris <dpaou at otenet.gr>, Dovecot Mailing List 
<dovecot at dovecot.org>
Date: 02/11/2013 01:16 μμ

> Well, if that patch didn’t work, then the problem is elsewhere. There aren’t many other good possibilities left though.. How about adding this patch, it should be even safer than the previous one:
>
> http://hg.dovecot.org/dovecot-2.2/rev/d3062d066593
>
> On 30.10.2013, at 12.42, Dimos Alevizos <dalevizo at otenet.gr> wrote:
>
>> I'm afraid it doesn't seem to be working.
>> I've compiled a patched 2.2.6 dovecot with the patch you sent and installed it in a production server (had to be 2.2.6 cause we've upgraded all the rest since I begun this thread months ago) and although we still have mbox corruptions (rarely as before) the server isn't crashing :
>>
>> Oct 30 11:15:19 pop04 dovecot: pop3-login: Login: user=<artower at otenet.gr>, method=PLAIN, rip=85.72.232.35, lip=83.235.66.43, mpid=24419, secured, session=<+0ywxfHpIQBVSOgj>
>> Oct 30 11:15:20 pop04 dovecot: pop3(artower at otenet.gr): Disconnected: Logged out top=0/0, retr=0/0, del=0/1336, size=471029518
>> Oct 30 11:19:12 pop04 dovecot: lmtp(2863, artower at otenet.gr): r7U3KnyhcFIvCwAAckDtvw: msgid=<E1VbRvh-00040e-Ol at cpmail.force24.dedicated.catalyst2.com>: size=17823 saved mail to INBOX
>> Oct 30 11:33:12 pop04 dovecot: pop3-login: Login: user=<artower at otenet.gr>, method=PLAIN, rip=85.72.224.94, lip=83.235.66.43, mpid=600, secured, session=<vT2aBfLpxQBVSOBe>
>> Oct 30 11:33:12 pop04 dovecot: pop3(artower at otenet.gr): Error: Syncing INBOX failed: Mailbox isn't a valid mbox file
>> Oct 30 11:33:12 pop04 dovecot: pop3(artower at otenet.gr): Error: Couldn't init INBOX: Mailbox isn't a valid mbox file
>> Oct 30 11:33:12 pop04 dovecot: pop3(artower at otenet.gr): Mailbox init failed top=0/0, retr=0/0, del=0/0, size=0
>> Oct 30 11:33:33 pop04 dovecot: lmtp(16314, artower at otenet.gr): Au4vIMqucFK6PwAAckDtvw: msgid=<004401ced552$bb5ecd70$321c6850$@planet.nl>: size=7975817 save failed to INBOX: Mailbox isn't a valid mbox file
>>
>> Perhaps the patch is only valid for 2.1.16 and needs to be modified for 2.2.6 ?
>>
>> Thank you for your time
>> Dimos Alevizos
>>
>>
>> -------- Original Message --------
>> Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 'FrFrom'
>> From: Timo Sirainen <tss at iki.fi>
>> To: Dimos Alevizos <dalevizo at otenet.gr>
>> CC: dovecot at dovecot.org, Dimitris Paouris <dpaou at otenet.gr>
>> Date: 26/06/2013 06:59 μμ
>>
>>> It crashes one specific IMAP/POP3 session, so others are unaffected. The potential problems:
>>>
>>>   * It might cause the user's mbox to become crashing constantly. so first crash -> client reconnects -> client attempts the same operation -> crash again. Then again, this might not happen, it depends.
>>>
>>>   * The mbox file would probably become slightly more corrupted than normally, because it doesn't finish moving data around. No data should get actually lost, but some parts could become duplicated (e.g. some headers or even mails, possibly causing UID renumbering = redownloading).
>>>
>>> So not ideal in production, but shouldn't be too bad either, especially if you just wait for the first crash and then immediately switch to the old unpatched version.
>>>
>>> On 26.6.2013, at 10.21, Dimos Alevizos <dalevizo at otenet.gr> wrote:
>>>
>>>> Hi,
>>>>
>>>> I haven't had the time to compile it yet, but a question just occurred.
>>>> Given that it's so rare and we can't reproduce it on a dev server, how safe is this to use on a production server ?
>>>> When you say "crash" you mean the whole dovecot server or that specific client's child ?
>>>>
>>>> D.
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 'FrFrom'
>>>> From: dalevizo <dalevizo at otenet.gr>
>>>> To: Timo Sirainen <tss at iki.fi>
>>>> CC: dovecot at dovecot.org, Dimitris Paouris <dpaou at otenet.gr>
>>>> Date: 24/06/2013 01:41 μμ
>>>>
>>>>> Thanx I'll try the patch as soon as possible and I'll let you know.
>>>>> It is indeed very rare. We're only seeing 4-5 corruptions in about 13
>>>>> million logins per day.
>>>>> I've been trying to convince our design team that we should move to
>>>>> maildir, but the truth is that it's quite a change, and we're way too
>>>>> busy to deal with everything else AND a migration from mbox to maildir.
>>>>>
>>>>> D.
>>>>>
>>>>> On Mon 24/06/2013 13:16, Timo Sirainen wrote:
>>>>>> On 19.6.2013, at 16.00, Dimos Alevizos <dalevizo at otenet.gr> wrote:
>>>>>>
>>>>>>> we're having some problems with our dovecot setup.
>>>>>>> I've seen similar problems in the mailing list some years ago but alas wasn't able to find a solution.
>>>>>>>
>>>>>>> Our setup is as follows :
>>>>>>> An MX farm (postfix) sends mails via LMTP to a director farm (dovecot 2.1.12) which proxies pop3/imap/lmtp traffic to a dovecot farm (dovecot 2.1.16).
>>>>>>> All mailboxes and indexes are on NFS and all servers are Centos.
>>>>>>>
>>>>>>> The problem is that at times we see mailboxes (all of them are in mbox format) beginning with FFrom or FrFrom and of course dovecot says it's not a valid mbox file.
>>>>>>
>>>>>> This is quite an old bug, but it happens rarely enough that I haven't been able to reproduce and fix it. Actually people hadn't complained about it for a long time now, so I had assumed it had somehow gotten fixed already.
>>>>>>
>>>>>> With the attached debug patch it should crash instead of (completely) corrupting the mbox file. Debugging the resulting core file with gdb could be useful in figuring this out.
>>>>>>
>>>>>> Although I wouldn't recommend mbox format for any big installation anyway..
>>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
Feb 13 15:41:08 pop06 dovecot: imap-login: Login: user=<someuser at somedomain>, method=PLAIN, rip=195.170.0.170, lip=83.235.66.45, mpid=8157, session=<E/1110nyQADDqgCq>
Feb 13 15:41:09 pop06 dovecot: imap(someuser at somedomain): Panic: file mbox-sync.c: line 655 (mbox_sync_handle_header): assertion failed: (mail_ctx->mail.from_offset + move_diff != 1 && mail_ctx->mail.from_offset + move_diff != 2)
Feb 13 15:41:09 pop06 dovecot: imap(someuser at somedomain): Error: Raw backtrace: /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0 [0x2adf54e7e9a7] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_sync+0x620) [0x2adf54e7f160] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_storage_sync_init+0x83) [0x2adf54e80093] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync_init+0x31) [0x2adf54e99da1] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync+0x27) [0x2adf54e9abc7] -> dovecot/imap [someuser at somedomain 195.170.0.170 SELECT](cmd_select_full+0x219) [0x4111c9] -> dovecot/imap [someuser at somedomain 195.170.0.170 SELECT](cmd_select+0xb) [0x411afb] -> dovecot/imap [someuser at somedomain 195.170.0.170 SELECT](command_exec+0x37) [0x4165e7] -> dovecot/imap [d
Feb 13 15:41:09 pop06 dovecot: imap(someuser at somedomain): Fatal: master: service(imap): child 8157 killed with signal 6 (core dumped)

root at pop06:someuser $ gdb /opt/dovecot-2.2.9/libexec/imap core.8157 
GNU gdb (GDB) CentOS (7.0.1-45.el5.centos)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/dovecot-2.2.9/libexec/imap...done.
[New Thread 8157]
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so
Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so...done.
Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so
Reading symbols from /lib64/libexpat.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libexpat.so.0
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff49dfd000
Core was generated by `dovecot/imap [someuser at somed'.
Program terminated with signal 6, Aborted.
#0  0x0000003efe030265 in raise () from /lib64/libc.so.6
(gdb) bt full
#0  0x0000003efe030265 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003efe031d10 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00002adf5519241e in default_fatal_finish (type=<value optimized out>, status=0) at failures.c:192
        backtrace = 0x14d90370 "/opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] ->"...
#3  0x00002adf55192466 in i_internal_fatal_handler (ctx=0x7fff49d91340, format=<value optimized out>, args=<value optimized out>) at failures.c:653
        status = 0
#4  0x00002adf55191583 in i_panic (format=0x1fdd <Address 0x1fdd out of bounds>) at failures.c:264
        ctx = {type = LOG_TYPE_PANIC, exit_status = 0, timestamp = 0x0}
        args = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0x7fff49d91410, reg_save_area = 0x7fff49d91350}}
#5  0x00002adf54e7e9a7 in mbox_sync_handle_header (sync_ctx=0x7fff49d935e0, mail_ctx=0x7fff49d937d0, partial=true) at mbox-sync.c:654
        orig_from_offset = 14200194
        move_diff = -14200194
        ret = <value optimized out>
        __FUNCTION__ = "mbox_sync_handle_header"
#6  mbox_sync_loop (sync_ctx=0x7fff49d935e0, mail_ctx=0x7fff49d937d0, partial=true) at mbox-sync.c:1138
        rec = 0x14dc0d20
        uid = <value optimized out>
        messages_count = 302
        offset = 14200194
        ret = <value optimized out>
        expunged = false
        skipped_mails = false
        uids_broken = false
#7  0x00002adf54e7f160 in mbox_sync_int (mbox=0x14db5b30, flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:1619
        index_sync_ctx = 0x14dd3710
        sync_view = 0x14dd3780
        ret = <value optimized out>
        delay_writes = true
        readonly = false
        trans = 0x14dbf420
        sync_ctx = {mbox = 0x14db5b30, flags = MBOX_SYNC_UNDIRTY, input = 0x14dd4fe8, file_input = 0x14dd4e18, write_fd = 15, orig_mtime = 1392290551, orig_atime = 1391079897, 
          orig_size = 29081201, last_stat = {st_dev = 25, st_ino = 30254297, st_nlink = 1, st_mode = 33152, st_uid = 1904649, st_gid = 550, pad0 = 0, st_rdev = 0, st_size = 29081201, 
            st_blksize = 32768, st_blocks = 56920, st_atim = {tv_sec = 1391079897, tv_nsec = 0}, st_mtim = {tv_sec = 1392290551, tv_nsec = 0}, st_ctim = {tv_sec = 1392290551, 
              tv_nsec = 961280000}, __unused = {0, 0, 0}}, index_sync_ctx = 0x14dd3710, sync_view = 0x14dd3780, t = 0x14dbf420, reset_hdr = {major_version = 0 '\000', minor_version = 0 '\000', 
            base_header_size = 0, header_size = 0, record_size = 0, compat_flags = 0 '\000', unused = "\000\000", indexid = 0, flags = 0, uid_validity = 0, next_uid = 0, messages_count = 0, 
            unused_old_recent_messages_count = 0, seen_messages_count = 0, deleted_messages_count = 0, first_recent_uid = 0, first_unseen_uid_lowwater = 0, first_deleted_uid_lowwater = 0, 
            log_file_seq = 0, log_file_tail_offset = 0, log_file_head_offset = 0, unused_old_sync_size = 0, unused_old_sync_stamp = 0, day_stamp = 0, day_first_uid = {0, 0, 0, 0, 0, 0, 0, 0}}, 
          hdr = 0x14dbf090, header = 0x14dcb070, from_line = 0x14dcd100, base_uid_validity = 1387910574, base_uid_last = 439, base_uid_last_offset = 0, mails = {arr = {buffer = 0x14dcb0b0, 
              element_size = 64}, v = 0x14dcb0b0, v_modifiable = 0x14dcb0b0}, sync_changes = 0x14dcb0f0, mail_keyword_pool = 0x14dbf780, saved_keywords_pool = 0x14de7480, prev_msg_uid = 242, 
          next_uid = 440, idx_next_uid = 440, seq = 105, idx_seq = 105, need_space_seq = 0, last_nonrecent_uid = 0, expunged_space = 14200194, space_diff = 0, dest_first_mail = 1, 
          first_mail_crlf_expunged = 0, keep_recent = 0, readonly = 0, delay_writes = 1, renumber_uids = 0, moved_offsets = 0, ext_modified = 0, index_reset = 0, errors = 0}
        sync_flags = <value optimized out>
        changed = <value optimized out>
        __FUNCTION__ = "mbox_sync_int"
#8  mbox_sync (mbox=0x14db5b30, flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:2000
        lock_id = 3
        ret = <value optimized out>
        __FUNCTION__ = "mbox_sync"
#9  0x00002adf54e80093 in mbox_storage_sync_init (box=0x14db5b30, flags=65) at mbox-sync.c:2049
        mbox = 0x1fdd
        mbox_sync_flags = 8157
        ret = 0
#10 0x00002adf54e99da1 in mailbox_sync_init (box=0x14db5b30, flags=65) at mail-storage.c:1668
        _data_stack_cur_id = 4
        ctx = <value optimized out>
#11 0x00002adf54e9abc7 in mailbox_sync (box=0x1fdd, flags=8157) at mail-storage.c:1716
        ctx = 0x14d99110
        status = {sync_delayed_expunges = 0}
#12 0x00000000004111c9 in select_open (cmd=0x14db3310, readonly=false) at cmd-select.c:313
No locals.
#13 cmd_select_full (cmd=0x14db3310, readonly=false) at cmd-select.c:439
        client = 0x14db2b00
        ctx = 0x14db3408
        args = 0x14d9ba98
        list_args = 0x101010101010101
        mailbox = 0x14d902b8 "INBOX"
        error = <value optimized out>
        ret = <value optimized out>
        __FUNCTION__ = "cmd_select_full"
---Type <return> to continue, or q <return> to quit---
#14 0x0000000000411afb in cmd_select (cmd=0x1fdd) at cmd-select.c:448
No locals.
#15 0x00000000004165e7 in command_exec (cmd=0x14db3310) at imap-commands.c:158
        hook = 0x14d99110
        ret = <value optimized out>
#16 0x00000000004153db in client_command_input (cmd=0x1fdd) at imap-client.c:780
        client = 0x14db2b00
        command = <value optimized out>
        __FUNCTION__ = "client_command_input"
#17 0x0000000000415495 in client_command_input (cmd=0x14db3310) at imap-client.c:841
        client = 0x14db2b00
        command = 0x6
        __FUNCTION__ = "client_command_input"
#18 0x000000000041564f in client_handle_next_command (client=0x14db2b00) at imap-client.c:879
No locals.
#19 client_handle_input (client=0x14db2b00) at imap-client.c:891
        _data_stack_cur_id = 3
        remove_io = <value optimized out>
        handled_commands = false
        __FUNCTION__ = "client_handle_input"
#20 0x0000000000415a52 in client_input (client=0x14db2b00) at imap-client.c:933
        cmd = <value optimized out>
        output = 0x14db0c00
        bytes = 19
        __FUNCTION__ = "client_input"
#21 0x00002adf551a1676 in io_loop_call_io (io=0x14db0cd0) at ioloop.c:388
        ioloop = 0x14d98740
        t_id = 2
#22 0x00002adf551a2c7d in io_loop_handler_run (ioloop=<value optimized out>) at ioloop-epoll.c:220
        ctx = <value optimized out>
        event = 0x14d9a250
        list = 0x14db0d20
        io = 0x1fdd
        tv = {tv_sec = 3, tv_usec = 790044}
        events_count = <value optimized out>
        msecs = <value optimized out>
        ret = 1
        i = 0
        call = false
        __FUNCTION__ = "io_loop_handler_run"
#23 0x00002adf551a15f8 in io_loop_run (ioloop=0x14d98740) at ioloop.c:412
        __FUNCTION__ = "io_loop_run"
#24 0x00002adf551528d3 in master_service_run (service=0x14d985d0, callback=0x1fdd) at master-service.c:566
No locals.
#25 0x000000000041ef07 in main (argc=2, argv=0x14d98390) at main.c:400
        login_set = {auth_socket_path = 0x14d90060 "\210", postlogin_socket_path = 0x14d900b8 "", postlogin_timeout_secs = 60, callback = 0x41f100 <login_client_connected>, 
          failure_callback = 0x41f0a0 <login_client_failed>, request_auth_token = 1}
        service_flags = <value optimized out>
        storage_service_flags = MAIL_STORAGE_SERVICE_FLAG_DISALLOW_ROOT
        username = 0x0
        c = <value optimized out>
        set_roots = {0x426b20, 0x0}
(gdb) 


More information about the dovecot mailing list