Hello list, hello Dovecot developers,
this week I discovered a serious bug in Dovecot that led to several broken mails on our servers. The bug corrupts the first few characters of the mail header during saving. In our setup, it was almost always only the very first line of text that was corrupted.
Depending on the IMAP client (they seem to request different header fields, ... during mail access), the bug causes the imap process to hang up the TCP connection and log errors like this:
imap(USERNAME)<4767><TeQP4ASOTK5/AAAB>: Error: Corrupted record in index cache file /IMAP/mail/mailboxes/USERNAME/mdbox/mailboxes/Trash/dbox-Mails/dovecot.index.cache: UID 489113: Broken fields in mailbox Trash: read(attachments-connector(zlib(/IMAP/mail/mailboxes/USERNAME/mdbox/storage/m.813))): FETCH BODY[HEADER.FIELDS (RETURN-PATH SUBJECT)] got too little data: 2 vs 122
In the case that finally caught my attention, the client was the user's iPhone, which did not display any new messages, while his Thunderbird did.
The bug seems to be triggered by a bad "interaction" between the attachment_dir option and the zlib plugin. If you use both, you are most likely affected too, unless you use the zlib plugin only for reading mails that were previously stored compressed. That is also the workaround we use now: the zlib plugin is enabled in mail_plugins, but plugin/zlib_save is not set.
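For illustration, the workaround described above might look roughly like the following fragment. This is only a sketch, not the attached config: the setting name mail_attachment_dir is assumed to be what "attachment_dir" refers to, and the path is a hypothetical placeholder.

```
# Sketch only; the path is a placeholder, not taken from our servers.
mail_attachment_dir = /path/to/attachments

# Keep the zlib plugin loaded so already compressed mails stay readable.
mail_plugins = $mail_plugins zlib

plugin {
  # Workaround: leave zlib_save unset so new mails are stored uncompressed.
  # zlib_save = gz
}
```

With zlib_save commented out, the plugin should still decompress existing mails on read but no longer compress on save.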
The bug occurs only on very specific mails. For privacy reasons I cannot provide sample mails here. Storing such a mail seems to trigger the bug reproducibly.
I attached a very minimal doveconf -n config that can be used to trigger the bug. If one of the developers is interested, I can try to generate an "anonymized" version of such a mail that still causes the issue. I discovered the bug on our production systems, which run the latest Dovecot 2.2 release, but the latest 2.3, which I used during debugging, is affected too.
During debugging, I also found one hint that might help to find the bug: if you store a problematic mail with zlib_save=gz (or zlib_save=bz2) and then disable the zlib plugin in mail_plugins, you can call
doveadm fetch -u test hdr all | grep -v ^hdr: | gzip --decompress
on test's mailbox containing only that one broken mail. This will display the beginning of the rfc822 mail text until gzip terminates with "gzip: stdin: unexpected end of file", approximately after twice the length of the mail HEADER. This might indicate that Dovecot stores the uncompressed size of the header in its data structures although the mail is stored compressed.
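The same gzip symptom can be reproduced without Dovecot by cutting a compressed stream short by hand. The following is only a sketch with made-up sample data and an arbitrary 50% cut; it shows what gzip does with a truncated stream and says nothing about where Dovecot loses the data.

```shell
#!/bin/sh
# Sketch: reproduce gzip's "unexpected end of file" with a hand-truncated stream.
# The sample text and the cut position are arbitrary choices for illustration.
tmp=$(mktemp -d)

# Build a fake "mail" of 200 numbered header-like lines.
i=0
while [ "$i" -lt 200 ]; do
  printf 'X-Sample-%d: value %d\n' "$i" "$i"
  i=$((i + 1))
done > "$tmp/mail.txt"

gzip -c "$tmp/mail.txt" > "$tmp/mail.gz"

# Keep only the first half of the compressed bytes.
full=$(($(wc -c < "$tmp/mail.gz")))
head -c "$((full / 2))" "$tmp/mail.gz" > "$tmp/cut.gz"

# Decompress the truncated stream and remember gzip's exit status.
status=0
gzip -dc < "$tmp/cut.gz" > "$tmp/out.txt" 2> "$tmp/err.txt" || status=$?

echo "gzip exit status: $status"
echo "recovered $(($(wc -c < "$tmp/out.txt"))) of $(($(wc -c < "$tmp/mail.txt"))) bytes"
cat "$tmp/err.txt"
```

Whatever gzip manages to decode before it runs out of input is a prefix of the original text, which is the same pattern as in the doveadm fetch experiment.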
I also found a very efficient way to find all affected mails in our setup:
doveadm -f flow fetch -A 'user guid mailbox uid seq flags hdr' all |
grep -aE "^[^ ]+ user=" |
grep -avF ' hdr=Return-path: ' |
grep -av '.* hdr=[[:print:][:space:]]*$'
(runtime for ~6M mails on our servers was 20-30min)
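To make the three filter stages easier to follow, here is a small sketch that runs them on synthetic input instead of real doveadm output. The record layout, the user names and the GUIDs are invented for illustration, the `^user=` anchor is the one from the per-user variant of the command, and LC_ALL=C is my assumption to make the character classes behave predictably.

```shell
#!/bin/sh
# Sketch of the filter stages on synthetic data (not real doveadm output).
LC_ALL=C
export LC_ALL

# Stage 1: keep only the first line of each record (it starts with "user=").
# Stage 2: drop records whose header starts with an intact Return-path line.
# Stage 3: drop records whose first header line is plain printable text.
find_suspects() {
  grep -a '^user=' |
    grep -avF ' hdr=Return-path: ' |
    grep -av '.* hdr=[[:print:][:space:]]*$'
}

# Four hypothetical input lines: an intact mail, its continuation line,
# a mail with a corrupted first header line, and an intact mail that
# starts with a different header.
sample() {
  printf 'user=alice guid=g1 mailbox=INBOX uid=1 seq=1 flags= hdr=Return-path: <a@example.org>\n'
  printf 'Subject: continuation line of the same record\n'
  printf 'user=bob guid=g2 mailbox=Trash uid=7 seq=3 flags= hdr=\001\002rn-path: <b@example.org>\n'
  printf 'user=carol guid=g3 mailbox=INBOX uid=2 seq=1 flags= hdr=Received: from mx.example.org\n'
}

suspects=$(sample | find_suspects)
printf '%s\n' "$suspects"
```

Only the record with guid=g2 should survive all three stages, because its first header line contains bytes that are neither printable nor whitespace.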
If your storage system is powerful enough, the search can be sped up further with GNU parallel:
doveadm user '*' | parallel "doveadm -f flow fetch -u '{}' 'user guid mailbox uid seq flags hdr' all | grep -a '^user=' | grep -avF ' hdr=Return-path: ' | grep -av '.* hdr=[[:print:][:space:]]*$' || true"
(runtime for ~6M mails on our servers was ~4min)
The command gives you a list of mails that are possibly affected. Check the full output of
doveadm fetch -u USERNAME hdr guid GUID | less
to verify that the header is really broken.
On our systems I found 39 mails within ~12M mails.
I was able to recover these mails "manually" by reconstructing the Return-Path header line, importing the fixed mails and expunging the corrupt ones. Before importing, I obviously had to disable the zlib_save option.
Best regards,
Patrick Cernko <pcernko@mpi-klsb.mpg.de> +49 681 9325 5815
Joint Administration: Information Services and Technology
Max-Planck-Institute fuer Informatik & Softwaresysteme