[Dovecot] dovecot index errors since 2.1.12
Hi,
After upgrading to dovecot 2.1.12, I see a lot of these errors:
2.1.12: /etc/dovecot/dovecot.conf
OS: Linux 2.6.32-279.14.1.el6.x86_64 x86_64 CentOS release 6.3 (Final)
Filesystem: ext4 with mdbox storage
Dec 10 15:21:04 mail dovecot: imap(user@example.org): Error: Cached message size smaller than expected (5010 < 8192) Dec 10 15:21:04 mail dovecot: imap(user@example.org): Error: Corrupted index cache file /var/vmail/example.org/user//mdbox/mailboxes/INBOX/Ubuntu/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 27517 Dec 10 15:21:04 mail dovecot: imap(user@example.org): Error: read(/var/vmail/example.org/user//mdbox/storage/m.7123) failed: Input/output error (FETCH for mailbox INBOX/Ubuntu UID 27517) Dec 10 15:21:04 mail dovecot: imap(user@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-10 15:21:04] in=3391 out=1383967
Dec 17 21:38:34 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (12222 > 12085) Dec 17 21:38:34 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/LKML/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 216231 Dec 17 21:38:34 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15569) failed: Input/output error (FETCH for mailbox LKML UID 216231) Dec 17 21:38:34 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-17 21:38:34] in=4079 out=61996358
Dec 17 21:40:34 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (6726 > 5302) Dec 17 21:40:34 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/Google/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 930834 Dec 17 21:40:34 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15544) failed: Input/output error (FETCH for mailbox Google UID 930834) Dec 17 21:40:34 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-17 21:40:34] in=27666 out=29738162
Since the release of dovecot 2.1.x, I have never seen this error message. After downgrading to 2.1.10 the error is gone.
Andy ideas?
Best regards,
Morten
On 18.12.2012, at 23.37, Morten Stevens <mstevens@imt-systems.com> wrote:
Could you try if this fix helps? http://hg.dovecot.org/dovecot-2.1/rev/bded819417d9
On 18.12.2012 22:42, Timo Sirainen wrote:
Hi Timo,
Thanks for the quick reply, but it doesn't help.
Dec 18 23:25:28 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (5437 > 3778) Dec 18 23:25:28 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/Commit/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 684475 Dec 18 23:25:28 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15613) failed: Input/output error (FETCH for mailbox Commit UID 684475) Dec 18 23:25:28 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-18 23:25:28] in=6602 out=5191621
Dec 18 23:25:52 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (8371 > 7123) Dec 18 23:25:52 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/LKML/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 216722 Dec 18 23:25:52 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15604) failed: Input/output error (FETCH for mailbox LKML UID 216722) Dec 18 23:25:52 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-18 23:25:52] in=24312 out=41801670
Dec 18 23:26:00 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (6008 > 5194) Dec 18 23:26:00 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/Google/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 935885 Dec 18 23:26:00 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15620) failed: Input/output error (FETCH for mailbox Google UID 935885) Dec 18 23:26:00 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-18 23:26:00] in=6334 out=7859459
Dec 18 23:26:38 mail dovecot: imap(user2@example.org): Error: Cached message size smaller than expected (4433 < 5226) Dec 18 23:26:38 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/Bugs/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 497733 Dec 18 23:26:38 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15620) failed: Input/output error (FETCH for mailbox Bugs UID 497733) Dec 18 23:26:38 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-18 23:26:38] in=1589 out=1320584
Do you have an idea?
Best regards,
Morten
On 19.12.2012, at 0.40, Morten Stevens <mstevens@imt-systems.com> wrote:
Can you easily reproduce this with a new empty user, using some simple IMAP commands? Or alternatively running imaptest tool? http://imapwiki.org/ImapTest
Also send your full doveconf -n output. And especially: are you using zlib plugin compression? Storing attachments separately?
On 18.12.2012 23:44, Timo Sirainen wrote:
I'm not able to reproduce this with a new empty user and I am also not able to reproduce this error after I deleted the affected folders and recreated them.
Also send your full doveconf -n output. And especially: are you using zlib plugin compression? Storing attachments separately?
No zlib compression and we do not save the attachments separately.
Do you still need the full doveconf -n output? I think it's fixed with commit bded819417d9.
Best regards,
Morten
On 19.12.2012 00:22, Morten Stevens wrote:
Hi Timo,
I see the error again. (this is a newly created folder)
Dec 19 14:20:45 mail dovecot: imap-login: Login: user=<user2@example.org>, method=PLAIN, rip=109.90.118.156, lip=81.30.145.131, mpid=16599, TLS, session=<0jJoezTRewBtWnac> Dec 19 14:20:57 mail dovecot: imap(user2@example.org): Error: Cached message size larger than expected (4318 > 4244) Dec 19 14:20:57 mail dovecot: imap(user2@example.org): Error: Corrupted index cache file /var/vmail/example.org/user2//mdbox/mailboxes/Wiki/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 15 Dec 19 14:20:57 mail dovecot: imap(user2@example.org): Error: read(/var/vmail/example.org/user2//mdbox/storage/m.15626) failed: Input/output error (FETCH for mailbox Wiki UID 15) Dec 19 14:20:57 mail dovecot: imap(user2@example.org): Disconnected: Internal error occurred. Refer to server log for more information. [2012-12-19 14:20:57] in=2153 out=1200970
Do you need the index cache files?
Best regards,
Morten
On 19.12.2012, at 15.35, Morten Stevens <mstevens@imt-systems.com> wrote:
Hmh. I can't figure out how that could happen. Also the index files alone probably aren't enough.. Can you send the index files, storage/dovecot.map.index* files and storage/m.* files put through http://dovecot.org/tools/mdbox-obfuscate.pl to remove the actual message contents. Then only the dovecot.index.cache file has anything sensitive (some message headers).
Hello Timo, Hello list,
we are running 4 hosts with director+dovecot instances and storage on shared NFSv4.
Last week we upgraded to 2.1.13 and since then we encountered the same issues as Morten Stevens. We got a lot of error messages with corrupt index cache and FETCH errors: So we had to roll back to dovecot 2.1.10. Now the errors are gone.
Jan 12 16:48:35 10.129.3.233 dovecot: mailbox: mail: imap(user1@example.org): <TaljVxnT4gAKgQPp>: Error: Cached message size larger than expected (5194 > 4399) Jan 12 16:48:35 10.129.3.233 dovecot: mailbox: mail: imap(user1@example.org): <TaljVxnT4gAKgQPp>: Error: Corrupted index cache file /mail/dovecot/example.org/user1/mail/mailboxes/INBOX/Projekte/backup/MCS/Detail/dbox-Mails/dovecot.index.cache: Broken physical size for mail UID 17593 Jan 12 16:48:35 10.129.3.233 dovecot: mailbox: mail: imap(user1@example.org): <TaljVxnT4gAKgQPp>: Error: read(/mail/dovecot/example.org/user1/mail/storage/m.3070) failed: Input/output error (FETCH for mailbox INBOX/Projekte/backup/MCS/Detail UID 17593) Jan 14 08:18:44 10.129.3.249 dovecot: mailbox: mail: pop3(user1@example2.org): <pWmBdDrT6AAKgQO+>: Error: Cached message size smaller than expected (3149 < 3803)
This critical bug must have been introduced with the latest hg patches somewhere between 2.1.10 and 2.1.13. Please find and revert the problematic changes, since they break the 2.1.x stable release for us.
I'll send you m.3070 obfuscated and gzipped in a separate mail.
Regards Daniel
On 14.1.2013, at 17.16, Daniel Parthey <daniel.parthey@informatik.tu-chemnitz.de> wrote:
Does reverting this change also fix it for you? http://hg.dovecot.org/dovecot-2.1/rev/6c1b4b9f527c
I guess you're using only LMTP to deliver mails? How many CPU cores do you have?
Also how much is "a lot", especially compared to the number of users you have / emails you get? I think it didn't happen very often for Morten (something like 1 error per several tens of thousands of mails?)
Timo Sirainen wrote:
Thanks, did not find this hint on the mailinglist yet. Will try (sooner or later) and keep you informed.
I guess you're using only LMTP to deliver mails?
Yes, LMTP via director port 20024 only.
How many CPU cores do you have?
4 physical machines with Dual Core.
Also how much is "a lot", especially compared to the number of users you have / emails you get? I think it didn't happen very often for Morten (something like 1 error per several tens of thousands of mails?)
I must admit, "a lot" was way too exaggerated ;)
Around 10000 successful LMTP deliveries a day. Around 1 to 3 errors a day.
But data corruption is something we can neither accept, nor explain to our customers, since indexes are very important with mdbox.
Regards Daniel
Timo Sirainen wrote:
Is it safe to run 2.1.13 plus reverted 6c1b4b9f527c on just one machine for a few days of testing while keeping the other machines of the cluster at 2.1.10, using the same NFS storage?
Regards Daniel
Timo Sirainen wrote:
Yes, I can confirm 2.1.16 to be running smoothly without any cache issues. Thanks, Timo!
I guess you're using only LMTP to deliver mails?
Yes, LMTP director proxy only.
How many CPU cores do you have?
2 CPU cores on each of the 4 machines.
Regards Daniel
On Mon, 2013-01-14 at 16:16 +0100, Daniel Parthey wrote:
I wonder if this patch causes your lmtp/lda processes to start assert-crashing (which at least with lmtp causes MTA to retry, with lda might reject mail depending on config): http://hg.dovecot.org/dovecot-2.1/rev/2b76d357a56a
Since the bug appears to be in the mail delivery part, it would be much easier to debug&fix it by looking at what goes wrong there.
On Tue, 2013-01-22 at 16:38 +0200, Timo Sirainen wrote:
participants (3)
-
Daniel Parthey
-
Morten Stevens
-
Timo Sirainen