Pigeonhole/sieve possibly corrupting mails
Hi!
I'm here again with a problem. I'm using dovecot as an IMAP server and LDA, filtering mail via sieve. However, few times a day I get the following error on server and my client (mutt) gets disconnected.
Oct 15 20:20:29 ibex dovecot: imap(krakonos): Error: Corrupted index cache file /home/krakonos/.mbox/.imap/INBOX/dovecot.index.cache: Broken physical s ize for mail UID 149418 in mailbox INBOX: read(/home/krakonos/.mbox/inbox) failed: Cached message size smaller than expected (3793 < 8065, box=INBOX, UID=149418, cached Message-Id=88deda0d-86f6-6115-af10-60ac06bb2d22@rename-it.nl) Oct 15 20:20:29 ibex dovecot: imap(krakonos): Error: read(/home/krakonos/.mbox/inbox) failed: Cached message size smaller than expected (3793 < 8065, box=INBOX, UID=149418, cached Message-Id=88deda0d-86f6-6115-af10-60ac06bb2d22@rename-it.nl) (FETCH BODY[] for mailbox INBOX UID 149418) Oct 15 20:20:29 ibex dovecot: imap(krakonos): FETCH read() failed in=110326 out=5115197
This is on a new message (attached), and this error happens on some messages when first opened. Once I reconnect, the message always opens fine, and no old message ever causes problem.
I also noticed this error, which is possibly connected:
Oct 15 20:15:12 ibex dovecot: lda(krakonos): Error: Next message unexpectedly corrupted in mbox file /home/krakonos/.mbox/inbox at 546862809
The filesystem is ext4, and there are no errors in syslog or problems with any other services.
I also don't access the mbox locally, and only dovecot manipulates the mbox (via imap and mailbox_command = /usr/libexec/dovecot/deliver)
The postfix version is 2.2.25. I'm attaching dovecot -n and the offending message (after it's been corrected). I'd rather not publish my sieve file, but will send it privately.
The offending message also contains other message I received at approximately the same time.
Any hint's on what could be wrong?
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
... Bump. Anything?
On Sat, Oct 15, 2016 at 08:59:24PM +0200, Ladislav Laska wrote:
Hi!
I'm here again with a problem. I'm using dovecot as an IMAP server and LDA, filtering mail via sieve. However, few times a day I get the following error on server and my client (mutt) gets disconnected.
Oct 15 20:20:29 ibex dovecot: imap(krakonos): Error: Corrupted index cache file /home/krakonos/.mbox/.imap/INBOX/dovecot.index.cache: Broken physical s ize for mail UID 149418 in mailbox INBOX: read(/home/krakonos/.mbox/inbox) failed: Cached message size smaller than expected (3793 < 8065, box=INBOX, UID=149418, cached Message-Id=88deda0d-86f6-6115-af10-60ac06bb2d22@rename-it.nl) Oct 15 20:20:29 ibex dovecot: imap(krakonos): Error: read(/home/krakonos/.mbox/inbox) failed: Cached message size smaller than expected (3793 < 8065, box=INBOX, UID=149418, cached Message-Id=88deda0d-86f6-6115-af10-60ac06bb2d22@rename-it.nl) (FETCH BODY[] for mailbox INBOX UID 149418) Oct 15 20:20:29 ibex dovecot: imap(krakonos): FETCH read() failed in=110326 out=5115197
This is on a new message (attached), and this error happens on some messages when first opened. Once I reconnect, the message always opens fine, and no old message ever causes problem.
I also noticed this error, which is possibly connected:
Oct 15 20:15:12 ibex dovecot: lda(krakonos): Error: Next message unexpectedly corrupted in mbox file /home/krakonos/.mbox/inbox at 546862809
The filesystem is ext4, and there are no errors in syslog or problems with any other services.
I also don't access the mbox locally, and only dovecot manipulates the mbox (via imap and mailbox_command = /usr/libexec/dovecot/deliver)
The postfix version is 2.2.25. I'm attaching dovecot -n and the offending message (after it's been corrected). I'd rather not publish my sieve file, but will send it privately.
The offending message also contains other message I received at approximately the same time.
Any hint's on what could be wrong?
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
# 2.2.25 (7be1766): /etc/dovecot/dovecot.conf # Pigeonhole version 0.4.15 (97b3da0) # OS: Linux 4.0.4-gentoo x86_64 Gentoo Base System release 2.2 auth_username_format = %n hostname = ibex.krakonos.org login_greeting = Dovecot@krakonos.org ready. mail_debug = yes mail_location = mbox:~/.mbox namespace inbox { inbox = yes location = mailbox Drafts { special_use = \Drafts } mailbox Junk { special_use = \Junk } mailbox Sent { special_use = \Sent } mailbox "Sent Messages" { special_use = \Sent } mailbox Trash { special_use = \Trash } prefix = } passdb { args = * driver = pam } passdb { args = scheme=CRYPT username_format=%u /etc/dovecot/users driver = passwd-file } plugin { sieve = file:~/sieve;active=~/.dovecot.sieve sieve_execute_bin_dir = /usr/lib/dovecot/sieve-execute sieve_execute_socket_dir = sieve-execute sieve_extensions = +vnd.dovecot.filter +editheader sieve_filter_bin_dir = /usr/lib/dovecot/sieve-filter sieve_filter_socket_dir = sieve-filter sieve_pipe_bin_dir = /usr/lib/dovecot/sieve-pipe sieve_pipe_socket_dir = sieve-pipe sieve_plugins = sieve_extprograms } postmaster_address = postmaster@krakonos.org protocols = imap service auth { unix_listener /var/spool/postfix/private/auth { mode = 0666 } } ssl_cert =
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
On 15 Oct 2016, at 21:59, Ladislav Laska laska@kam.mff.cuni.cz wrote:
Hi!
I'm here again with a problem. I'm using dovecot as an IMAP server and LDA, filtering mail via sieve. However, few times a day I get the following error on server and my client (mutt) gets disconnected.
Oct 15 20:20:29 ibex dovecot: imap(krakonos): Error: Corrupted index cache file /home/krakonos/.mbox/.imap/INBOX/dovecot.index.cache: Broken physical s ize for mail UID 149418 in mailbox INBOX: read(/home/krakonos/.mbox/inbox) failed: Cached message size smaller than expected (3793 < 8065, box=INBOX, UID=149418, cached
..
Oct 15 20:15:12 ibex dovecot: lda(krakonos): Error: Next message unexpectedly corrupted in mbox file /home/krakonos/.mbox/inbox at 546862809
Somehow Dovecot thinks that the mbox file changed under it..
The filesystem is ext4, and there are no errors in syslog or problems with any other services.
I also don't access the mbox locally, and only dovecot manipulates the mbox (via imap and mailbox_command = /usr/libexec/dovecot/deliver)
So it shouldn't have broken.
The postfix version is 2.2.25. I'm attaching dovecot -n and the offending message (after it's been corrected). I'd rather not publish my sieve file, but will send it privately.
The offending message also contains other message I received at approximately the same time.
Any hint's on what could be wrong?
These mbox corruptions are usually pretty difficult to reproduce (= impossible to fix without ability to reproduce). You could try if you can (reliably) reproduce it in some way, e.g.:
- Create a test folder: doveadm mailbox create -u krakonos testbox
- Use some combination of:
- Save mail(s) to test folder: cat some-mails | doveadm save -u krakonos testbox
- Try to read mails from test folder: doveadm fetch -u krakonos text mailbox testbox > /dev/null
The fetch should print similar errors to stderr in some way. I attempted to reproduce this way with your msg-error.mbox, but it worked ok.
Hi!
Somehow Dovecot thinks that the mbox file changed under it..
Yes. And it's probably right, but I wonder what could have changed it. I looked around inotify and it seems there is no way to let a file being watched and get program names/pids of processes accessing it.
These mbox corruptions are usually pretty difficult to reproduce (= impossible to fix without ability to reproduce). You could try if you can (reliably) reproduce it in some way, e.g.:
I can reproduce them multiple times a day :-). But not on command, and probably not on another machine, I know...
- Create a test folder: doveadm mailbox create -u krakonos testbox
- Use some combination of:
- Save mail(s) to test folder: cat some-mails | doveadm save -u krakonos testbox
- Try to read mails from test folder: doveadm fetch -u krakonos text mailbox testbox > /dev/null
Well, that's something. doveadm-save doesn't have a manpage, and there is nothing about it on wiki. Is it something new? Also, it doesn't seem to work.
The fetch should print similar errors to stderr in some way. I attempted to reproduce this way with your msg-error.mbox, but it worked ok.
Thinking about it, it might be that I'm fetching the message just as dovecot delivers another one.
Is it possible that fcntl locking is just not working? I'm running a bit older kernel, if that could play a role in it. I'll try to enable dotlock even on read and see if the problem persists.
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
On October 20, 2016 at 8:03 PM Ladislav Laska laska@kam.mff.cuni.cz wrote:
Hi!
Somehow Dovecot thinks that the mbox file changed under it..
Yes. And it's probably right, but I wonder what could have changed it. I looked around inotify and it seems there is no way to let a file being watched and get program names/pids of processes accessing it.
These mbox corruptions are usually pretty difficult to reproduce (= impossible to fix without ability to reproduce). You could try if you can (reliably) reproduce it in some way, e.g.:
I can reproduce them multiple times a day :-). But not on command, and probably not on another machine, I know...
- Create a test folder: doveadm mailbox create -u krakonos testbox
- Use some combination of:
- Save mail(s) to test folder: cat some-mails | doveadm save -u krakonos testbox
- Try to read mails from test folder: doveadm fetch -u krakonos text mailbox testbox > /dev/null
Well, that's something. doveadm-save doesn't have a manpage, and there is nothing about it on wiki. Is it something new? Also, it doesn't seem to work.
The fetch should print similar errors to stderr in some way. I attempted to reproduce this way with your msg-error.mbox, but it worked ok.
Thinking about it, it might be that I'm fetching the message just as dovecot delivers another one.
Is it possible that fcntl locking is just not working? I'm running a bit older kernel, if that could play a role in it. I'll try to enable dotlock even on read and see if the problem persists.
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
You could try running lsof in hopes of catching it. Might be rather difficult though.
Aki
Well, I tried.
for i in {1..50}; do echo x | mail -s test krakonos+test@krakonos.org; done
and running lsof. Didn't catch a single lockfile. lsof runs about 1s, so there is little chance of catching it. However, I was reading the mails while they were being delivered, and didn't trigger the problem.
I'll let it happen once more, so I know it's still reproducible and add dotfile locks even for read, and see if it helps.
Or is it possible to enable lock debugging, or perhaps run it completely synchronized (I don't have a lot of traffic, so a little slowdown isn't an issue).
On Thu, Oct 20, 2016 at 08:20:51PM +0300, Aki Tuomi wrote:
On October 20, 2016 at 8:03 PM Ladislav Laska laska@kam.mff.cuni.cz wrote:
Hi!
Somehow Dovecot thinks that the mbox file changed under it..
Yes. And it's probably right, but I wonder what could have changed it. I looked around inotify and it seems there is no way to let a file being watched and get program names/pids of processes accessing it.
These mbox corruptions are usually pretty difficult to reproduce (= impossible to fix without ability to reproduce). You could try if you can (reliably) reproduce it in some way, e.g.:
I can reproduce them multiple times a day :-). But not on command, and probably not on another machine, I know...
- Create a test folder: doveadm mailbox create -u krakonos testbox
- Use some combination of:
- Save mail(s) to test folder: cat some-mails | doveadm save -u krakonos testbox
- Try to read mails from test folder: doveadm fetch -u krakonos text mailbox testbox > /dev/null
Well, that's something. doveadm-save doesn't have a manpage, and there is nothing about it on wiki. Is it something new? Also, it doesn't seem to work.
The fetch should print similar errors to stderr in some way. I attempted to reproduce this way with your msg-error.mbox, but it worked ok.
Thinking about it, it might be that I'm fetching the message just as dovecot delivers another one.
Is it possible that fcntl locking is just not working? I'm running a bit older kernel, if that could play a role in it. I'll try to enable dotlock even on read and see if the problem persists.
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
You could try running lsof in hopes of catching it. Might be rather difficult though.
Aki
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
Hi!
So I've been running dovecot with rw dotlocks, and it seems the frequency of failures didn't drop.
Do you have any more ideas on what could be going on?
On Thu, Oct 20, 2016 at 07:31:20PM +0200, Ladislav Laska wrote:
Well, I tried.
for i in {1..50}; do echo x | mail -s test krakonos+test@krakonos.org; done
and running lsof. Didn't catch a single lockfile. lsof runs about 1s, so there is little chance of catching it. However, I was reading the mails while they were being delivered, and didn't trigger the problem.
I'll let it happen once more, so I know it's still reproducible and add dotfile locks even for read, and see if it helps.
Or is it possible to enable lock debugging, or perhaps run it completely synchronized (I don't have a lot of traffic, so a little slowdown isn't an issue).
On Thu, Oct 20, 2016 at 08:20:51PM +0300, Aki Tuomi wrote:
On October 20, 2016 at 8:03 PM Ladislav Laska laska@kam.mff.cuni.cz wrote:
Hi!
Somehow Dovecot thinks that the mbox file changed under it..
Yes. And it's probably right, but I wonder what could have changed it. I looked around inotify and it seems there is no way to let a file being watched and get program names/pids of processes accessing it.
These mbox corruptions are usually pretty difficult to reproduce (= impossible to fix without ability to reproduce). You could try if you can (reliably) reproduce it in some way, e.g.:
I can reproduce them multiple times a day :-). But not on command, and probably not on another machine, I know...
- Create a test folder: doveadm mailbox create -u krakonos testbox
- Use some combination of:
- Save mail(s) to test folder: cat some-mails | doveadm save -u krakonos testbox
- Try to read mails from test folder: doveadm fetch -u krakonos text mailbox testbox > /dev/null
Well, that's something. doveadm-save doesn't have a manpage, and there is nothing about it on wiki. Is it something new? Also, it doesn't seem to work.
The fetch should print similar errors to stderr in some way. I attempted to reproduce this way with your msg-error.mbox, but it worked ok.
Thinking about it, it might be that I'm fetching the message just as dovecot delivers another one.
Is it possible that fcntl locking is just not working? I'm running a bit older kernel, if that could play a role in it. I'll try to enable dotlock even on read and see if the problem persists.
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
You could try running lsof in hopes of catching it. Might be rather difficult though.
Aki
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
-- S pozdravem Ladislav "Krakonoš" Láska http://www.krakonos.org/
participants (3)
-
Aki Tuomi
-
Ladislav Laska
-
Timo Sirainen