[Dovecot] save failed to INBOX
Hello,
I'm seeing som error messages in the mail log, like this:
Jun 12 12:50:44
It seems that when two or more messages are being delivered at the same time, only one delivery succeeds.
Both the maildirs and the indexes are stored on a SAN ..
I'll try to read the source, to find out whether this is a locking problem. I would think that if a deliver -process finds that a maildir is locked, then it should wait, and retry?
-- Erland Nylend
On Tue, 2007-06-12 at 13:08 +0200, Erland Nylend wrote:
Hello,
I'm seeing som error messages in the mail log, like this:
Jun 12 12:50:44
smtp2 deliver(user1@domain.tld): msgid=10784847.1181644475592.foo.bar@foo99: save failed to INBOX Jun 12 12:50:44 smtp2 deliver(user1@domain.tld): msgid=30757588.1181644556388.foo.bar@foo99: saved mail to INBOX It seems that when two or more messages are being delivered at the same time, only one delivery succeeds.
Both the maildirs and the indexes are stored on a SAN ..
I'll try to read the source, to find out whether this is a locking problem. I would think that if a deliver -process finds that a maildir is locked, then it should wait, and retry?
It should have logged an error message. http://wiki.dovecot.org/Logging
On 2007-06-12, 15:00, Timo Sirainen wrote:
It should have logged an error message. http://wiki.dovecot.org/Logging
It doesn't seem to, though. I cannot find any log messages related to this error in the logs, and I'm pretty sure that mail.info, mail.err and mail.crit ends up there.
-- Erland Nylend
On Tue, 2007-06-12 at 16:30 +0200, Erland Nylend wrote:
On 2007-06-12, 15:00, Timo Sirainen wrote:
It should have logged an error message. http://wiki.dovecot.org/Logging
It doesn't seem to, though. I cannot find any log messages related to this error in the logs, and I'm pretty sure that mail.info, mail.err and mail.crit ends up there.
Make sure with:
protocol lda { log_path = /var/log/dovecot-deliver.err }
And with mode 0666 to make sure it's able to write there.
"save failed to INBOX" doesn't give any useful information in figuring out what the problem is. If it really doesn't log anything else, the problem is going to be difficult to debug. Can you easily reproduce it?
On 2007-06-12, 18:13, Timo Sirainen wrote:
On Tue, 2007-06-12 at 16:30 +0200, Erland Nylend wrote:
On 2007-06-12, 15:00, Timo Sirainen wrote:
It should have logged an error message. http://wiki.dovecot.org/Logging
It doesn't seem to, though. I cannot find any log messages related to this error in the logs, and I'm pretty sure that mail.info, mail.err and mail.crit ends up there.
Make sure with:
protocol lda { log_path = /var/log/dovecot-deliver.err }
And with mode 0666 to make sure it's able to write there.
Hereby done. The log file is filling up, and I will look out for the "save failed to INBOX" -errors.
I'll keep you posted :)
"save failed to INBOX" doesn't give any useful information in figuring out what the problem is. If it really doesn't log anything else, the problem is going to be difficult to debug. Can you easily reproduce it?
I'll try to pipe two large messages to deliver at the same time, to see if that triggers any errors.
-- Erland Nylend
On 2007-06-12, 18:13, Timo Sirainen wrote:
On Tue, 2007-06-12 at 16:30 +0200, Erland Nylend wrote:
On 2007-06-12, 15:00, Timo Sirainen wrote:
It should have logged an error message. http://wiki.dovecot.org/Logging
It doesn't seem to, though. I cannot find any log messages related to this error in the logs, and I'm pretty sure that mail.info, mail.err and mail.crit ends up there.
Make sure with:
protocol lda { log_path = /var/log/dovecot-deliver.err }
And with mode 0666 to make sure it's able to write there.
It logs the same information:
deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJEEBOCCAA.foo.bar@domain.tld: save failed to INBOX deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJGEBOCCAA.foo.bar@domain.tld: saved mail to INBOX
This is taken from /var/log/dovecot-deliver.err, and all the log lines before and after are "saved mail to INBOX" -lines.
"save failed to INBOX" doesn't give any useful information in figuring out what the problem is. If it really doesn't log anything else, the problem is going to be difficult to debug. Can you easily reproduce it?
I haven't been able to yet .. but will keep trying :)
-- Erland Nylend
On Tue, 2007-06-12 at 18:54 +0200, Erland Nylend wrote:
protocol lda { log_path = /var/log/dovecot-deliver.err }
And with mode 0666 to make sure it's able to write there.
It logs the same information:
deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJEEBOCCAA.foo.bar@domain.tld: save failed to INBOX deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJGEBOCCAA.foo.bar@domain.tld: saved mail to INBOX
This is taken from /var/log/dovecot-deliver.err, and all the log lines before and after are "saved mail to INBOX" -lines.
Oh. I thought it would have written only errors there. I guess you should have also added then "info_log_path = /var/log/dovecot-deliver.info" or something. :)
But why is it then failing without an error message.. Post your dovecot -n output?
Hello,
dovecot -n:
| # /usr/local/dovecot-1.0.0/etc/dovecot.conf | protocols: imap pop3 | listen(default): *:143 | listen(imap): *:143 | listen(pop3): *:110 | ssl_disable: yes | disable_plaintext_auth: no | login_dir: /usr/local/dovecot-1.0.0/var/run/dovecot/login | login_executable(default): /usr/local/dovecot-1.0.0/libexec/dovecot/imap-login | login_executable(imap): /usr/local/dovecot-1.0.0/libexec/dovecot/imap-login | login_executable(pop3): /usr/local/dovecot-1.0.0/libexec/dovecot/pop3-login | mail_executable(default): /usr/local/dovecot-1.0.0/libexec/dovecot/imap | mail_executable(imap): /usr/local/dovecot-1.0.0/libexec/dovecot/imap | mail_executable(pop3): /usr/local/dovecot-1.0.0/libexec/dovecot/pop3 | mail_plugins(default): quota imap_quota | mail_plugins(imap): quota imap_quota | mail_plugins(pop3): quota | mail_plugin_dir(default): /usr/local/dovecot-1.0.0/lib/dovecot/imap | mail_plugin_dir(imap): /usr/local/dovecot-1.0.0/lib/dovecot/imap | mail_plugin_dir(pop3): /usr/local/dovecot-1.0.0/lib/dovecot/pop3 | imap_client_workarounds(default): outlook-idle delay-newmail | imap_client_workarounds(imap): outlook-idle delay-newmail | imap_client_workarounds(pop3): outlook-idle | pop3_uidl_format(default): | pop3_uidl_format(imap): | pop3_uidl_format(pop3): UID%u-%v | pop3_client_workarounds(default): | pop3_client_workarounds(imap): | pop3_client_workarounds(pop3): outlook-no-nuls oe-ns-eoh | namespace: | type: private | prefix: INBOX. | inbox: yes | auth default: | user: dovecot-auth | passdb: | driver: sql | args: /usr/local/dovecot/etc/dovecot-sql.conf | userdb: | driver: sql | args: /usr/local/dovecot/etc/dovecot-sql.conf | socket: | type: listen | client: | master: | path: /var/run/dovecot/auth-master | mode: 432 | user: root | group: atmail
-- Erland Nylend
On Tue, 2007-06-12 at 21:05 +0300, Timo Sirainen wrote:
On Tue, 2007-06-12 at 19:55 +0200, Erland Nylend wrote:
Hello,
dovecot -n:
| # /usr/local/dovecot-1.0.0/etc/dovecot.conf
Are you using Sieve? Unfortunately dovecot -n doesn't show protocol lda section..
And what parameters do you give to deliver?
On 2007-06-12, 21:06, Timo Sirainen wrote:
On Tue, 2007-06-12 at 21:05 +0300, Timo Sirainen wrote:
On Tue, 2007-06-12 at 19:55 +0200, Erland Nylend wrote:
Hello,
dovecot -n:
| # /usr/local/dovecot-1.0.0/etc/dovecot.conf
Are you using Sieve? Unfortunately dovecot -n doesn't show protocol lda section..
And what parameters do you give to deliver?
This is the master.cf entry:
# Dovecot LDA dovecot unix - n n - - pipe flags=DRhu user=vmail:vmail argv=/usr/local/dovecot/libexec/dovecot/deliver -d ${recipient}
I've set this in main.cf:
dovecot_destination_recipient_limit = 1
Postfix is set to soft_bounce mail ..
-- Erland Nylend
On 2007-06-12, 21:05, Timo Sirainen wrote:
On Tue, 2007-06-12 at 19:55 +0200, Erland Nylend wrote:
Hello,
dovecot -n:
| # /usr/local/dovecot-1.0.0/etc/dovecot.conf
Are you using Sieve? Unfortunately dovecot -n doesn't show protocol lda section..
No, I'm not using sieve.
This is the lda section of the conf-file:
protocol lda { postmaster_address = MAILER-DAEMON@domain.tld mail_plugins = quota auth_socket_path = /var/run/dovecot/auth-master sendmail_path = /usr/sbin/sendmail log_path = /var/log/dovecot-deliver.err }
-- Erland Nylend
On Tue, 2007-06-12 at 18:54 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJEEBOCCAA.foo.bar@domain.tld: save failed to INBOX deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJGEBOCCAA.foo.bar@domain.tld: saved mail to INBOX
As you see there, the message-IDs are the same. So Dovecot first tried to save a message to INBOX and then when it failed it tried to do that again which finally succeeded. I can't figure out why it would ever try to save it twice to INBOX though.
Anyway, if the failure is non-critical such as "quota exceeded", Dovecot doesn't currently log anything. If the save wouldn't have succeeded, it would have given the error message in rejection mail. This patch makes it log the error always: http://hg.dovecot.org/dovecot-1.0/rev/abc4be10176e
On 2007-06-12, 21:27, Timo Sirainen wrote:
On Tue, 2007-06-12 at 18:54 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJEEBOCCAA.foo.bar@domain.tld: save failed to INBOX deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJGEBOCCAA.foo.bar@domain.tld: saved mail to INBOX
As you see there, the message-IDs are the same. So Dovecot first tried to save a message to INBOX and then when it failed it tried to do that again which finally succeeded. I can't figure out why it would ever try to save it twice to INBOX though.
For these two messages, the message-ids were the same, but in most cases they are different..
Anyway, if the failure is non-critical such as "quota exceeded", Dovecot doesn't currently log anything.
I've checked some of the accounts, and they are not over quota. It seems that some deliveries fail, and then they receive mail again afterwards ..
If the save wouldn't have succeeded, it would have given the error message in rejection mail. This patch makes it log the error always: http://hg.dovecot.org/dovecot-1.0/rev/abc4be10176e
I'll try it first thing tomorrow, and give you a feedback.
-- Erland Nylend
On 2007-06-12, 21:27, Timo Sirainen wrote:
On Tue, 2007-06-12 at 18:54 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJEEBOCCAA.foo.bar@domain.tld: save failed to INBOX deliver(firstname.lastname@domain.tld): Jun 12 18:30:57 Info: msgid=ECEMJFJEJPMECJKPPFBJGEBOCCAA.foo.bar@domain.tld: saved mail to INBOX
As you see there, the message-IDs are the same. So Dovecot first tried to save a message to INBOX and then when it failed it tried to do that again which finally succeeded. I can't figure out why it would ever try to save it twice to INBOX though.
Anyway, if the failure is non-critical such as "quota exceeded", Dovecot doesn't currently log anything. If the save wouldn't have succeeded, it would have given the error message in rejection mail. This patch makes it log the error always: http://hg.dovecot.org/dovecot-1.0/rev/abc4be10176e
I added the patch, and recompiled with these options:
./configure --prefix=/usr/local/dovecot-1.0.0.patched --with-mysql --with-passwd=no --with-sqlite=no --with-pam=no --with-shadow=no --with-gssapi
This is a log line from /var/log/dovecot-deliver.err:
deliver(firstname.lastname@domain.tld): Jun 13 10:36:29 Info: msgid=NM-0-3-40561452-1181723787@mailgen-0-1: save failed to INBOX: Unknown error
Not sure whether this is the same problem, though. The log line only appears once. I'll send you an update if I get more error messages.
-- Erland Nylend
On Wed, 2007-06-13 at 11:00 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 13 10:36:29 Info: msgid=NM-0-3-40561452-1181723787@mailgen-0-1: save failed to INBOX: Unknown error
This is what I was afraid of first. "Unknown error" means that some function returned -1, but didn't set an error message.
See if this changes the message: http://hg.dovecot.org/dovecot-1.0/rev/7959ec38de98
If not, I'd have to create some debug patches to find out which function is doing this.
On 2007-06-13, 14:29, Timo Sirainen wrote:
On Wed, 2007-06-13 at 11:00 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 13 10:36:29 Info: msgid=NM-0-3-40561452-1181723787@mailgen-0-1: save failed to INBOX: Unknown error
This is what I was afraid of first. "Unknown error" means that some function returned -1, but didn't set an error message.
See if this changes the message: http://hg.dovecot.org/dovecot-1.0/rev/7959ec38de98
If not, I'd have to create some debug patches to find out which function is doing this.
I've added the patch. Will send update as soon as errors are logged.
-- Erland Nylend
On 2007-06-13, 14:29, Timo Sirainen wrote:
On Wed, 2007-06-13 at 11:00 +0200, Erland Nylend wrote:
deliver(firstname.lastname@domain.tld): Jun 13 10:36:29 Info: msgid=NM-0-3-40561452-1181723787@mailgen-0-1: save failed to INBOX: Unknown error
This is what I was afraid of first. "Unknown error" means that some function returned -1, but didn't set an error message.
See if this changes the message: http://hg.dovecot.org/dovecot-1.0/rev/7959ec38de98
If not, I'd have to create some debug patches to find out which function is doing this.
hey .. just before I added the last patch, these messages appeared in the log:
deliver(user@domain.tld): Jun 13 14:13:21 Info: msgid=6AGuCWvbL0008158e@mail.server: save failed to INBOX: Timeout while waiting for lock deliver(user@domain.tld): Jun 13 14:13:22 Info: msgid=Pqyfj7TRw00081594@mail.server: save failed to INBOX: Timeout while waiting for lock deliver(user@domain.tld): Jun 13 14:13:22 Info: msgid=sIYbI7M5o00081593@mail.server: save failed to INBOX: Timeout while waiting for lock deliver(user@domain.tld): Jun 13 14:13:22 Info: msgid=W0ccKEhtX00081596@mail.server: save failed to INBOX: Timeout while waiting for lock deliver(user@domain.tld): Jun 13 14:13:23 Info: msgid=lZhe1T8SF00081597@mail.server: save failed to INBOX: Timeout while waiting for lock
Messages to this particular account has produced error messages earlier.
-- Erland Nylend
On 2007-06-13, 14:29, Timo Sirainen wrote:
If not, I'd have to create some debug patches to find out which function is doing this.
I would appreciate that very much. I still see quite a few of these "unknown error" -lines in the logs ..
Another thing is that maybe deliver should return a "tempfail" exit code to postfix in these cases, instead of sending bounce messages.
Since postfix (in our case) soft_bounces mail, that would keep them in queue.
-- Erland Nylend
On Thu, 2007-06-14 at 15:33 +0200, Erland Nylend wrote:
On 2007-06-13, 14:29, Timo Sirainen wrote:
If not, I'd have to create some debug patches to find out which function is doing this.
I would appreciate that very much. I still see quite a few of these "unknown error" -lines in the logs ..
Well, attached is the first one. When you tell me which one of those functions failed I'll send another patch..
Another thing is that maybe deliver should return a "tempfail" exit code to postfix in these cases, instead of sending bounce messages.
On 2007-06-14, 21:53, Timo Sirainen wrote:
Well, attached is the first one. When you tell me which one of those functions failed I'll send another patch..
Will do. Thank you :)
Another thing is that maybe deliver should return a "tempfail" exit code to postfix in these cases, instead of sending bounce messages.
Great.
-- Erland Nylend
On 2007-06-14, 21:53, Timo Sirainen wrote:
On Thu, 2007-06-14 at 15:33 +0200, Erland Nylend wrote:
On 2007-06-13, 14:29, Timo Sirainen wrote:
If not, I'd have to create some debug patches to find out which function is doing this.
I would appreciate that very much. I still see quite a few of these "unknown error" -lines in the logs ..
Well, attached is the first one. When you tell me which one of those functions failed I'll send another patch..
It's mailbox_copy.
Jun 15 05:51:31 smtp1 deliver(first.last@domain.tld): mailbox_copy() failed Jun 15 05:51:31 smtp1 deliver(first.last@domain.tld): msgid=122554978.1181879490893.JavaMail.weblogic@mailserver: save failed to INBOX: Unknown error Jun 15 07:06:21 smtp1 deliver(user@domain2.tld): mailbox_copy() failed Jun 15 07:06:21 smtp1 deliver(user@domain2.tld): msgid=20070615050617.162DB3800C7@mailserver2: save failed to INBOX: Unknown error
I'm also getting some "timeout while waiting for lock", and with the debug patch I get "mailbox_transaction_commit() failed" on those.
-- Erland Nylend
On Fri, 2007-06-15 at 10:01 +0200, Erland Nylend wrote:
On 2007-06-14, 21:53, Timo Sirainen wrote:
On Thu, 2007-06-14 at 15:33 +0200, Erland Nylend wrote:
On 2007-06-13, 14:29, Timo Sirainen wrote:
If not, I'd have to create some debug patches to find out which function is doing this.
I would appreciate that very much. I still see quite a few of these "unknown error" -lines in the logs ..
Well, attached is the first one. When you tell me which one of those functions failed I'll send another patch..
It's mailbox_copy.
Jun 15 05:51:31 smtp1 deliver(first.last@domain.tld): mailbox_copy() failed
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
I'm also getting some "timeout while waiting for lock", and with the debug patch I get "mailbox_transaction_commit() failed" on those.
Those lock timeouts shouldn't happen either.. Are they separate errors or do they happen for those that get "save failed"?
BTW. What filesystem do you use?
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Thanks. Will add it, and keep you updated.
I'm also getting some "timeout while waiting for lock", and with the debug patch I get "mailbox_transaction_commit() failed" on those.
Those lock timeouts shouldn't happen either.. Are they separate errors or do they happen for those that get "save failed"?
I think this is a different issue. The logs (with the debug patch) looks like this:
Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock
Dont't know if it matters, but this account has 22000+ messages .. The other one I saw with the same error message had 15000+.
BTW. What filesystem do you use?
We're using GPFS.
-- Erland Nylend
On Sat, 2007-06-16 at 22:20 +0200, Erland Nylend wrote:
Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock Jun 15 09:24:13 smtp1 deliver(user@domain.tld): mailbox_transaction_commit() failed Jun 15 09:24:13 smtp1 deliver(user@domain.tld): msgid=920880857.83287269771168@mailserver: save failed to INBOX: Timeout while waiting for lock
Dont't know if it matters, but this account has 22000+ messages .. The other one I saw with the same error message had 15000+.
BTW. What filesystem do you use?
We're using GPFS.
But it doesn't happen always I guess? Maybe this has something to do with GPFS?
That error happens if maildir-uidlist.lock file was found to exist for two minutes + 2 seconds, and within those 2 minutes it changed at least once.
The uidlist is locked while maildir is being synchronized. With local filesystems syncing a 22k maildir takes less than a second. I don't know with GPFS. You could try this with for example expunging a message and seeing how long it takes to return "OK".
Or maybe GPFS caches uidlist.lock file's existence similar to NFS attribute cache?
On Sun, 2007-06-17 at 13:19 +0300, Timo Sirainen wrote:
The uidlist is locked while maildir is being synchronized. With local filesystems syncing a 22k maildir takes less than a second. I don't know with GPFS. You could try this with for example expunging a message and seeing how long it takes to return "OK".
Actually that might not be enough to trigger a full sync. Rather change cur/ directory's mtime and then do "NOOP" command.
On 2007-06-17, 13:21, Timo Sirainen wrote:
On Sun, 2007-06-17 at 13:19 +0300, Timo Sirainen wrote:
The uidlist is locked while maildir is being synchronized. With local filesystems syncing a 22k maildir takes less than a second. I don't know with GPFS. You could try this with for example expunging a message and seeing how long it takes to return "OK".
Actually that might not be enough to trigger a full sync. Rather change cur/ directory's mtime and then do "NOOP" command.
I've tried doing:
$ telnet my.mail.server 143 Trying 81.167.36.148... Connected to my.mail.server Escape character is '^]'.
- OK Dovecot ready.
LOGIN user@domain password
# OK Logged in. # SELECT INBOX
- FLAGS (\Answered \Flagged \Deleted \Seen \Draft Junk NonJunk)
- OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft Junk NonJunk \*)] Flags permitted.
- 1 EXISTS
- 1 RECENT
- OK [UIDVALIDITY 1155059954] UIDs valid
- OK [UIDNEXT 28796] Predicted next UID
OK [READ-WRITE] Select completed.
# NOOP
- 22809 EXISTS
OK NOOP completed.
The NOOP typically takes up to 2 seconds if I "touch cur/foo" in the users maildir before sending the command.
I noticed that it printed some output once, when I sent a NOOP:
[...]
- 2578 FETCH (FLAGS (NonJunk))
- 2579 FETCH (FLAGS (NonJunk))
- 2580 FETCH (FLAGS (NonJunk))
- 2581 FETCH (FLAGS (NonJunk))
- 2582 FETCH (FLAGS (NonJunk))
- 2583 FETCH (FLAGS (NonJunk))
- 2584 FETCH (FLAGS (NonJunk))
- 2585 FETCH (FLAGS (NonJunk))
- 2586 FETCH (FLAGS (NonJunk)) [...]
It normally prints only:
- 22809 EXISTS
OK NOOP completed.
-- Erland Nylend
On 2007-06-17, 13:19, Timo Sirainen wrote:
But it doesn't happen always I guess?
No. The accounts receive most of the deliveries just fine. Then inbetween the successful deliveries, some fail with that error message.
Maybe this has something to do with GPFS?
Could be. I'm not sure.
That error happens if maildir-uidlist.lock file was found to exist for two minutes + 2 seconds, and within those 2 minutes it changed at least once.
The uidlist is locked while maildir is being synchronized. With local filesystems syncing a 22k maildir takes less than a second. I don't know with GPFS. You could try this with for example expunging a message and seeing how long it takes to return "OK".
Using thunderbird, expurging a message seems to work just fine, and it takes less than a second.
Or maybe GPFS caches uidlist.lock file's existence similar to NFS attribute cache?
Could be something like that.
-- Erland Nylend
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Here's the first one:
Jun 17 00:34:50 smtp1 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 17 00:34:50 smtp1 deliver(user@domain.tld): mailbox_copy() failed Jun 17 00:34:50 smtp1 deliver(user@domain.tld): msgid=20070616223442.A36DE38002E@mailserver.tld: save failed to INBOX: Unknown error
(I kept both debug patches)
-- Erland Nylend
On Sun, 2007-06-17 at 01:23 +0200, Erland Nylend wrote:
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Here's the first one:
Jun 17 00:34:50 smtp1 deliver(user@domain.tld): copy: mailbox_save_finish() failed
I still can't figure out why that would fail there.. Well, one more patch that should tell where the error is.
On 2007-06-17, 12:59, Timo Sirainen wrote:
On Sun, 2007-06-17 at 01:23 +0200, Erland Nylend wrote:
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Here's the first one:
Jun 17 00:34:50 smtp1 deliver(user@domain.tld): copy: mailbox_save_finish() failed
I still can't figure out why that would fail there.. Well, one more patch that should tell where the error is.
Hereby added :)
-- Erland Nylend
On 2007-06-17, 17:52, Erland Nylend wrote:
On 2007-06-17, 12:59, Timo Sirainen wrote:
On Sun, 2007-06-17 at 01:23 +0200, Erland Nylend wrote:
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Here's the first one:
Jun 17 00:34:50 smtp1 deliver(user@domain.tld): copy: mailbox_save_finish() failed
I still can't figure out why that would fail there.. Well, one more patch that should tell where the error is.
Hereby added :)
Here are the resulting logs:
Jun 17 20:06:35 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): mailbox_copy() failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): msgid=200706171815.06f45f992013@foo.bar.tld: save failed to INBOX: Unknown error
-- Erland Nylend
On Mon, 2007-06-18 at 02:55 +0200, Erland Nylend wrote:
On 2007-06-17, 17:52, Erland Nylend wrote:
On 2007-06-17, 12:59, Timo Sirainen wrote:
On Sun, 2007-06-17 at 01:23 +0200, Erland Nylend wrote:
On 2007-06-16, 01:36, Timo Sirainen wrote:
OK, attached is the next one. I actually found one missing error handler from there. Doesn't matter if you keep the previous patch or not.
Here's the first one:
Jun 17 00:34:50 smtp1 deliver(user@domain.tld): copy: mailbox_save_finish() failed
I still can't figure out why that would fail there.. Well, one more patch that should tell where the error is.
Hereby added :)
Here are the resulting logs:
Jun 17 20:06:35 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): mailbox_copy() failed Jun 17 20:06:35 smtp2 deliver(user@domain.tld): msgid=200706171815.06f45f992013@foo.bar.tld: save failed to INBOX: Unknown error
So I guess nothing before this? There should have been, or otherwise I can't see why it would happen, unless you're using some plugin I don't know about, or you're not using maildir.
Your dovecot -n output didn't contain mail_location. Are you returning it from userdb either? You could try if the attached patch changes anything, it's logging only mbox failures..
On 2007-06-18, 18:47, Timo Sirainen wrote:
So I guess nothing before this?
There is one more log line, which I missed:
Jun 18 19:47:54 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): mailbox_copy() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): msgid=20070618174753.A82793500DB@mail.server1: save failed to INBOX: Unknown error Jun 18 19:47:54 smtp2 deliver(user@domain.tld): msgid=dovecot-1182188873-946994-0@smtp2: saved mail to INBOX
Another example:
Jun 18 15:09:59 smtp2 deliver(user2@domain2.tld): copy: mailbox_save_finish() failed Jun 18 15:09:59 smtp2 deliver(user2@domain2.tld): quota_copy(): super.copy failed Jun 18 15:09:59 smtp2 deliver(user2@domain2.tld): mailbox_copy() failed Jun 18 15:09:59 smtp2 deliver(user2@domain2.tld): msgid=006101c7b116$cd6c9270$83044f0a@foo.bar: save failed to INBOX: Unknown error Jun 18 15:09:59 smtp2 deliver(user2@domain2.tld): msgid=005f01c7b116$9a05d9a0$83044f0a@foo.bar: saved mail to INBOX
Sorry about that. I used "grep" when sending you logs .. and grepped out "saved mail to INBOX".
Then we're back to the original problem. (I thought this was a problem when two or more messages were sent at the same time)
There should have been, or otherwise I can't see why it would happen, unless you're using some plugin I don't know about, or you're not using maildir.
We're using maildir on all accounts, and mysql.
Your dovecot -n output didn't contain mail_location.
The user_query looks like this:
user_query = SELECT 3000 as uid, 3000 as gid, concat(concat('maildir:', MailDir), ':INDEX=/path/to/indexes/%d/%u') as mail, concat('maildir:storage=', UserQuota) as quota FROM Users WHERE Account = '%u'
Are you returning it from userdb either? You could try if the attached patch changes anything, it's logging only mbox failures..
As you can see above, I've statically set the mailbox type to maildir .. (I did that to make it create the maildir's automatically)
I've checked the mysql database for some users, and the maildir attribute, and the other information there, seems to be correct.
Is it possible that the mysql query fails, so that dovecot doesn't know maildir, quota, or the other information it gets from user_query?
-- Erland Nylend
On Mon, 2007-06-18 at 20:50 +0200, Erland Nylend wrote:
On 2007-06-18, 18:47, Timo Sirainen wrote:
So I guess nothing before this?
There is one more log line, which I missed:
Jun 18 19:47:54 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed
Oh, one of my earlier patches was supposed to catch quota_save_finish(), not quota_copy(). Attached. Also added some more checks.. It has to be related to quota code.
On 2007-06-19, 03:50, Timo Sirainen wrote:
On Mon, 2007-06-18 at 20:50 +0200, Erland Nylend wrote:
On 2007-06-18, 18:47, Timo Sirainen wrote:
So I guess nothing before this?
There is one more log line, which I missed:
Jun 18 19:47:54 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed
Oh, one of my earlier patches was supposed to catch quota_save_finish(), not quota_copy(). Attached. Also added some more checks.. It has to be related to quota code.
Great. I've added the patch :)
-- Erland Nylend
On 2007-06-19, 03:50, Timo Sirainen wrote:
On Mon, 2007-06-18 at 20:50 +0200, Erland Nylend wrote:
On 2007-06-18, 18:47, Timo Sirainen wrote:
So I guess nothing before this?
There is one more log line, which I missed:
Jun 18 19:47:54 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed
Oh, one of my earlier patches was supposed to catch quota_save_finish(), not quota_copy(). Attached. Also added some more checks.. It has to be related to quota code.
We're getting closer. Now I'm getting this:
Jun 19 11:30:25 smtp2 deliver(user@domain.tld): quota_check(): failed: Unknown error Jun 19 11:30:25 smtp2 deliver(user@domain.tld): quota.save_finish() quota_check() failed Jun 19 11:30:25 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 19 11:30:25 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed Jun 19 11:30:25 smtp2 deliver(user@domain.tld): mailbox_copy() failed Jun 19 11:30:25 smtp2 deliver(user@domain.tld): msgid=004801c7b253$9618dcf0$0514a8c0@foo: save failed to INBOX: Unknown error Jun 19 11:30:25 smtp2 deliver(user@domain.tld): msgid=004901c7b253$9709a590$0514a8c0@foo: saved mail to INBOX
-- Erland Nylend
On Tue, 2007-06-19 at 12:08 +0200, Erland Nylend wrote:
On 2007-06-19, 03:50, Timo Sirainen wrote:
On Mon, 2007-06-18 at 20:50 +0200, Erland Nylend wrote:
On 2007-06-18, 18:47, Timo Sirainen wrote:
So I guess nothing before this?
There is one more log line, which I missed:
Jun 18 19:47:54 smtp2 deliver(user@domain.tld): copy: mailbox_save_finish() failed Jun 18 19:47:54 smtp2 deliver(user@domain.tld): quota_copy(): super.copy failed
Oh, one of my earlier patches was supposed to catch quota_save_finish(), not quota_copy(). Attached. Also added some more checks.. It has to be related to quota code.
We're getting closer. Now I'm getting this:
Jun 19 11:30:25 smtp2 deliver(user@domain.tld): quota_check(): failed: Unknown error
Oh. I never thought the Unknown error came from quota plugin. I changed that now to "Unknown quota error": http://hg.dovecot.org/dovecot-1.0/rev/02fb6b136e19
Also I think this should fix the problem: http://hg.dovecot.org/dovecot-1.0/rev/b919a32cd077
On 2007-06-19, 14:27, Timo Sirainen wrote:
Jun 19 11:30:25 smtp2 deliver(user@domain.tld): quota_check(): failed: Unknown error
Oh. I never thought the Unknown error came from quota plugin. I changed that now to "Unknown quota error": http://hg.dovecot.org/dovecot-1.0/rev/02fb6b136e19
Also I think this should fix the problem: http://hg.dovecot.org/dovecot-1.0/rev/b919a32cd077
Both patches added :) Will send you updates as soon as logs come in.
-- Erland Nylend
On 2007-06-19, 14:27, Timo Sirainen wrote:
Also I think this should fix the problem: http://hg.dovecot.org/dovecot-1.0/rev/b919a32cd077
The errors are gone .. thank you :)
-- Erland Nylend
- Erland Nylend erland.nylend@lyse.no:
Also I think this should fix the problem: http://hg.dovecot.org/dovecot-1.0/rev/b919a32cd077
The errors are gone .. thank you :)
I don't see any more errors either.
-- Ralf Hildebrandt (Ralf.Hildebrandt@charite.de) plonk@charite.de Postfix - Einrichtung, Betrieb und Wartung Tel. +49 (0)30-450 570-155 http://www.arschkrebs.de "General Failure's Fault. Not Yours." -Anon.
On 2007-06-12, 21:27, Timo Sirainen wrote:
Anyway, if the failure is non-critical such as "quota exceeded", Dovecot doesn't currently log anything. If the save wouldn't have succeeded, it would have given the error message in rejection mail. This patch makes it log the error always: http://hg.dovecot.org/dovecot-1.0/rev/abc4be10176e
Some more errors, after I put in the patch:
deliver(user1@domain1.tld): Jun 13 12:37:44 Info: msgid=463352340.80279575829514@domain2.tld: save failed to INBOX: Unknown error deliver(user1@domain1.tld): Jun 13 12:37:44 Info: msgid=463352340.80279575829514@domain2.tld: save failed to INBOX: Unknown error deliver(user2@domain3.tld): Jun 13 13:56:27 Info: msgid=28825971.1181735779007.JavaMail.nobody@domain4.tld: save failed to INBOX: Unknown error deliver(user2@domain3.tld): Jun 13 13:56:27 Info: msgid=17567431.1181735777893.JavaMail.nobody@domain4.tld: save failed to INBOX: Unknown error
There aren't any other logs except "saved mail to INBOX" -messages.
-- Erland Nylend
participants (3)
-
Erland Nylend
-
Ralf Hildebrandt
-
Timo Sirainen