On 11/22/2012 3:26 PM, 1st WebDesigns wrote:
Output of dovecot -n is as follows:
# 1.0.7: /etc/dovecot.conf
login_dir: /var/run/dovecot/login
login_executable(default): /usr/libexec/dovecot/imap-login
login_executable(imap): /usr/libexec/dovecot/imap-login
login_executable(pop3): /usr/libexec/dovecot/pop3-login
mail_privileged_group: mail
mail_location: mbox:~/mail:INBOX=/var/mail/%u
mbox_lock_timeout: 600
mail_executable(default): /usr/libexec/dovecot/imap
mail_executable(imap): /usr/libexec/dovecot/imap
mail_executable(pop3): /usr/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/lib64/dovecot/imap
mail_plugin_dir(imap): /usr/lib64/dovecot/imap
mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
Are your mailboxes on NFS storage? You haven't stated on what storage your mailboxes reside. NFS complicates locking. If you use an NFS server, did anything on it change recently, such as an upgrade to RHEL5?
I found a thread stating that RHEL5 has a bad fcntl implementation, which could be related to your write lock delay problem. Try using only dotlock for both read and write locks and see if that helps. It adds some filesystem I/O overhead, but nothing like the many minutes of delay you have now.
mbox_read_locks = dotlock
mbox_write_locks = dotlock
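For readers unfamiliar with dotlocking, here is a minimal Python sketch of the general idea: the lock is a separate `<mailbox>.lock` file created atomically, and a waiter polls until it can create that file or a timeout expires. This is not Dovecot's implementation. The function names and the `timeout` and `poll` parameters are hypothetical, and real implementations also handle stale locks and NFS quirks that are omitted here.

```python
import os
import time


def acquire_dotlock(mbox_path, timeout=10.0, poll=0.1):
    """Create '<mbox_path>.lock' atomically; return its path or raise TimeoutError."""
    lock_path = mbox_path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process at a time can succeed in creating the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            # Someone else holds the lock: give up at the deadline, else retry.
            if time.monotonic() >= deadline:
                raise TimeoutError("timed out waiting for " + lock_path)
            time.sleep(poll)
            continue
        # Record our PID in the lock file (useful for debugging only).
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return lock_path


def release_dotlock(lock_path):
    """Release the lock by removing the lock file."""
    os.unlink(lock_path)
```

The extra I/O overhead mentioned above comes from creating and removing this lock file around every mailbox access, instead of asking the kernel for an fcntl lock on the mbox file itself.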
We upgraded from RedHat 4 to RedHat 5. The problem didn't exist with RH4 and an even older version of Dovecot.
That may be, but you're surely not planning on downgrading back to RHEL4.
When emails are stuck in the queue, doing this:
Dovecot doesn't use queues. It writes directly to the mailbox files.
lsof /var/spool/mail/<user>
These are mailbox files, your user inbox mbox files, not spool files. Spool implies temporary storage. Don't let "spool" fool you. On many/most systems /var/spool/mail is a link to /var/mail.
shows the spool file in use by a pop3 login and the Dovecot deliver process. Since changing mbox_lock_timeout from 300 to 600 the pop3 process eventually finishes before 600 seconds and the deliver process is able to complete. I admit this is masking the problem rather than solving it.
Does the larger timeout value completely eliminate the errors? If so this may be the best outcome you can get with Dovecot 1.0.7, mbox storage, on RHEL5, unless a different locking method fixes it.
As discussed before our version of Dovecot is dated now, however it's the version provided by RedHat and the version supported by our support company (who aren't doing a great job, hence me posting here).
It's the version provided by RHEL5. RHEL6.3 has Dovecot 2.0.9. There are 3rd party 1.2.x RPMs available for RHEL5.x as well as 2.x.x RPMs for RHEL5.x.
What "support company"? If you're using RHEL, Red Hat provides the support. That's the whole reason for "paying for" a Linux distro. What is preventing you from upgrading to RHEL 6.3, the current release? Which BTW is behind nearly all other distros WRT package versions. For instance Debian stable has Dovecot 2.1.7 available in the backports repo.
-- Stan
On 22/11/2012 12:09, Stan Hoeppner wrote:
On 11/12/2012 5:15 AM, 1st WebDesigns wrote:
Thanks for your replies. I switched to Dovecot LDA this morning, but the issue still persists, albeit logged slightly differently by Dovecot now instead of Postfix:
"save failed to INBOX: Timeout while waiting for lock"
The reason is because some pop3 clients
Full stop. This is the first time you've mentioned POP that I recall. FYI, Dovecot is primarily an IMAP server. Unless an OP states up front that he's using primarily POP, everyone assumes IMAP and counsels accordingly. You should have stated POP in your first post. Actually, you should have included many more details prior to now. Please post your complete 'dovecot -n' output.
are holding their connection for 5 or 6 minutes (don't ask me why - and the iPhone seems to be the major culprit).
I'm no smartphone POP expert, but old rural tower, poor tower connection, etc, all cause low data rates, which could cause this. However, you state this problem cropped up out of nowhere after a distro upgrade to CentOS 5. Can you confirm that the problem didn't exist before the upgrade? Your definitive answer to this question dictates the troubleshooting course of action.
In dovecot.conf I changed:
mbox_lock_timeout = 300
to
mbox_lock_timeout = 600
Which seems to have helped. I am unclear whether this value applies only to Dovecot LDA, or whether it would also have worked before I switched to Dovecot LDA?
This only changes how long Dovecot will wait to acquire a lock. Increasing the value lengthens the delays and masks the underlying problem without really fixing anything.
The only real architectural solution to such a POP/mbox locking problem due to slow/long client downloads is, as you mentioned, moving to a lockless mailbox format, such as maildir or sdbox.
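To illustrate why a maildir-style layout avoids this kind of contention, here is a minimal Python sketch of maildir delivery: each message becomes its own file, written under `tmp/` and then renamed into `new/`, so delivery never has to wait for a reader holding a lock on one shared mailbox file. The function name and the file naming scheme are hypothetical simplifications, and this is not Dovecot's deliver code.

```python
import itertools
import os
import socket
import time

# Per-process counter so two deliveries in the same instant get distinct names.
_counter = itertools.count()


def deliver_to_maildir(maildir, message_bytes):
    """Write a message into tmp/, then rename it into new/; return the final path."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Simplified unique name: timestamp, PID, counter and hostname.
    unique = "%d.P%dQ%d.%s" % (
        time.time_ns(), os.getpid(), next(_counter), socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # Write the complete message where readers do not look for mail.
    with open(tmp_path, "wb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())

    # rename() within one filesystem is atomic: a reader sees either no
    # file or the whole message, never a partial one.
    os.rename(tmp_path, new_path)
    return new_path
```

With this layout a slow POP3 download of one message does not block delivery of the next, because the two operations touch different files.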
Worth noting, we are both/all at fault in the slow progress of this issue, you for not stating POP up front, and me/us for not asking.
Your 'dovecot -n' output may allow us to help get mbox working a little better, but the long term solution is very likely moving to maildir/sdbox.