[Dovecot] cannot update mailbox - unable to lock for exclusive access
Hi there,
We're using Dovecot version 1.0.7 and Postfix, and since upgrading our Linux box we're getting this in the maillog:
Nov 8 07:49:11 server1 postfix/local[27901]: 04B8E7081DA: to=xyz@xyz.com, orig_to=abc@abc.com, relay=local, delay=19, delays=0.07/0/0/19, dsn=4.2.0, status=deferred (cannot update mailbox /var/spool/mail/abc for user abc. unable to lock for exclusive access: Resource temporarily unavailable)
Postfix is currently set to: mailbox_delivery_lock = fcntl, dotlock
Dovecot has this:
mbox_read_locks = fcntl
mbox_write_locks = fcntl dotlock
I've scoured the web and tried all kinds of different locking mechanisms and combinations to no avail. The mail is eventually delivered but on a busy day this can take several hours.
In the evening it generally doesn't happen, which leads me to believe it occurs during the day when people have their mail clients open. However, this particular issue doesn't affect every user that has their mail client open, only some!
We currently have the mailboxes in mbox format - yes I know we should upgrade to maildir which we will eventually, however we've been using mbox for the last 10 years and this has only become an issue since upgrading.
What I have discovered this morning is a Dovecot connection that is open for 6 minutes before disconnection. During this 6 minutes the above problem occurs (new mail cannot get an exclusive lock). The same user will have connected and disconnected several times before and after, with a connection taking no more than a few seconds. But it seems sometimes the connection is taking longer than it should and I suspect the problem lies here.
Please can anyone help!
Thank you,
G
On 11/8/2012 2:29 AM, 1st WebDesigns wrote:
Hi there,
We're using Dovecot version 1.0.7 and Postfix, and since upgrading our Linux box we're getting this in the maillog:
1.0.7 is absolutely ancient and no longer officially supported. You need 1.2.x minimum, 2.x.x even better. And you say you just recently upgraded your Linux distro? What planet do you live on son? You're a few light years behind current stable software.
Nov 8 07:49:11 server1 postfix/local[27901]: 04B8E7081DA: to=xyz@xyz.com, orig_to=abc@abc.com, relay=local, delay=19, delays=0.07/0/0/19, dsn=4.2.0, status=deferred (cannot update mailbox /var/spool/mail/abc for user abc. unable to lock for exclusive access: Resource temporarily unavailable)
The simple permanent fix to Postfix/Dovecot mbox locking issues is switching from Postfix LOCAL to Dovecot LDA for mailbox delivery. 1.0.7 is before my time. I do not know if LDA was available then. Upgrade and you'll have it, and you'll also fix other problems you're not even aware of yet.
Postfix is currently set to: mailbox_delivery_lock = fcntl, dotlock
Dovecot has this:
mbox_read_locks = fcntl
mbox_write_locks = fcntl dotlock
LDA completely eliminates lock contention.
http://wiki.dovecot.org/LDA/Postfix
http://wiki2.dovecot.org/LDA/Postfix
-- Stan
At 3AM -0600 on 8/11/12 you (Stan Hoeppner) wrote:
1.0.7 is absolutely ancient and no longer officially supported. You need 1.2.x minimum, 2.x.x even better. And you say you just recently upgraded your Linux distro? What planet do you live on son? You're a few light years behind current stable software.
[A light-year is a measure of distance, not of time.]
LDA completely eliminates lock contention.
As we have discussed before, using the LDA does not prevent lock contention, it just prevents the problems that arise when different software is using different locking strategies on the same mailbox (assuming nothing except LDA and imap is touching the mailbox directly).
There are valid reasons for not using the LDA: the OP might be already using procmail, for instance, and have users with procmail recipes which sort into IMAP folders. These folders will need to be locked by procmail even if the default delivery to INBOX is changed (globally) to happen through dovecot-lda. While migrating to sieve (and mdbox, and LMTP) would, IMHO, be the best long-term solution, this isn't necessarily something that can be set up overnight.
Ben
On 11/8/2012 5:53 PM, Ben Morrow wrote:
At 3AM -0600 on 8/11/12 you (Stan Hoeppner) wrote:
1.0.7 is absolutely ancient and no longer officially supported. You need 1.2.x minimum, 2.x.x even better. And you say you just recently upgraded your Linux distro? What planet do you live on son? You're a few light years behind current stable software.
[A light-year is a measure of distance, not of time.]
"metric fuckload" isn't a real measurement, but that doesn't stop people from [mis]using the term to get a point across. Don't arrogantly assume that intentional misuse of a term equals mouth breathing or knuckle dragging.
LDA completely eliminates lock contention.
As we have discussed before, using the LDA does not prevent lock contention, it just prevents the problems that arise when different software is using different locking strategies on the same mailbox (assuming nothing except LDA and imap is touching the mailbox directly).
You seem to have contradicted yourself. You described lock contention to a T, and stated Dovecot does prevent that "problem", but also said Dovecot doesn't prevent lock contention. File locking != lock contention. You can have the former without the latter.
There are valid reasons for not using the LDA: the OP might be already using procmail, for instance, and have users with procmail recipes which sort into IMAP folders. These folders will need to be locked by procmail even if the default delivery to INBOX is changed (globally) to happen through dovecot-lda. While migrating to sieve (and mdbox, and LMTP) would, IMHO, be the best long-term solution, this isn't necessarily something that can be set up overnight.
And? I'm failing to understand your point here. The OP hasn't stated yet, that I recall, if he's accessing the mbox files with anything other than Dovecot and Postfix. If he does state this, we'll make further recommendations as to how to get across the LDA bridge with the same functionality, or if it's workable. None of that precludes making the LDA recommendation. Most people already running procmail or local UNIX MUAs are savvy enough to discover LDA before hitting this list. So you can assume with some surety that the OP who doesn't know about LDA likely isn't using procmail, mutt, pine, etc. Sure there are exceptions, but this is normally the case.
I think the problem here, given the tone of your prose above and correcting me on the use of "light year" of all damn things, is that my earlier praise directed at you due to your slightly greater knowledge of the intricacies of file locking, has given you the impression that I'm some kind of knuckle dragging noob in need of education by you. If that is the case please read my last 500 posts to this list to dispel that misconception.
You are my peer, not my superior. Keep that in mind in your future correspondence.
-- Stan
At 12PM -0600 on 10/11/12 you (Stan Hoeppner) wrote:
On 11/8/2012 5:53 PM, Ben Morrow wrote:
At 3AM -0600 on 8/11/12 you (Stan Hoeppner) wrote:
LDA completely eliminates lock contention.
As we have discussed before, using the LDA does not prevent lock contention, it just prevents the problems that arise when different software is using different locking strategies on the same mailbox (assuming nothing except LDA and imap is touching the mailbox directly).
You seem to have contradicted yourself. You described lock contention to a T, and stated Dovecot does prevent that "problem", but also said Dovecot doesn't prevent lock contention. File locking != lock contention. You can have the former without the latter.
The usual meaning of 'lock contention' is 'two processes legitimately competing for the *same* lock'. For instance, a search for 'lock contention' on Wikipedia leads to
lock contention: This occurs whenever one process or thread attempts
to acquire a lock held by another process or thread.
This will still occur when using the LDA: that is, there will still be occasions where the LDA and the imap process are competing for the mbox lock, and one ends up locking the other out temporarily.
The problems with locking that arise when accessing the same mailbox using both Dovecot and non-Dovecot software come from different processes using *different* locks from each other, or acquiring them in a different order. This is not ordinary lock contention: in fact, in the worst case, the two processes end up not having any locks in common, so you get no lock contention at all but data corruption instead.
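To make the first case concrete, here is a minimal Python sketch of ordinary lock contention: two processes competing for the *same* fcntl lock on the same file. It is only an illustration, not how Postfix or Dovecot are implemented. The temporary file stands in for an mbox, the child process stands in for a long-running pop3/imap session, and the parent stands in for the delivery agent; the function names are made up for this example.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Attempt a non-blocking exclusive fcntl lock.

    Returns None on success, or the errno value if another process holds it.
    """
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return None
    except OSError as exc:
        return exc.errno


def demo_contention():
    """Return (errno seen while the child holds the lock, errno after it exits)."""
    with tempfile.NamedTemporaryFile() as mbox:  # stand-in for an mbox file
        ready_r, ready_w = os.pipe()  # child -> parent: "lock acquired"
        done_r, done_w = os.pipe()    # parent -> child: "you may exit"
        pid = os.fork()
        if pid == 0:
            # Child: plays the session that holds the mailbox lock.
            try:
                os.close(ready_r)
                os.close(done_w)
                fd = os.open(mbox.name, os.O_RDWR)
                fcntl.lockf(fd, fcntl.LOCK_EX)
                os.write(ready_w, b"1")
                os.read(done_r, 1)  # keep the lock until told to stop
            finally:
                os._exit(0)
        # Parent: plays the delivery agent that wants the same lock.
        os.close(ready_w)
        os.close(done_r)
        os.read(ready_r, 1)
        fd = os.open(mbox.name, os.O_RDWR)
        while_held = try_exclusive_lock(fd)
        os.write(done_w, b"1")
        os.waitpid(pid, 0)  # the child's fcntl lock is released when it exits
        after_exit = try_exclusive_lock(fd)
        os.close(fd)
        os.close(ready_r)
        os.close(done_w)
        return while_held, after_exit


if __name__ == "__main__":
    held, after = demo_contention()
    print("while held:", os.strerror(held) if held else "lock acquired")
    print("after exit:", os.strerror(after) if after else "lock acquired")
```

While the child holds the lock, the parent's attempt fails with EAGAIN (or EACCES on some systems); on Linux the text for EAGAIN is the same "Resource temporarily unavailable" seen in the OP's log. Because both sides use the same kind of lock here, nothing is corrupted: one side simply has to wait.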
There are valid reasons for not using the LDA: the OP might be already using procmail, for instance, and have users with procmail recipes which sort into IMAP folders. These folders will need to be locked by procmail even if the default delivery to INBOX is changed (globally) to happen through dovecot-lda. While migrating to sieve (and mdbox, and LMTP) would, IMHO, be the best long-term solution, this isn't necessarily something that can be set up overnight.
And? I'm failing to understand your point here.
You appear to be advocating the LDA as the solution to all possible problems with mail delivery, and implying anyone not using it is doing something seriously wrong. I was pointing out that that is not always the case.
The OP hasn't stated yet, that I recall, if he's accessing the mbox files with anything other than Dovecot and Postfix. If he does state this, we'll make further recommendations as to how to get across the LDA bridge with the same functionality, or if it's workable. None of that precludes making the LDA recommendation. Most people already running procmail or local UNIX MUAs are savvy enough to discover LDA before hitting this list. So you can assume with some surety that the OP who doesn't know about LDA likely isn't using procmail, mutt, pine, etc. Sure there are exceptions, but this is normally the case.
I believe the OP mentioned something about having run out of mboxes for 20 years? To me that suggests an old-fashioned Unix setup, which in turn suggests procmail as a likely possibility. I could, of course, be wrong.
I think the problem here, given the tone of your prose above and correcting me on the use of "light year" of all damn things, is that my earlier praise directed at you due to your slightly greater knowledge of the intricacies of file locking, has given you the impression that I'm some kind of knuckle dragging noob in need of education by you. If that is the case please read my last 500 posts to this list to dispel that misconception.
You are my peer, not my superior. Keep that in mind in your future correspondence.
If I have offended you, I apologise. That was certainly not my intention.
Ben
On 11/10/2012 2:25 PM, Ben Morrow wrote:
The usual meaning of 'lock contention' is 'two processes legitimately competing for the *same* lock'.
Sure, this is the textbook definition, and software designers will discuss it as such in that context. However, when systems users use the term, in a production use context, they are using it in the context of problems resulting from it, performance or otherwise. I.e. if lock contention isn't causing problems, systems users will not be discussing it. With many things, including software, context is critical.
You appear to be advocating the LDA as the solution to all possible problems with mail delivery, and implying anyone not using it is doing something seriously wrong.
I advocated no such thing, nor implied such a thing. I stated that if one is using Postfix/local(8) for mbox delivery and Dovecot for POP/IMAP that s/he should switch to LDA (or LMTP) to eliminate any potential mbox locking problems; that it doesn't make sense to use Postfix/local(8) with Dovecot as there is no upside. Again, the context is mbox. Did you see me state this in relation to maildir?
I was pointing out that that is not always the case.
You seem to spend a lot of time pointing out exceptions.
I believe the OP mentioned something about having run out of mboxes for 20 years? To me that suggests an old-fashioned Unix setup, which in turn suggests procmail as a likely possibility. I could, of course, be wrong.
Examination of his log entry indicates he's not using procmail, but Postfix' local(8) delivery agent directly to the mailbox file:
Nov 8 07:49:11 server1 postfix/local[27901]: 04B8E7081DA: to=xyz@xyz.com, orig_to=abc@abc.com, relay=local, delay=19, delays=0.07/0/0/19, dsn=4.2.0, status=deferred (cannot update mailbox /var/spool/mail/abc for user abc. unable to lock for exclusive access: Resource temporarily unavailable)
"unable to lock for exclusive access: Resource temporarily unavailable" is a Postfix local(8) error message.
If procmail was configured, you'd likely see this instead:
...status=sent (delivered to command: /usr/bin/procmail...)
Then procmail would do the actual delivery to the mailbox (mbox) file, and if a locking problem occurred, it would be logged by procmail, and possibly a bounce sent to the sender. I'm not sure what, if any, error would be returned to local(8) as I've never used procmail.
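For anyone wanting to check a whole maillog this way, a rough Python sketch follows. It assumes only the two log shapes discussed above (a status with a parenthesised detail, and a "delivered to command:" detail); the function name, the returned keys, and the regular expressions are made up for this example, and real logs may contain shapes it does not handle.

```python
import re

# Matches the trailing "status=<word> (<detail>)" part of a Postfix log line.
STATUS_RE = re.compile(r"status=(\w+) \((.*)\)\s*$")
# Matches the first word of the command in a "delivered to command:" detail.
COMMAND_RE = re.compile(r"delivered to command: (\S+)")


def classify_delivery(line):
    """Classify one log line; returns a dict, or None if no status is found."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    status, detail = match.group(1), match.group(2)
    command = COMMAND_RE.search(detail)
    if command:
        # Mail was handed to an external command such as procmail.
        return {"status": status, "via": "command", "target": command.group(1)}
    if "unable to lock for exclusive access" in detail:
        # The message text discussed above: direct delivery to the mailbox file.
        return {"status": status, "via": "local-mailbox", "target": None}
    return {"status": status, "via": "unknown", "target": None}


if __name__ == "__main__":
    sample = (
        "Nov 8 07:49:11 server1 postfix/local[27901]: 04B8E7081DA: "
        "to=xyz@xyz.com, orig_to=abc@abc.com, relay=local, delay=19, "
        "delays=0.07/0/0/19, dsn=4.2.0, status=deferred (cannot update mailbox "
        "/var/spool/mail/abc for user abc. unable to lock for exclusive access: "
        "Resource temporarily unavailable)"
    )
    print(classify_delivery(sample))
```

Feeding it the OP's line classifies the delivery as a deferred direct write to the mailbox file, which is the same conclusion reached by reading the log by eye.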
If I have offended you, I apologise. That was certainly not my intention.
I was not offended, just a bit annoyed. When you attempted to correct my intentional misuse of "light year" it reminded me of something similar. Almost daily I've wanted to stomp on Europeans for sticking the indefinite article "an" in front of words beginning with consonants, when they should be using "a" instead, butchering the English language in the process:
"I need help with an Debian Linux install on an Dell server." "I'm installing an Dovecot cluster and need help configuring an Dovecot Director."
Supremely irritating, but I've never stomped on them, bit my tongue every time, as it's a waste of time. Maybe you could follow suit.
-- Stan
Thanks for your replies. I switched to Dovecot LDA this morning, but the issue still persists, albeit logged slightly differently by Dovecot now instead of Postfix:
"save failed to INBOX: Timeout while waiting for lock"
The reason is because some pop3 clients are holding their connection for 5 or 6 minutes (don't ask me why - and the iPhone seems to be the major culprit).
In dovecot.conf I changed:
mbox_lock_timeout = 300
to
mbox_lock_timeout = 600
Which seems to have helped. I am unclear if this value only applied to Dovecot LDA or if it would have worked previously before switching to Dovecot LDA?
On 11/12/2012 5:15 AM, 1st WebDesigns wrote:
Thanks for your replies. I switched to Dovecot LDA this morning, but the issue still persists, albeit logged slightly differently by Dovecot now instead of Postfix:
"save failed to INBOX: Timeout while waiting for lock"
The reason is because some pop3 clients
Full stop. This is the first time you've mentioned POP that I recall. FYI, Dovecot is primarily an IMAP server. Unless an OP states up front that he's using primarily POP, everyone assumes IMAP and counsels accordingly. You should have stated POP in your first post. Actually, you should have included many more details prior to now. Please post your complete 'dovecot -n' output.
are holding their connection for 5 or 6 minutes (don't ask me why - and the iPhone seems to be the major culprit).
I'm no smartphone POP expert, but old rural tower, poor tower connection, etc, all cause low data rates, which could cause this. However, you state this problem cropped up out of nowhere after a distro upgrade to CentOS 5. Can you confirm that the problem didn't exist before the upgrade? Your definitive answer to this question dictates the troubleshooting course of action.
In dovecot.conf I changed:
mbox_lock_timeout = 300
to
mbox_lock_timeout = 600
Which seems to have helped. I am unclear if this value only applied to Dovecot LDA or if it would have worked previously before switching to Dovecot LDA?
This simply changes how long Dovecot will wait to acquire a lock. Increasing this value simply increases delays and masks the underlying problem without really helping much.
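As a rough picture of what a lock timeout means, here is a small Python sketch of "keep trying until a deadline". This is not Dovecot's code; the function name, the try_lock callable, and the poll_interval parameter are all invented for illustration.

```python
import time


def wait_for_lock(try_lock, timeout, poll_interval=0.05):
    """Call try_lock() until it returns True or `timeout` seconds have passed.

    Returns True if the lock was obtained, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if try_lock():
            return True
        if time.monotonic() >= deadline:
            return False  # comparable to "Timeout while waiting for lock"
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Hypothetical holder that releases the lock on the third attempt.
    attempts = iter([False, False, True])
    print(wait_for_lock(lambda: next(attempts), timeout=1.0, poll_interval=0.01))
```

If the holder lets go before the deadline, the waiter succeeds, only later than it otherwise would have. If the holder outlasts the deadline, the waiter gives up. Raising the timeout moves the deadline but does nothing about why the lock is held for so long.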
The only real architectural solution to such a POP/mbox locking problem due to slow/long client downloads is, as you mentioned, moving to a lockless mailbox format, such as maildir or sdbox.
Worth noting, we are both/all at fault in the slow progress of this issue, you for not stating POP up front, and me/us for not asking.
Your 'dovecot -n' output may allow us to help get mbox working a little better, but the long term solution is very likely moving to maildir/sdbox.
-- Stan
Output of dovecot -n is as follows:
# 1.0.7: /etc/dovecot.conf
login_dir: /var/run/dovecot/login
login_executable(default): /usr/libexec/dovecot/imap-login
login_executable(imap): /usr/libexec/dovecot/imap-login
login_executable(pop3): /usr/libexec/dovecot/pop3-login
mail_privileged_group: mail
mail_location: mbox:~/mail:INBOX=/var/mail/%u
mbox_lock_timeout: 600
mail_executable(default): /usr/libexec/dovecot/imap
mail_executable(imap): /usr/libexec/dovecot/imap
mail_executable(pop3): /usr/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/lib64/dovecot/imap
mail_plugin_dir(imap): /usr/lib64/dovecot/imap
mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
We upgraded from RedHat 4 to RedHat 5. The problem didn't exist with RH4 and an even older version of Dovecot.
When emails are stuck in the queue, doing this:
lsof /var/spool/mail/<user>
shows the spool file in use by a pop3 login and the Dovecot deliver process. Since changing mbox_lock_timeout from 300 to 600 the pop3 process eventually finishes before 600 seconds and the deliver process is able to complete. I admit this is masking the problem rather than solving it.
As discussed before our version of Dovecot is dated now, however it's the version provided by RedHat and the version supported by our support company (who aren't doing a great job, hence me posting here).
Thanks,
On 11/22/2012 3:26 PM, 1st WebDesigns wrote:
Output of dovecot -n is as follows:
# 1.0.7: /etc/dovecot.conf
login_dir: /var/run/dovecot/login
login_executable(default): /usr/libexec/dovecot/imap-login
login_executable(imap): /usr/libexec/dovecot/imap-login
login_executable(pop3): /usr/libexec/dovecot/pop3-login
mail_privileged_group: mail
mail_location: mbox:~/mail:INBOX=/var/mail/%u
mbox_lock_timeout: 600
mail_executable(default): /usr/libexec/dovecot/imap
mail_executable(imap): /usr/libexec/dovecot/imap
mail_executable(pop3): /usr/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/lib64/dovecot/imap
mail_plugin_dir(imap): /usr/lib64/dovecot/imap
mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
Are your mailboxes on NFS storage? You haven't stated on what storage your mailboxes reside. NFS complicates locking. If you use an NFS server, did anything on it change recently, such as an upgrade to RHEL5?
I found a thread stating RHEL5 has a bad FCNTL implementation that could be related to your write lock delay problem. Try using dotlock only for read and write and see if that helps. It has additional filesystem IO overhead, but nothing like the many minutes of delay you have now.
mbox_read_locks = dotlock
mbox_write_locks = dotlock
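For readers unfamiliar with the term, a dotlock is a separate lock file created next to the mailbox, conventionally named after it with a ".lock" suffix. The Python sketch below shows only the bare idea, using exclusive file creation. It is not what Dovecot or Postfix actually run: real implementations also deal with stale lock files, retries, and timeouts, all of which are left out here. The function names are made up for this example.

```python
import os
import tempfile


def acquire_dotlock(mailbox_path):
    """Try to create '<mailbox_path>.lock'; return its path, or None if it exists."""
    lock_path = mailbox_path + ".lock"
    try:
        # O_EXCL makes the create fail if the lock file is already present.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return None
    os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
    os.close(fd)
    return lock_path


def release_dotlock(lock_path):
    """Remove the lock file so the next process can take the lock."""
    os.unlink(lock_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        mailbox = os.path.join(spool, "abc")  # hypothetical mailbox path
        first = acquire_dotlock(mailbox)
        second = acquire_dotlock(mailbox)     # fails while the first is held
        print(first is not None, second is None)
        release_dotlock(first)
```

Each acquire and release is a file create and delete in the mail directory, which is where the additional filesystem IO overhead comes from.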
We upgraded from RedHat 4 to RedHat 5. The problem didn't exist with RH4 and an even older version of Dovecot.
That may be, but you're surely not planning on downgrading back to RHEL4.
When emails are stuck in the queue, doing this:
Dovecot doesn't use queues. It writes directly to the mailbox files.
lsof /var/spool/mail/<user>
These are mailbox files, your user inbox mbox files, not spool files. Spool implies temporary storage. Don't let "spool" fool you. On many/most systems /var/spool/mail is a link to /var/mail.
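A quick way to check whether two paths really are the same place on a given box is to resolve the symlinks and compare. A small Python sketch, with a made-up function name; the two paths in the usage line are simply the ones mentioned in this thread and may not both exist on every system:

```python
import os
import tempfile


def resolves_to_same_place(path_a, path_b):
    """True if both paths point at the same location once symlinks are resolved."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


if __name__ == "__main__":
    print(resolves_to_same_place("/var/spool/mail", "/var/mail"))
```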
shows the spool file in use by a pop3 login and the Dovecot deliver process. Since changing mbox_lock_timeout from 300 to 600 the pop3 process eventually finishes before 600 seconds and the deliver process is able to complete. I admit this is masking the problem rather than solving it.
Does the larger timeout value completely eliminate the errors? If so this may be the best outcome you can get with Dovecot 1.0.7, mbox storage, on RHEL5, unless a different locking method fixes it.
As discussed before our version of Dovecot is dated now, however it's the version provided by RedHat and the version supported by our support company (who aren't doing a great job, hence me posting here).
It's the version provided by RHEL5. RHEL6.3 has Dovecot 2.0.9. There are 3rd party 1.2.x RPMs available for RHEL5.x as well as 2.x.x RPMs for RHEL5.x.
What "support company"? If you're using RHEL, Red Hat provides the support. That's the whole reason for "paying for" a Linux distro. What is preventing you from upgrading to RHEL 6.3, the current release? Which BTW is behind nearly all other distros WRT package versions. For instance Debian stable has Dovecot 2.1.7 available in the backports repo.
-- Stan
On 23/11/2012 06:07, Stan Hoeppner wrote:
On 11/22/2012 3:26 PM, 1st WebDesigns wrote:
Output of dovecot -n is as follows:
# 1.0.7: /etc/dovecot.conf
login_dir: /var/run/dovecot/login
login_executable(default): /usr/libexec/dovecot/imap-login
login_executable(imap): /usr/libexec/dovecot/imap-login
login_executable(pop3): /usr/libexec/dovecot/pop3-login
mail_privileged_group: mail
mail_location: mbox:~/mail:INBOX=/var/mail/%u
mbox_lock_timeout: 600
mail_executable(default): /usr/libexec/dovecot/imap
mail_executable(imap): /usr/libexec/dovecot/imap
mail_executable(pop3): /usr/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/lib64/dovecot/imap
mail_plugin_dir(imap): /usr/lib64/dovecot/imap
mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
Are your mailboxes on NFS storage? You haven't stated on what storage your mailboxes reside. NFS complicates locking. If you use an NFS server, did anything on it change recently, such as an upgrade to RHEL5?
No they are not on NFS storage, the mailboxes are stored on the local filesystem.
I found a thread stating RHEL5 has a bad FCNTL implementation that could be related to your write lock delay problem. Try using dotlock only for read and write and see if that helps. It has additional filesystem IO overhead, but nothing like the many minutes of delay you have now.
mbox_read_locks = dotlock
mbox_write_locks = dotlock
Thank you I will try this. I did read that when using Postfix and Dovecot, both systems should use a matching locking mechanism, which I had already tried. However, I hadn't tried just dotlock, only FCNTL and a combination of FCNTL and dotlock.
We upgraded from RedHat 4 to RedHat 5. The problem didn't exist with RH4 and an even older version of Dovecot.
That may be, but you're surely not planning on downgrading back to RHEL4.
No, not at all.
When emails are stuck in the queue, doing this:
Dovecot doesn't use queues. It writes directly to the mailbox files.
lsof /var/spool/mail/<user>
These are mailbox files, your user inbox mbox files, not spool files. Spool implies temporary storage. Don't let "spool" fool you. On many/most systems /var/spool/mail is a link to /var/mail.
Yes that's correct.
shows the spool file in use by a pop3 login and the Dovecot deliver process. Since changing mbox_lock_timeout from 300 to 600 the pop3 process eventually finishes before 600 seconds and the deliver process is able to complete. I admit this is masking the problem rather than solving it.
Does the larger timeout value completely eliminate the errors? If so this may be the best outcome you can get with Dovecot 1.0.7, mbox storage, on RHEL5, unless a different locking method fixes it.
Yes it completely eliminates the errors. If a pop3 connection has the lock, the mail simply sits there and is eventually delivered in (less than) 600 seconds. Whereas before, it would get deferred. When re-delivery was attempted, it's possible that the box would be locked again, and the mail would get deferred again, leading to a delay of several hours on a busy day.
As discussed before our version of Dovecot is dated now, however it's the version provided by RedHat and the version supported by our support company (who aren't doing a great job, hence me posting here).
It's the version provided by RHEL5. RHEL6.3 has Dovecot 2.0.9. There are 3rd party 1.2.x RPMs available for RHEL5.x as well as 2.x.x RPMs for RHEL5.x.
What "support company"? If you're using RHEL, Red Hat provides the support. That's the whole reason for "paying for" a Linux distro. What is preventing you from upgrading to RHEL 6.3, the current release? Which BTW is behind nearly all other distros WRT package versions. For instance Debian stable has Dovecot 2.1.7 available in the backports repo.
Our server is with Rackspace, and RHEL5 is the OS they offered us as an upgrade path from RHEL4. So they're getting the support from Red Hat and we're getting the support from Rackspace.
On 11/23/2012 5:36 AM, 1st WebDesigns wrote:
No they are not on NFS storage, the mailboxes are stored on the local filesystem.
Ok, good.
Thank you I will try this. I did read that when using Postfix and Dovecot, both systems should use a matching locking mechanism, which I had already tried. However, I hadn't tried just dotlock, only FCNTL and a combination of FCNTL and dotlock.
Since you're now using Dovecot LDA the locking mech may not make much if any difference, but it's worth trying.
Yes it completely eliminates the errors. If a pop3 connection has the lock, the mail simply sits there and is eventually delivered in (less than) 600 seconds. Whereas before, it would get deferred. When re-delivery was attempted, it's possible that the box would be locked again, and the mail would get deferred again, leading to a delay of several hours on a busy day.
So this is a step in the right direction. But still far less than optimal. The read/write lock contention on mbox is unnecessarily eating up system resources (mainly memory), and causing unnecessary delivery delays to the mailbox. You should really start looking at migrating to maildir. It's not that difficult (though maybe more so with 1.0.7) if you don't have a ton of mailboxes, and especially with POP since the mailboxes typically won't be holding much mail to migrate. How many do you have?
Our server is with Rackspace, and RHEL5 is the OS they offered us as an upgrade path from RHEL4. So they're getting the support from Red Hat and we're getting the support from Rackspace.
The plot thickens again. You're using a rented server. Sigh...
This entire thread could have been greatly shortened, saving all of us much time, if you'd have given all these details up front.
Is this a cloud server (shared host), or a dedicated server?
FWIW, you don't have RHEL5, but CentOS 5. Hosting companies don't pay for RHEL licenses for 10s of thousands of hosts.
I have a few salient recommendations for you:
- Migrate to maildir. It is far more appropriate for a POP workload.
- Switch to a hosting provider that offers much more recent software.
- Or, get a colo server so you can use whatever software you wish.
Finally, if this email service you're providing isn't all that critical to you or your organization, simply plod along as you have been, fighting these problems frequently along the way.
-- Stan
So this is a step in the right direction. But still far less than optimal. The read/write lock contention on mbox is unnecessarily eating up system resources (mainly memory), and causing unnecessary delivery delays to the mailbox. You should really start looking at migrating to maildir. It's not that difficult (though maybe more so with 1.0.7) if you don't have a ton of mailboxes, and especially with POP since the mailboxes typically won't be holding much mail to migrate. How many do you have?
There's around four hundred mail boxes or so. Some used more intensively than others.
Our server is with Rackspace, and RHEL5 is the OS they offered us as an upgrade path from RHEL4. So they're getting the support from Red Hat and we're getting the support from Rackspace.
The plot thickens again. You're using a rented server. Sigh...
This entire thread could have been greatly shortened, saving all of us much time, if you'd have given all these details up front.
Is this a cloud server (shared host), or a dedicated server?
It's a dedicated server
FWIW, you don't have RHEL5, but CentOS 5. Hosting companies don't pay for RHEL licenses for 10s of thousands of hosts.
It's RHEL5:
$ cat /etc/issue
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
The cost of the license is included in our contract.
I have a few salient recommendations for you:
- Migrate to maildir. It is far more appropriate for a POP workload.
Yes, this will be our next course of action
- Switch to a hosting provider that offers much more recent software.
We can upgrade the software if we wish, but will no longer get full support from Rackspace if we do this.
- Or, get a colo server so you can use whatever software you wish.
We can install whatever software we wish at the moment, but see the point above.
Finally, if this email service you're providing isn't all that critical to you or your organization, simply plod along as you have been, fighting these problems frequently along the way.
It's kind of working ok now but we will go with your recommendation of switching to maildir when we have time. Thanks for your help
On 11/26/2012 1:58 PM, 1st WebDesigns wrote:
So this is a step in the right direction. But still far less than optimal. The read/write lock contention on mbox is unnecessarily eating up system resources (mainly memory), and causing unnecessary delivery delays to the mailbox. You should really start looking at migrating to maildir. It's not that difficult (though maybe more so with 1.0.7) if you don't have a ton of mailboxes, and especially with POP since the mailboxes typically won't be holding much mail to migrate. How many do you have?
There's around four hundred mail boxes or so. Some used more intensively than others.
There are methods to convert one mailbox at a time, groups of mailboxes, or all mailboxes in one fell swoop in a batch mode. I'm uncertain WRT the status of the tools in 1.0.7, but given the age of that release you may avoid problems by upgrading to Dovecot 1.2.x or later before doing the conversion. If you attempt the conversion on 1.0.7 and hit snags, this mailing list may not be of much help as nobody has used 1.0.7 for years. You may want to post a new thread asking Timo about such a conversion with 1.0.7. He doesn't seem to be paying attention to this thread.
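As a rough illustration of what converting a single mailbox involves, here is a minimal Python sketch using only the standard library `mailbox` module. It is not one of the Dovecot conversion tools referred to above, and the function name is hypothetical. It simply copies messages and makes no attempt to preserve POP3 UIDLs or Dovecot's own metadata, so clients might re-download mail after a conversion done this way. Try it on a dummy mailbox only.

```python
import mailbox


def convert_mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count.

    `maildir_path` should not exist yet, so the cur/new/tmp
    subdirectories are created along with it.
    """
    source = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    # Hold the mbox lock while reading so nothing is delivered mid-copy.
    source.lock()
    try:
        for message in source:
            dest.add(message)
            copied += 1
    finally:
        source.unlock()
        source.close()
        dest.close()
    return copied
```

A call such as `convert_mbox_to_maildir("/tmp/dummy.mbox", "/tmp/dummy-Maildir")` (both paths are placeholders) returns the number of messages copied, which can be checked against the message count of the original mbox before anything is switched over.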
Our server is with Rackspace, and RHEL5 is the OS they offered us as an upgrade path from RHEL4. So they're getting the support from Red Hat and we're getting the support from Rackspace.
The plot thickens again. You're using a rented server. Sigh...
This entire thread could have been greatly shortened, saving all of us much time, if you'd have given all these details up front.
Is this a cloud server (shared host), or a dedicated server?
It's a dedicated server
FWIW, you don't have RHEL5, but CentOS 5. Hosting companies don't pay for RHEL licenses for 10s of thousands of hosts.
It's RHEL5:
$ cat /etc/issue
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
The cost of the license is included in our contract.
Now that's interesting.
I have a few salient recommendations for you:
- Migrate to maildir. It is far more appropriate for a POP workload.
Yes, this will be our next course of action
- Switch to a hosting provider that offers much more recent software.
We can upgrade the software if we wish, but will no longer get full support from Rackspace if we do this.
And you consider this a net loss? If you're that dependent on your provider's tit, find one that can suckle you on RHEL 6.3. Or buy your copy/license directly from Red Hat and get support directly from them.
- Or, get a colo server so you can use whatever software you wish.
We can install whatever software we wish at the moment, but see the point above.
See my point above. And WRT Dovecot and most other application software, you'll get better support from the community than your bulk hosting provider anyway. Their primary business is making $$ from providing you a host and a pipe. Customer support is a cost, especially application support, not a profit center, and thus is almost always a secondary concern at best. Red Hat's entire business model is customer support, same for SuSE.
Finally, if this email service you're providing isn't all that critical to you or your organization, simply plod along as you have been, fighting these problems frequently along the way.
It's kind of working ok now but we will go with your recommendation of switching to maildir when we have time. Thanks for your help
As I said, you can migrate users individually. You could easily do 10 users a day during coffee breaks etc and be done in a month plus. Do 40 a day and you're done in 10 days. The only time you'll burn is in the learning curve, not the actual mailbox migration which takes no time at all with POP accounts.
Always test with a dummy mailbox first to iron out any issues. Then start migrating the problem users first, the smart phone users who tie up their mailboxes for many minutes during download.
-- Stan
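The schedule estimate above is simple arithmetic on the roughly four hundred mailboxes mentioned earlier in the thread. A small Python sketch, with a hypothetical helper name, makes the numbers explicit:

```python
import math


def days_to_migrate(mailboxes, per_day):
    """Working days needed to migrate `mailboxes` at `per_day` per day."""
    if per_day <= 0:
        raise ValueError("per_day must be positive")
    return math.ceil(mailboxes / per_day)


# Figures from the thread: about 400 mailboxes.
slow = days_to_migrate(400, 10)   # 40 days, i.e. "a month plus"
fast = days_to_migrate(400, 40)   # 10 days
```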
Thanks, all your comments are noted.
As I said, you can migrate users individually. You could easily do 10 users a day during coffee breaks etc and be done in a month plus. Do 40 a day and you're done in 10 days. The only time you'll burn is in the learning curve, not the actual mailbox migration which takes no time at all with POP accounts.
That's interesting, as I (wrongly) assumed switching from mbox to maildir was an all or nothing process. You're saying we can run half the mailboxes in mbox format and the other half in maildir format?
In which case we can get going with this sooner than I thought.
Always test with a dummy mailbox first to iron out any issues. Then start migrating the problem users first, the smart phone users who tie up their mailboxes for many minutes during download.
Thank you - I would probably start with the CEO's mailbox first and then go from there >:-D
On 11/26/2012 3:39 PM, 1st WebDesigns wrote:
Thanks, all your comments are noted.
As I said, you can migrate users individually. You could easily do 10 users a day during coffee breaks etc and be done in a month plus. Do 40 a day and you're done in 10 days. The only time you'll burn is in the learning curve, not the actual mailbox migration which takes no time at all with POP accounts.
That's interesting, as I (wrongly) assumed switching from mbox to maildir was an all or nothing process. You're saying we can run half the mailboxes in mbox format and the other half in maildir format?
In which case we can get going with this sooner than I thought.
Yes, this can be done. But if you're using UNIX system user accounts, IIRC you'll have to convert to virtual users before you can migrate one user at a time. Virtual user setup is required to change mail_location on a per-user basis. With system users, mail_location is defined once for all users. Converting to virtual users first makes the process more painful. I've not done such a POP mbox<>maildir migration myself, so hopefully someone who has will chime in. If not, start a new thread called "need POP mbox<>maildir migration help" or similar.
And again, I wouldn't try any of this with 1.0.7. Upgrade to at least 1.2.x first.
Always test with a dummy mailbox first to iron out any issues. Then start migrating the problem users first, the smart phone users who tie up their mailboxes for many minutes during download.
Thank you - I would probably start with the CEO's mailbox first and then go from there >:-D
Start a new thread as I suggested. State your version, current user account type (system or virtual), and post your dovecot -n at the end of the email. You'll get many more helpful suggestions and insight from people who've actually done this migration.
-- Stan
participants (3)
- 1st WebDesigns
- Ben Morrow
- Stan Hoeppner