[Dovecot] IMAP "freezing" on OSX
Hi,
I've got a problem with dovecot on an Intel Mac OSX box. Basically, I was running an older version (0.8.?, I think), and everything was fine. Then, when I upgraded to 1.0.0 via MacPorts, I found that periodically the system gets into a state where it simply stops dealing with IMAP requests.
This tends to manifest itself as Thunderbird simply stalling, not fetching any new messages or folder contents. Sometimes quitting and restarting Thunderbird will help, but at other times the problem persists even afterwards. It seems to be linked to specific mailboxes (or messages?) - often one "problem folder" will reproduce the effect 100% whilst operating on the rest of the mailbox is fine.
The problem /seems/ to often start as a result of trying to move messages between folders. I'm not certain if that's a root cause, though, or simply a consequence of "opening" an affected folder to perform the move. I have a spam folder with a large number (>10000) of messages in it, and trying to file stuff in there is a relatively surefire way to generate the problem.
Once it's happened once, it inevitably escalates - restarting the client helps a bit, as does restarting dovecot. However, the only way to "fix" it and restore full functionality seems to be to delete all of the index and dovecot* files from the mailbox and folders... it generally takes a day or so after that before things start to go wrong again.
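Before deleting anything, it can help to see exactly which files that workaround would touch. The following is a minimal sketch, not a Dovecot tool: it only lists files whose names start with "dovecot" under a maildir root and removes nothing. The function name find_dovecot_files is hypothetical, and the example path is the mail_location from the configuration quoted in this message.

```python
import os
from pathlib import Path


def find_dovecot_files(maildir_root):
    """Return the sorted paths of all files named dovecot* under maildir_root.

    This is a dry run: nothing is deleted, the files are only listed.
    """
    root = Path(maildir_root).expanduser()
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Matches the "dovecot*" pattern mentioned in the post.
            if name.startswith("dovecot"):
                found.append(Path(dirpath) / name)
    return sorted(found)


if __name__ == "__main__":
    # Assumed path, taken from mail_location in the posted configuration.
    for path in find_dovecot_files("~/Library/Mail/IMAP"):
        print(path)
```

Reviewing the list first makes it easier to move the files aside rather than delete them, so they can be inspected later if the hang returns.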
All of this seems to point to corruption of the index files in some way, but the log is universally unhelpful - there are no error messages of any kind showing up there. The actual dovecot instance doesn't appear to crash or even "hang" properly - you can still make inbound IMAP connections OK, it's just that they don't ever give a response to commands that touch the affected mailboxes.
Sometimes, I end up with temporary files lying around in the folders in the mailbox - apparently mails which were in the process of being moved when the server "gave up". I'm not sure if this is actually part of the problem, though, or just a symptom.
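One way to check whether such leftovers accumulate is to list old files in the tmp/ subdirectories. This is a rough sketch under two assumptions that the post does not confirm: that the stray files live in directories named tmp, and that anything older than 36 hours (the conventional Maildir age limit for tmp/ files) is no longer in use. The name stale_tmp_files is made up for illustration.

```python
import os
import time


def stale_tmp_files(maildir_root, max_age_hours=36.0, now=None):
    """List files in any tmp/ directory under maildir_root older than max_age_hours.

    Files are only reported, never removed. `now` can be passed in for testing.
    """
    root = os.path.expanduser(maildir_root)
    cutoff = (time.time() if now is None else now) - max_age_hours * 3600
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: leftovers sit in directories literally named "tmp".
        if os.path.basename(dirpath) != "tmp":
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

Comparing the output before and after a hang would show whether the temporary files appear at the same time as the problem or were already there.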
My configuration (from dovecot -n) is:
# /opt/local/etc/dovecot/dovecot.conf ssl_disable: yes disable_plaintext_auth: no login_dir: /opt/local/var/run/dovecot/login login_executable: /opt/local/libexec/dovecot/imap-login mail_location: maildir:~/Library/Mail/IMAP auth default: passdb: driver: pam args: * userdb: driver: passwd
I'm running this on my personal machine, so there's only one user on the system, which I have set up using MailDir (on HFS). There's a fair bit of mail in many of my folders, but I doubt that's the problem as the previous version didn't seem to have any issues...
I'm doing mail delivery using postfix directly into the maildir folders. That doesn't seem to be affected at all by this, though - mail still arrives fine even when Dovecot is failing to serve the folder contents to IMAP clients... you just can't see it ^-^
Has anyone seen an effect like this? Or does anyone have any pointers as to what I could try to debug it?
Any suggestions would be very greatly appreciated!
Thanks!
Ben Carter - ben@gunk.demon.co.uk / ben@saillune.net (preferred)
are you using the dovecot lda, or postfix, or procmail? i've been
using dovecot on a PPC mac os x box for quite some time, but had to
abandon the dovecot lda because of exactly the problems you
describe. i switched to postfix, then procmail, and have not had any
problems since.
-SM-
Scott Murman wrote:
are you using the dovecot lda, or postfix, or procmail? i've been using dovecot on a PPC mac os x box for quite some time, but had to abandon the dovecot lda because of exactly the problems you describe. i switched to postfix, then procmail, and have not had any problems since.
Hmm... an interesting question - not as far as I know, though! Unless the act of updating Dovecot has reconfigured something...
<checks>
Nope, incoming mail is still going through Fetchmail->Postfix->Procmail, unless there's some clever hook into the system after Procmail gets its hands on it that I can't spot (procmail is just set to deliver to a specified maildir folder).
Even when Dovecot isn't behaving, mail still arrives fine, so I'm pretty sure that isn't the problem (and my Procmail rules are all still working, so at the very least mail is getting there OK...)
It's, erm, good to hear that someone else has seen this, though, and I'm not just going mad...
Hmm... could it be LDA-related, though? I thought maildir was essentially lock-free, but is there any chance that Dovecot is getting confused because of local delivery operations taking place whilst it's working on the mailbox...?
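For reference, the reason maildir delivery is usually described as lock-free can be sketched as follows: the message is written under a unique name in tmp/ and then moved into new/ in a single step, so a reader never sees a half-written file. This is a simplified illustration, not what Postfix or Procmail actually run; the function name deliver and the unique-name scheme are assumptions.

```python
import os
import socket
import time


def deliver(maildir, message_bytes):
    """Deliver one message into maildir without taking any lock.

    Returns the final path of the message in new/.
    """
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Simplified unique name: timestamp, process id, host name.
    host = socket.gethostname().replace("/", "_")
    unique = f"{time.time_ns()}.{os.getpid()}.{host}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # O_EXCL makes creation fail instead of overwriting an existing file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())

    # The move into new/ is the single step that makes the message visible.
    os.rename(tmp_path, new_path)
    return new_path
```

Nothing here coordinates with an IMAP server, so any locking trouble would have to come from files the server maintains itself rather than from delivery.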
Thanks for the suggestion!
Ben Carter - ben@gunk.demon.co.uk / ben@saillune.net (preferred)
Nope, incoming mail is still going through Fetchmail->Postfix->Procmail, unless there's some clever hook into the system after Procmail gets its hands on it that I can't spot (procmail is just set to deliver to a specified maildir folder).
Even when Dovecot isn't behaving, mail still arrives fine, so I'm pretty sure that isn't the problem (and my Procmail rules are all still working, so at the very least mail is getting there OK...)
yes, this sounds different than my issues. in my case i'm pretty
sure the dovecot LDA has a locking issue which manifests under high
load (read: being spammed) which shuts out incoming mail and folder
queries, etc. once it gets bunged, there's nothing to do but restart.
-SM-
On Sun, 2007-05-06 at 13:49 +0100, Ben Carter wrote:
The problem /seems/ to often start as a result of trying to move messages between folders. I'm not certain if that's a root cause, though, or simply a consequence of "opening" an affected folder to perform the move. I have a spam folder with a large number (>10000) of messages in it, and trying to file stuff in there is a relatively surefire way to generate the problem.
Try if dotlock_use_excl=yes helps. That setting was made to fix hard linking problems with HFS.
Otherwise http://wiki.dovecot.org/Debugging/ProcessTracing might show up something useful when looking at what imap process is doing at the time it seems to be hanging.
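The difference between the two dotlock strategies mentioned here can be sketched roughly as below. This is an illustration of the general technique only, not Dovecot's implementation; the function names are hypothetical. One variant creates the lock file directly with O_EXCL, the other creates a temporary file and hard-links it to the lock name, which is the step described as problematic on HFS. In a Dovecot 1.0-style dovecot.conf the setting would presumably be written as "dotlock_use_excl = yes".

```python
import os


def lock_with_excl(lock_path):
    """Take the lock by creating lock_path with O_EXCL. True if acquired."""
    try:
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False  # someone else already holds the lock
    os.close(fd)
    return True


def lock_with_hard_link(lock_path):
    """Take the lock by hard-linking a temporary file to lock_path."""
    tmp_path = f"{lock_path}.{os.getpid()}.tmp"
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    try:
        # link() fails if lock_path already exists, so it doubles as the test.
        os.link(tmp_path, lock_path)
    except FileExistsError:
        return False
    finally:
        # The temporary name is no longer needed either way.
        os.unlink(tmp_path)
    return True


def unlock(lock_path):
    """Release the lock by removing the lock file."""
    os.unlink(lock_path)
```

Both variants give the same result on a filesystem where hard links behave as expected; the O_EXCL variant simply avoids the link step altogether.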
On May 9, 2007, at 2:35 AM, Timo Sirainen wrote:
On Sun, 2007-05-06 at 13:49 +0100, Ben Carter wrote:
The problem /seems/ to often start as a result of trying to move messages between folders. I'm not certain if that's a root cause, though, or simply a consequence of "opening" an affected folder to perform the move. I have a spam folder with a large number (>10000) of messages in it, and trying to file stuff in there is a relatively surefire way to generate the problem.
Try if dotlock_use_excl=yes helps. That setting was made to fix hard linking problems with HFS.
Otherwise http://wiki.dovecot.org/Debugging/ProcessTracing might show up something useful when looking at what imap process is doing at the time it seems to be hanging.
Is this a setting in dovecot.conf?
I'm using all the defaults, there, myself.
Jon
Timo Sirainen wrote:
On Sun, 2007-05-06 at 13:49 +0100, Ben Carter wrote:
The problem /seems/ to often start as a result of trying to move messages between folders. I'm not certain if that's a root cause, though, or simply a consequence of "opening" an affected folder to perform the move. I have a spam folder with a large number (>10000) of messages in it, and trying to file stuff in there is a relatively surefire way to generate the problem.
Try if dotlock_use_excl=yes helps. That setting was made to fix hard linking problems with HFS.
After a week or so of intensive testing, it appears that we have a winner!
That does indeed seem to resolve the problem - whilst it was sporadic to begin with, it generally only took a couple of days to manifest itself, so I'm reasonably confident that it's gone away.
Many thanks!
Ben Carter - ben@gunk.demon.co.uk / ben@saillune.net (preferred)
participants (4)
- Ben Carter
- Jon Callas
- Scott Murman
- Timo Sirainen