[Dovecot] Stale imap processes
Anyone have any theories or experiences as to why for a certain user, some of his imap processes hang, and stay around for days and days? He just reported that his Thunderbird was timing out when trying to open folders, I looked on the server, and he had 3 old imap processes from 1 week+ ago. I killed those and his performance immediately improved.
He checks mail with Thunderbird > 1.5 from both Win and Linux, Squirrelmail webmail, and pine. Of those, pine is the only one that is somewhat rare around here - any specific problems with pine's imap client code that anyone knows about?
We have been using the dovecot-1.0-0.beta2.7 rpm for Fedora Core 5, I'm updating to 1.0-0.beta8.2.fc5 to see if that helps, as well. Just curious if anyone else has seen this lately.
Thanks, Fran
-- Fran Fabrizio Senior Systems Analyst Department of Computer and Information Sciences University of Alabama at Birmingham http://www.cis.uab.edu/ 205.934.0653
I have just seen this problem, big time. I was poking around on a FreeBSD 6.1 server because of Thunderbird hangs (see thread "dovecot-1.0rc2 problems with Thunderbird" if you are interested), and I found over 900 (!) imap processes. Only two users were using the dovecot server at the time.
I restarted dovecot, but this did NOT kill the multitudinous imap processes. After another restart, with a 'killall imap' in the middle, things were temporarily fine, but the number of imap processes immediately started to grow again. Apparently they never exit or use any CPU, but just sit around in KQREAD state, eventually getting swapped out. No error messages are generated.
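One way to spot long-lived imap processes like these before users complain is to list them by age. The following is only a minimal sketch: it assumes output in the style of `ps -eo pid,etime,user,comm`, with elapsed time in the `[[dd-]hh:]mm:ss` form, and the function names and the one-day threshold are hypothetical choices for illustration, not anything provided by dovecot.

```python
def etime_to_seconds(etime):
    """Convert a ps elapsed-time field ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours with zero
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def old_imap_processes(ps_output, min_age_seconds=86400):
    """Return (pid, user, age_seconds) for imap processes at least min_age_seconds old.

    Assumes four whitespace-separated columns: pid, etime, user, comm.
    """
    found = []
    for line in ps_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        pid, etime, user, comm = fields[:4]
        if comm != "imap":
            continue  # ignore imap-login and unrelated processes
        age = etime_to_seconds(etime)
        if age >= min_age_seconds:
            found.append((int(pid), user, age))
    return found
```

An old process is not necessarily a stale one, since a client may legitimately hold a connection open for days, so a list like this is a starting point for inspection rather than a kill list.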
On Tue, Aug 01, 2006 at 05:38:00PM -0700, Pete Slagle wrote:
I have just seen this problem, big time. I was poking around on a FreeBSD 6.1 server because of Thunderbird hangs (see thread "dovecot-1.0rc2 problems with Thunderbird" if you are interested), and I found over 900 (!) imap processes. Only two users were using the dovecot server at the time.
I restarted dovecot, but this did NOT kill the multitudinous imap processes. After another restart, with a 'killall imap' in the middle, things were temporarily fine, but the number of imap processes immediately started to grow again. Apparently they never exit or use any CPU, but just sit around in KQREAD state, eventually getting swapped out. No error messages are generated.
Stop using the kqueue support! It is broken and has never fully worked properly. This is a known issue if you look back in the list archives.
Brad wrote:
Stop using the kqueue support! It is broken and has never fully worked properly. This is a known issue if you look back in the list archives.
Thanks for the tip. But, it's perplexing: kqueue support is built in by default to the FreeBSD port of dovecot, which presumably means that almost everyone running dovecot on FreeBSD is using it.
I had some trouble finding a good way to search the dovecot list archives. Can you provide a pointer to the thread you mention? It might be something I should send to the maintainer of the FreeBSD dovecot port.
Pete
I've been reading this thread this morning, along with others about problems with the latest dovecot and Thunderbird. I'm running dovecot-1.0.r2 on FreeBSD 6.1. My personal client is Thunderbird on OS X.
After tinkering with rebuilding dovecot with and without kqueue support, it would appear that building with kqueue is the desired configuration. With kqueue I did notice two stale imap processes after about a week's worth of use. After recompiling without kqueue, moving messages around folders caused the client to hang, and after the first login the imap process wouldn't exit after I closed the connection.
Moving lots of messages around folders seems to be fine with kqueue support, and the stale imap process problem seems to be reduced. This was all deduced after about 15 minutes of tinkering, so I could be wrong :-)
Unfortunately, I had the problem on OpenBSD with dovecot 1.0rc2. Stale imap processes when _NOT_ using kqueue (using poll). Using kqueue solved the problem. Not sure if it's the same on FreeBSD, but it seems that using kqueue solves the stale processes problem both on OpenBSD and NetBSD. On Linux, kqueue doesn't produce stale imap processes either.
--
.O. ..O OOO
On Wed, Aug 02, 2006 at 04:27:25PM +0200, Renaud Allard wrote:
Unfortunately, I had the problem on OpenBSD with dovecot 1.0rc2. Stale imap processes when _NOT_ using kqueue (using poll). Using kqueue solved the problem. Not sure if it's the same on FreeBSD, but it seems that using kqueue solves the stale processes problem both on OpenBSD and NetBSD. On linux, kqueue doesn't produce stale imap processes either.
I have had a number of reports that kqueue, specifically kqueue for the file change mechanism (--with-notify=kqueue), causes the hanging and disabling this allowed regular operation. kqueue for the I/O loop mechanism causes dovecot processes to crash under certain (unknown) conditions. If poll is causing problems then there are definitely some nasty bugs lurking in the code base. The problem with the file change mechanism and kqueue popped up somewhere between beta8 and rc2.
I use --with-ioloop=kqueue. I had no problems with beta8, and noticed stale imap processes after upgrading to rc2. Someone suggested using --with-ioloop=kqueue and this indeed solved my problem. Note that while I have no more imap processes hanging, I noticed that Thunderbird sometimes complains about being unable to copy mail to the Sent folder. This wasn't the case in beta8.
Ok, my head is swimming after catching up, but I think this is what I've concluded...
I am using Fedora Core 5 and dovecot RPMs. The RPM that shipped with FC5 was:
dovecot-1.0-0.beta2.7
After observing the stale imap processes I upgraded this to:
1.0-0.beta8.2.fc5
Both of these are built with ioloop=poll.
I am still seeing the stale processes with beta8. Sounds like I need to build from source using ioloop=kqueue to maybe clear up this issue. Did I interpret this correctly? And should I stick with beta8 or move to rc2?
Thanks, Fran
On Thu, 2006-08-03 at 15:58 -0500, Fran Fabrizio wrote:
I am still seeing the stale processes with beta8. Sounds like I need to build from source using ioloop=kqueue to maybe clear up this issue. Did I interpret this correctly? And should I stick with beta8 or move to rc2?
kqueue is a BSD-specific feature, so you can't use it with Linux.
But I'd like to know what exactly "stale processes" means: Do you use SSL, and are they all SSL connections? Are they eating any CPU at all? If you strace them, does it show whether they're doing anything? How long are they stuck, or do they never die?
How exactly do you know they are stale? Are you sure there aren't any clients that still have active connections to it?
The processes just stay there forever and at some point just hit the per user processes limit. They just never die, they don't use any CPU at all, but just stay there without any client connection. On BSD building dovecot with kqueue solved the problem.
Timo Sirainen wrote:
kqueue is BSD-specific feature, so you can't use it with Linux.
But I'd like to know what exactly does "stale processes" mean: Do you use SSL and they're all SSL connections? Are they eating any CPU at all? If you strace them, does it show if they're doing anything? How long are they stuck, or don't they ever die?
How exactly do you know they are stale? Are you sure there aren't any clients that still have active connections to it?
Renaud Allard wrote:
The processes just stay there forever and at some point just hit the per user processes limit. They just never die, they don't use any CPU at all, but just stay there without any client connection. On BSD building dovecot with kqueue solved the problem.
I had this same problem, many idle imap processes, using rc2 without SSL, but it was NOT solved by using kqueue. It suggests that the kqueue issue sometimes affects the symptoms, but actually just covers or exposes an underlying bug.
Pete Slagle wrote:
I had this same problem, many idle imap processes, using rc2 without SSL, but it was NOT solved by using kqueue. It suggests that the kqueue issue sometimes affects the symptoms, but actually just covers or exposes an underlying bug.
I had this problem too. I changed notify from kqueue to poll, and then the problem was solved (for all connection types: TLS, SSL, and plain).
If you want to use poll, you need to pass "--with-notify=pool" to the configure script.
On my system (OpenBSD), if I do not use the "--with-notify" flag, dovecot uses kqueue for notify and poll for ioloop by default.
shell$> dovecot --build-options
Build options: ioloop=poll ipv6 openssl
SQL drivers: mysql
Passdb: bsdauth checkpassword passwd passwd-file sql
Userdb: checkpassword passwd prefetch passwd-file sql static
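Since so much of this thread turns on which ioloop a given binary was built with, it may help to check that mechanically across servers. Below is a rough sketch that parses text shaped like the `dovecot --build-options` output quoted above. The function names are hypothetical, and the four section labels are taken only from that quoted output; other dovecot versions may print different sections. Note that the output shown lists the ioloop method but not the notify method, so the latter cannot be read from it.

```python
def parse_build_options(output):
    """Parse `dovecot --build-options` style text into {section: [values]}.

    The labels below are the ones seen in the output quoted in this thread;
    treat them as an assumption rather than a complete list.
    """
    labels = ("Build options", "SQL drivers", "Passdb", "Userdb")
    # Locate each label so parsing works whether the sections are on
    # separate lines or have been run together on a single line.
    positions = sorted(
        (output.find(label + ":"), label)
        for label in labels
        if (label + ":") in output
    )
    sections = {}
    for i, (start, label) in enumerate(positions):
        end = positions[i + 1][0] if i + 1 < len(positions) else len(output)
        body = output[start + len(label) + 1:end]
        sections[label] = body.split()
    return sections


def ioloop_method(output):
    """Return the ioloop value (e.g. 'poll'), or None if it is not listed."""
    for option in parse_build_options(output).get("Build options", []):
        if option.startswith("ioloop="):
            return option.split("=", 1)[1]
    return None
```

For the output above, `ioloop_method` would report `poll`, which matches the poster's description of the OpenBSD default.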
Hi Timo,
I have the same problem using rc2 and TLS, compiled with "poll" instead of kqueue.
With only one tbird client, I ended up with 100 processes (ps ax|grep -c imap) after a day or two.
kill `cat /var/run/dovecot/master.pid`
would not kill those processes. I have to do a killall imap.
b8 does not have this problem.
webbie wrote:
I have the same problem using rc2 and TLS, compiled with "poll" instead of kqueue.
With only one tbird client, I ended up with 100 processes (ps ax|grep -c imap) after a day or two.
kill `cat /var/run/dovecot/master.pid`
would not kill those processes. I have to do a killall imap.
b8 does not have this problem.
Just so here as well, but with kqueue, which again suggests that the issue is not really kqueue.
One way forward may be to consider what changed between b8 and rc2.
kqueue is BSD-specific feature, so you can't use it with Linux.
Ah, I see now.
But I'd like to know what exactly does "stale processes" mean: Do you use SSL and they're all SSL connections?
Yes, they could all be SSL connections. Many certainly are, not sure if -all- of the stale ones are yet.
Are they eating any CPU at all?
top reports 0% CPU.
If you strace them, does it show if they're doing anything?
Just polling...
gettimeofday({1154640050, 545868}, {300, 0}) = 0
gettimeofday({1154640050, 546099}, NULL) = 0
poll([{fd=6, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=2, events=POLLERR|POLLHUP|POLLNVAL}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=0, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}], 4, 0) = 0
gettimeofday({1154640050, 546512}, {300, 0}) = 0
gettimeofday({1154640050, 546725}, NULL) = 0
poll([{fd=6, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=2, events=POLLERR|POLLHUP|POLLNVAL}, {fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}, {fd=0, events=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL}], 4, 531) = 0
...and so on.
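For anyone comparing traces from several suspect processes, the pattern above (poll() returning 0 over and over) can be checked mechanically. This is a small sketch, not a diagnostic tool: it assumes strace text in the format shown, and the function names are made up for illustration. A process that only ever times out in poll() is consistent with "idle", but that alone does not prove the client is gone.

```python
import re

# Matches one poll() call as printed by strace: the fd list, the fd count,
# the timeout in milliseconds, and the return value.
POLL_RE = re.compile(
    r"poll\(\[(?P<fds>.*?)\],\s*\d+,\s*(?P<timeout>-?\d+)\)\s*=\s*(?P<ret>-?\d+)"
)
FD_RE = re.compile(r"fd=(\d+)")


def summarize_polls(strace_text):
    """Return one dict per poll() call: watched fds, timeout (ms), return value."""
    calls = []
    for match in POLL_RE.finditer(strace_text):
        calls.append({
            "fds": [int(fd) for fd in FD_RE.findall(match.group("fds"))],
            "timeout_ms": int(match.group("timeout")),
            "ready": int(match.group("ret")),
        })
    return calls


def looks_idle(strace_text):
    """True if the trace has poll() calls and none reported a ready descriptor."""
    calls = summarize_polls(strace_text)
    return bool(calls) and all(call["ready"] == 0 for call in calls)
```

Run against the trace above, this would show two poll() calls watching descriptors 6, 2, 5 and 0, with timeouts of 0 and 531 ms, and neither reporting any activity.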
How long are they stuck, or don't they ever die?
Until I kill them manually.
How exactly do you know they are stale?
Because the clients have been closed and the user owning the process does not have any mail applications open.
Are you sure there aren't any clients that still have active connections to it?
Pretty sure - of course it could be the client shutting down uncleanly. Mostly 1.5.0.x of Thunderbird and squirrelmail-1.4.6-5.fc5 with an occasional pine user.
The symptom for me is that either the user's IMAP operation times out or the user just sees a slowdown. The resolution is that once I kill the old processes owned by that user, their performance immediately improves. A typical example is a user trying to delete a message from the INBOX (i.e. move a message from INBOX to Trash in our Maildir environment). Thunderbird will just hang on it, or take 1 minute+ to move the message. If I kill the old imap processes, the move happens very quickly (~1 second) if the user tries again immediately thereafter.
-Fran
I have been seeing the same thing for some time. Seems not client specific, I see it with Evolution, Thunderbird, and Netscape 7.2 mail clients. I thought it was our firewall but users behind the firewall on the same subnet as the dovecot server have the same problem. Still working on the migration to Dovecot 1, rc5 but have been seeing this in RH stock rpm dovecot on FC3 and CentOS 4.3.
But I'd like to know what exactly does "stale processes" mean: Do you use SSL and they're all SSL connections?
Mine are not SSL. I get as many as 800 IMAP processes (on about 1000 users), this dies down overnight, when no one is checking their mail.
Are they eating any CPU at all?
Nope.
How long are they stuck, or don't they ever die?
They will die overnight, as no one checks mail after 5 PM. I set a process limit at about 1800 just to be safe.
How exactly do you know they are stale?
Because the clients have been closed and the user owning the process does not have any mail applications open.
Are you sure there aren't any clients that still have active connections to it?
Pretty sure - of course it could be the client shutting down uncleanly. Mostly 1.5.0.x of Thunderbird and squirrelmail-1.4.6-5.fc5 with an occasional pine user.
Same here. If I check my mail w/Evolution (and get no new mail) this process stays around for hours.
Sometimes the excessive processes do seem to cause problems (users cannot pop/imap mail; connections auth but hang). I stop dovecot and whack imap with killall several times, then restart dovecot. This fixes the user problems, but within a few minutes I have 300-400 imap processes, and this grows over the day to about 800-900.
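To see which accounts are driving numbers like these, and who is getting close to a per-user process limit, the imap processes can be tallied by owner. A minimal sketch, assuming two-column output in the style of `ps -eo user,comm`; the function names, the limit passed in, and the 80% warning fraction are all hypothetical and should be adjusted to the local configuration.

```python
from collections import Counter


def imap_counts_by_user(ps_output):
    """Count imap processes per user from `ps -eo user,comm` style output."""
    counts = Counter()
    for line in ps_output.strip().splitlines():
        fields = line.split()
        # Only exact "imap" matches count; imap-login and others are skipped.
        if len(fields) >= 2 and fields[1] == "imap":
            counts[fields[0]] += 1
    return counts


def users_near_limit(counts, per_user_limit, fraction=0.8):
    """Return users whose imap process count is at or above fraction * limit."""
    threshold = per_user_limit * fraction
    return sorted(user for user, n in counts.items() if n >= threshold)
```

Summing the counter's values gives the server-wide total, which can be compared against an overall process limit in the same way.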
-- James H. Edwards Network Systems Administrator Judicial Information Division jedwards@nmcourts.com
On 01.08.2006 at 13:34 -0500, Fran Fabrizio wrote:
Anyone have any theories or experiences as to why for a certain user, some of his imap processes hang, and stay around for days and days? He just reported that his Thunderbird was timing out when trying to open folders, I looked on the server, and he had 3 old imap processes from 1 week+ ago. I killed those and his performance immediately improved.
Same here with NetBSD/i386 2.x.
For users that have "max. cached connections" set to 1, this means effectively that they are locked out until an admin 'kill -9's the relevant imap process. Seems to be a lot better with the 1.0beta8 that I downgraded to.
hauke
-- /~\ The ASCII Ribbon Campaign Hauke Fath \ / No HTML/RTF in email Institut für Nachrichtentechnik X No Word docs in email TU Darmstadt / \ Respect for open standards Ruf +49-6151-16-3281
participants (11)
- Brad
- Dillon
- Fran Fabrizio
- Hauke Fath
- james edwards
- John Wong
- Pete Slagle
- Pete Slagle
- Renaud Allard
- Timo Sirainen
- webbie