Re: [Dovecot] pop3_lock_session question
Rob Mangiafico <rmang@lexiconn.com> wrote:
On Wed, 4 Feb 2009, Timo Sirainen wrote:
On Wed, 2009-02-04 at 11:17 -0700, Mark Costlow wrote:
Hello, I'm preparing to convert from qpopper + UW-IMAP to dovecot. So far testing has gone very well. One problem we haven't figured out is that long-running POP sessions keep the mailbox locked, so that the MDA times out while trying to deliver. ..
Switch to Maildir and the problem goes away.
We see this as well with mbox and pop3 access, where some pop3 clients (iPhones are the worst offenders) do not log out for what seems to be 30-90 minutes. Timeout settings in dovecot.conf do not seem to help. procmail backs up waiting for access to the inbox to deliver mail.
We are having the same problem. It's not a problem with imap, only with pop3. We're using deliver as the MDA. We're also using flock() locking for the mbox: is that what everyone else is using as well?
Migrating our servers to maildir globally is also not an option for us. In some cases, even convincing customers to use IMAP isn't going to work.
One problem which might be making this worse than it needs to be, is the fact that mbox_lock_flock in mbox-lock.c is not using a blocking flock(); instead, it's polling for a non-blocking lock. This technique can cause lock starvation, if another process is dropping the lock and picking it back up again frequently: other processes will only see the lock as being available if they happen to poll for the lock at just the right instant.
A better technique to use here, if it's adequately cross-platform, would be to set an alarm() for the max_wait_time, and use a blocking flock(). If the alarm times out and you don't have a lock, it's a timeout. In the meantime, you're guaranteed to eventually get the lock, if it is dropped.
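As a rough illustration of the alarm-plus-blocking-flock() idea, here is a minimal Python sketch. It is not Dovecot's C code from mbox-lock.c; the names flock_with_timeout, LockTimeout and demo are made up for this example, and it assumes a Unix system with the code running in the main thread (signal handlers can only be installed there).

```python
import fcntl
import os
import signal
import tempfile


class LockTimeout(Exception):
    """Raised by the SIGALRM handler to abort a blocking flock()."""


def _on_alarm(signum, frame):
    # Raising from the handler stops Python from silently retrying
    # the interrupted flock() call.
    raise LockTimeout()


def flock_with_timeout(fd, max_wait_secs):
    """Take an exclusive flock() on fd, waiting at most max_wait_secs.

    Returns True if the lock was obtained, False on timeout.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(max_wait_secs)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until granted or alarm
        return True
    except LockTimeout:
        return False
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)


def demo():
    """Hold the lock on one descriptor, then try to take it on another."""
    with tempfile.NamedTemporaryFile() as tmp:
        holder = os.open(tmp.name, os.O_RDWR)
        waiter = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.flock(holder, fcntl.LOCK_EX)
            while_held = flock_with_timeout(waiter, 1)
            fcntl.flock(holder, fcntl.LOCK_UN)
            after_release = flock_with_timeout(waiter, 1)
            return while_held, after_release
        finally:
            os.close(holder)
            os.close(waiter)
```

Unlike polling with a non-blocking lock, the waiter here sleeps inside flock() and is woken by the kernel when the lock is dropped. A production version would also need to handle the alarm firing just after the lock has been granted.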
That said: I'm not sure whether this will solve our problem in practice.
Why doesn't this happen with imap? Why can't we make pop3 do what imap does? Even if it's inefficient, it's better than hanging all incoming mail delivery while deliver eats up our local concurrency limits.
Thanks!
Alan Ferrency pair Networks, Inc. alan@pair.com
On Wed, 2009-02-04 at 14:51 -0500, Alan Ferrency wrote:
One problem which might be making this worse than it needs to be, is the fact that mbox_lock_flock in mbox-lock.c is not using a blocking flock(); instead, it's polling for a non-blocking lock. This technique can cause lock starvation, if another process is dropping the lock and picking it back up again frequently: other processes will only see the lock as being available if they happen to poll for the lock at just the right instant.
A better technique to use here, if it's adequately cross-platform, would be to set an alarm() for the max_wait_time, and use a blocking flock(). If the alarm times out and you don't have a lock, it's a timeout. In the meantime, you're guaranteed to eventually get the lock, if it is dropped.
That's what Dovecot does elsewhere. I don't really know why I'm using non-blocking flock() calls. I guess I should fix that.
That said: I'm not sure whether this will solve our problem in practice.
Probably not.
Why doesn't this happen with imap? Why can't we make pop3 do what imap does? Even if it's inefficient, it's better than hanging all incoming mail delivery while deliver eats up our local concurrency limits.
IMAP unlocks mbox after each command is done. But POP3 clients typically just run RETR, RETR, RETR, .. so unlocking + locking again later is just extra work that slows things down. I guess there could be a timeout that if no RETR has been run for a few seconds it unlocks the mailbox.
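The "unlock if no RETR has been run for a few seconds" idea can be pictured with a small toy model. This is only a sketch of the concept in Python, not the code that went into Dovecot; the class name, the 5-second default and the injectable clock are assumptions made for illustration, and the lock itself is just a boolean.

```python
import time


class Pop3MboxSession:
    """Toy model of a POP3 session that drops its mbox lock when idle."""

    def __init__(self, idle_unlock_secs=5.0, clock=time.monotonic):
        self.idle_unlock_secs = idle_unlock_secs  # hypothetical default
        self.clock = clock
        self.locked = False
        self.last_retr = None

    def retr(self, msg_num):
        # Re-lock lazily: back-to-back RETRs reuse the lock they already
        # hold, so the common case pays no extra locking cost.
        if not self.locked:
            self.locked = True  # stand-in for taking the real mbox lock
        self.last_retr = self.clock()
        return f"+OK message {msg_num} follows"

    def on_timer(self):
        """Called periodically; unlock if no RETR was seen recently."""
        idle = self.locked and (
            self.clock() - self.last_retr >= self.idle_unlock_secs
        )
        if idle:
            self.locked = False  # stand-in for releasing the mbox lock
        return self.locked
```

With this shape, a client that sits connected without sending commands stops blocking the MDA after the idle period, while a client that streams RETR commands never triggers an unlock.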
But I've never before heard POP3 clients behaving that way, so I'd like to know what exactly are they doing. Are they not sending anything? Are they NOOPing? I don't see any reason for them to be doing either..
On Wed, 4 Feb 2009, Timo Sirainen wrote:
Why doesn't this happen with imap? Why can't we make pop3 do what imap does? Even if it's inefficient, it's better than hanging all incoming mail delivery while deliver eats up our local concurrency limits.
IMAP unlocks mbox after each command is done. But POP3 clients typically just run RETR, RETR, RETR, .. so unlocking + locking again later is just extra work that slows things down. I guess there could be a timeout that if no RETR has been run for a few seconds it unlocks the mailbox.
But I've never before heard POP3 clients behaving that way, so I'd like to know what exactly are they doing. Are they not sending anything? Are they NOOPing? I don't see any reason for them to be doing either..
In the cases I've looked into, the client seems to be connected and not doing anything. I don't have access to the clients or end users, but ktrace on the pop3 process basically is producing no output or very little output over an extended period.
Could it be an interactive client which maintains an open pop connection, even when no one is actively doing anything with it?
The "unlock after a few seconds" option would be great.
Do you have any documentation or hints on how to identify or debug connecting pop clients without involving the end user?
Thanks, Alan Ferrency pair Networks, Inc. alan@pair.com
Why doesn't this happen with imap? Why can't we make pop3 do what imap does? Even if it's inefficient, it's better than hanging all incoming mail delivery while deliver eats up our local concurrency limits.
IMAP unlocks mbox after each command is done. But POP3 clients typically just run RETR, RETR, RETR, .. so unlocking + locking again later is just extra work that slows things down. I guess there could be a timeout that if no RETR has been run for a few seconds it unlocks the mailbox.
But I've never before heard POP3 clients behaving that way, so I'd like to know what exactly are they doing. Are they not sending anything? Are they NOOPing? I don't see any reason for them to be doing either..
In the cases I've looked into, the client seems to be connected and not doing anything. I don't have access to the clients or end users, but ktrace on the pop3 process basically is producing no output or very little output over an extended period.
Could it be an interactive client which maintains an open pop connection, even when no one is actively doing anything with it?
The "unlock after a few seconds" option would be great.
Do you have any documentation or hints on how to identify or debug connecting pop clients without involving the end user?
Could it be some (older?) webmail clients that use pop3 instead of imap?
But I've never before heard POP3 clients behaving that way, so I'd like to know what exactly are they doing. Are they not sending anything? Are they NOOPing? I don't see any reason for them to be doing either..
In the cases I've looked into, the client seems to be connected and not doing anything. I don't have access to the clients or end users, but ktrace on the pop3 process basically is producing no output or very little output over an extended period.
Could it be an interactive client which maintains an open pop connection, even when no one is actively doing anything with it?
The "unlock after a few seconds" option would be great.
Do you have any documentation or hints on how to identify or debug connecting pop clients without involving the end user?
Could it be some (older?) webmail clients that use pop3 instead of imap?
I wouldn't expect a webmail client to hold a pop3 connection open across multiple web requests. We have standard webmail clients available for customer use, but they use IMAP. With the frequency we're seeing this problem, I'd expect it's more likely to be something newer or more commonly used.
Alan Ferrency pair Networks, Inc. alan@pair.com
Could it be some (older?) webmail clients that use pop3 instead of imap?
I wouldn't expect a webmail client to hold a pop3 connection open across multiple web requests. We have standard webmail clients available for customer use, but they use IMAP. With the frequency we're seeing this problem, I'd expect it's more likely to be something newer or more commonly used.
Presuming you've been able to identify which users this is affecting I would suspect you could go back to those users and determine what clients they are connecting with and then interested parties (dovecot devs?) could perform further investigation in a lab or whatever to determine what is going on. Maybe the client(s) is/are just whacky or there is a bug somewhere.
You can also track down the source IP addresses which may give you an idea as to the client as well. If it is a RIM subnet then you may be able to assume it's a blackberry. If the PTR record for the IP is webmail.somecompany.com then you can probably contact the company and discuss with them. Etc. Some companies may have a proxy or something that is attempting to hold the connections open for faster response times maybe geared for slow link connections. People do a lot of "interesting" things from time to time.
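A sketch of the source-IP approach in Python, assuming you have already pulled the client addresses out of your pop3 login logs. The network labels and ranges below are placeholders (RFC 5737 documentation networks), not real carrier or webmail subnets, and the function name is made up for this example.

```python
import ipaddress

# Hypothetical mapping from a label to a network you have identified.
# Replace these documentation ranges with subnets you actually know.
KNOWN_NETWORKS = {
    "example-mobile-carrier": ipaddress.ip_network("192.0.2.0/24"),
    "example-webmail-proxy": ipaddress.ip_network("198.51.100.0/24"),
}


def guess_client_origin(ip):
    """Return the label of the first known network containing ip."""
    addr = ipaddress.ip_address(ip)
    for label, net in KNOWN_NETWORKS.items():
        if addr in net:
            return label
    return "unknown"
```

For addresses that fall outside any known range, a reverse lookup with socket.gethostbyaddr() returns the PTR name when one exists, which may hint at the operator.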
I also didn't see you mention it but presumably you are on a relatively recent version of Dovecot considering you are examining the source code.
Justin Krejci wrote:
Could it be some (older?) webmail clients that use pop3 instead of imap? I wouldn't expect a webmail client to hold a pop3 connection open across multiple web requests. We have standard webmail clients available for customer use, but they use IMAP. With the frequency we're seeing this problem, I'd expect it's more likely to be something newer or more commonly used.
Presuming you've been able to identify which users this is affecting I would suspect you could go back to those users and determine what clients they are connecting with and then interested parties (dovecot devs?) could perform further investigation in a lab or whatever to determine what is going on. Maybe the client(s) is/are just whacky or there is a bug somewhere.
You can also track down the source IP addresses which may give you an idea as to the client as well. If it is a RIM subnet then you may be able to assume it's a blackberry. If the PTR record for the IP is webmail.somecompany.com then you can probably contact the company and discuss with them. Etc. Some companies may have a proxy or something that is attempting to hold the connections open for faster response times maybe geared for slow link connections. People do a lot of "interesting" things from time to time.
Like imapproxy which holds the connection for subsequent requests to avoid the short-lived HTTP connection issue. I certainly recommend it for squirrelmail installations since squirrelmail can't IDLE the connection.
~Seth
On Wed, 2009-02-04 at 13:23 -0800, Seth Mattinen wrote:
Like imapproxy which holds the connection for subsequent requests to avoid the short-lived HTTP connection issue. I certainly recommend it for squirrelmail installations since squirrelmail can't IDLE the connection.
I've heard imapproxy doesn't help all that much with Dovecot. Do you (or anyone) have any actual statistics?
Timo Sirainen wrote:
On Wed, 2009-02-04 at 13:23 -0800, Seth Mattinen wrote:
Like imapproxy which holds the connection for subsequent requests to avoid the short-lived HTTP connection issue. I certainly recommend it for squirrelmail installations since squirrelmail can't IDLE the connection.
I've heard imapproxy doesn't help all that much with Dovecot. Do you (or anyone) have any actual statistics?
It does prevent spawning a separate IMAP process and running it through AUTH (which executes an SQL call in my case) every time a webmail user clicks on multiple things like a rabid squirrel with attention deficit disorder. No hard numbers though.
~Seth
On Wed, 2009-02-04 at 14:55 -0800, Seth Mattinen wrote:
Timo Sirainen wrote:
On Wed, 2009-02-04 at 13:23 -0800, Seth Mattinen wrote:
Like imapproxy which holds the connection for subsequent requests to avoid the short-lived HTTP connection issue. I certainly recommend it for squirrelmail installations since squirrelmail can't IDLE the connection.
I've heard imapproxy doesn't help all that much with Dovecot. Do you (or anyone) have any actual statistics?
It does prevent spawning a separate IMAP process
Yes, but I think it's not all that much extra work.
and running it through AUTH (which executes an SQL call in my case)
Enabling auth cache would avoid it.
Timo Sirainen wrote:
On Wed, 2009-02-04 at 14:55 -0800, Seth Mattinen wrote:
Timo Sirainen wrote:
On Wed, 2009-02-04 at 13:23 -0800, Seth Mattinen wrote:
Like imapproxy which holds the connection for subsequent requests to avoid the short-lived HTTP connection issue. I certainly recommend it for squirrelmail installations since squirrelmail can't IDLE the connection. I've heard imapproxy doesn't help all that much with Dovecot. Do you (or anyone) have any actual statistics?
It does prevent spawning a separate IMAP process
Yes, but I think it's not all that much extra work.
I've never benchmarked it, but I do hate seeing it cycle through connect/auth/do nothing/disconnect because someone is clicking refresh as fast as they possibly can for some unknown reason.
and running it through AUTH (which executes an SQL call in my case)
Enabling auth cache would avoid it.
True, but there's a small risk someone could get locked out of their mail box for a short time, which isn't acceptable in my environment.
~Seth
On Wed, 2009-02-04 at 15:06 -0500, Alan Ferrency wrote:
The "unlock after a few seconds" option would be great.
Implemented for v1.2, probably apply to v1.1 also:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
Thanks!
Do you mean you will probably apply it to 1.1, or only that it will probably also work on the 1.1 branch?
I'll try to get this tested soon.
Alan Ferrency pair Networks, Inc. alan@pair.com
On Wed, 4 Feb 2009, Timo Sirainen wrote:
On Wed, 2009-02-04 at 15:06 -0500, Alan Ferrency wrote:
The "unlock after a few seconds" option would be great.
Implemented for v1.2, probably apply to v1.1 also:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
I mean it will probably work. I'm trying to get v1.1 to a deep feature freeze.
On Wed, 2009-02-04 at 16:41 -0500, Alan Ferrency wrote:
Thanks!
Do you mean you will probably apply it to 1.1, or only that it will probably also work on the 1.1 branch?
I'll try to get this tested soon.
Alan Ferrency pair Networks, Inc. alan@pair.com
On Wed, 4 Feb 2009, Timo Sirainen wrote:
On Wed, 2009-02-04 at 15:06 -0500, Alan Ferrency wrote:
The "unlock after a few seconds" option would be great.
Implemented for v1.2, probably apply to v1.1 also:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
On Wed, 4 Feb 2009, Timo Sirainen wrote:
Implemented for v1.2, probably apply to v1.1 also:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9 I mean it will probably work. I'm trying to get v1.1 to a deep feature freeze.
Do you think this pop3 lock issue could be applied to 1.1? This is the only remaining problem that we have with 1.1 and the mbox format. Everything else is working flawlessly. Thanks for considering it.
Rob
On 2/12/2009, Rob Mangiafico (rmang@lexiconn.com) wrote:
I mean it will probably work. I'm trying to get v1.1 to a deep feature freeze.
Do you think this pop3 lock issue could be applied to 1.1?
I think above he said 'not officially, but that you could apply the patch yourself'.
Best bet would be to upgrade to 1.2 if you want official support for it...
--
Best regards,
Charles
On Thu, 12 Feb 2009, Charles Marcus wrote:
On 2/12/2009, Rob Mangiafico (rmang@lexiconn.com) wrote:
I mean it will probably work. I'm trying to get v1.1 to a deep feature freeze.
Do you think this pop3 lock issue could be applied to 1.1?
I think above he said 'not officially, but that you could apply the patch yourself'.
Best bet would be to upgrade to 1.2 if you want official support for it...
ok, thanks. Has anyone tried patching against 1.1.11? Any patch file for it? We just spent a few months transitioning from uw imap to dovecot 1.1, so we would rather not jump into 1.2 at the moment. Thanks.
Rob
On 2/12/2009, Rob Mangiafico (rmang@lexiconn.com) wrote:
ok, thanks. Has anyone tried patching against 1.1.11? Any patch file for it? We just spent a few months transitioning from uw imap to dovecot 1.1, so we would rather not jump into 1.2 at the moment. Thanks.
I really don't think upgrading from 1.1 to 1.2 will be an issue... not even in the same galaxy as migrating from uw-imap to dovecot. Should be invisible.
--
Best regards,
Charles
On Wed, Feb 04, 2009 at 03:37:38PM -0500, Timo Sirainen wrote:
On Wed, 2009-02-04 at 15:06 -0500, Alan Ferrency wrote:
The "unlock after a few seconds" option would be great.
Implemented for v1.2, probably apply to v1.1 also:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
Are both of these patches needed for the "unlock after a few seconds" feature, or just the 2nd one?
I ask because the description of the 1st one doesn't seem related at first look. Also, the 2nd one applies cleanly to 1.1 but the other one doesn't due to some name changes (at least -- I haven't looked closely at what else might have changed in those files from 1.1 to 1.2).
Thanks,
Mark
Mark Costlow | Southwest Cyberport | Fax: +1-505-232-7975 cheeks@swcp.com | Web: www.swcp.com | Voice: +1-505-232-7992
abq-strange.com -- Interesting photos taken in Albuquerque, NM Last post: Cruising San Mateo I - 2009-01-04 15:20:30
On Feb 4, 2009, at 9:02 PM, Mark Costlow wrote:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
Are both of these patches needed for the "unlock after a few seconds" feature, or just the 2nd one?
I ask because the description of the 1st one doesn't seem related at first look. Also, the 2nd one applies cleanly to 1.1 but the other one doesn't due to some name changes (at least -- I haven't looked closely at what else might have changed in those files from 1.1 to 1.2).
The first one is there so that Dovecot's behavior is correct if RSET
command is given. Although now that I think about it, it should set
\Seen flags for all messages that have already been successfully
RETRed, even if client doesn't issue QUIT afterwards. That's how the
old code behaved. Have to fix that one tomorrow.
Thanks.
More questions about the first patch:
Is it necessary to apply this patch in 1.1, if we are using "pop3_no_flag_updates = yes"? (And, is it compatible with pop3_no_flag_updates in 1.2?)
Updating messages as "seen" was confusing to users who accessed their mail with both POP and IMAP, so we turned it off.
Thanks, Alan Ferrency pair Networks, Inc. alan@pair.com
On Wed, 4 Feb 2009, Timo Sirainen wrote:
On Feb 4, 2009, at 9:02 PM, Mark Costlow wrote:
http://hg.dovecot.org/dovecot-1.2/rev/6f29380ba3a0 http://hg.dovecot.org/dovecot-1.2/rev/ea9a186d64f9
Are both of these patches needed for the "unlock after a few seconds" feature, or just the 2nd one?
I ask because the description of the 1st one doesn't seem related at first look. Also, the 2nd one applies cleanly to 1.1 but the other one doesn't due to some name changes (at least -- I haven't looked closely at what else might have changed in those files from 1.1 to 1.2).
The first one is there so that Dovecot's behavior is correct if RSET command is given. Although now that I think about it, it should set \Seen flags for all messages that have already been successfully RETRed, even if client doesn't issue QUIT afterwards. That's how the old code behaved. Have to fix that one tomorrow.
On Thu, 2009-02-05 at 09:28 -0500, Alan Ferrency wrote:
Thanks.
More questions about the first patch:
Is it necessary to apply this patch in 1.1, if we are using "pop3_no_flag_updates = yes"?
Probably not.
(And, is it compatible with pop3_no_flag_updates in 1.2?)
Yes.
A small update regarding this patch:
We've patched 1.1.8 with the primary pop3 lock timeout change, and it's in use on several hundred FreeBSD servers without any known problems so far.
Thanks for the solution!
Alan Ferrency pair Networks, Inc. alan@pair.com
On Thu, 5 Feb 2009, Timo Sirainen wrote:
On Thu, 2009-02-05 at 09:28 -0500, Alan Ferrency wrote:
Thanks.
More questions about the first patch:
Is it necessary to apply this patch in 1.1, if we are using "pop3_no_flag_updates = yes"?
Probably not.
(And, is it compatible with pop3_no_flag_updates in 1.2?)
Yes.
On Wed, 2009-02-04 at 14:58 -0500, Timo Sirainen wrote:
On Wed, 2009-02-04 at 14:51 -0500, Alan Ferrency wrote:
One problem which might be making this worse than it needs to be, is the fact that mbox_lock_flock in mbox-lock.c is not using a blocking flock(); instead, it's polling for a non-blocking lock. This technique can cause lock starvation, if another process is dropping the lock and picking it back up again frequently: other processes will only see the lock as being available if they happen to poll for the lock at just the right instant.
A better technique to use here, if it's adequately cross-platform, would be to set an alarm() for the max_wait_time, and use a blocking flock(). If the alarm times out and you don't have a lock, it's a timeout. In the meantime, you're guaranteed to eventually get the lock, if it is dropped.
That's what Dovecot does elsewhere. I don't really know why I'm using non-blocking flock() calls.
I think it's because originally Dovecot was ignoring SIGALRMs which also caused alarm()s not to work right. But I stopped doing that years ago.
I guess I should fix that.
Added fix to v1.2 only: http://hg.dovecot.org/dovecot-1.2/rev/8cca2bf6ab76
Timo Sirainen wrote:
On Wed, 2009-02-04 at 14:51 -0500, Alan Ferrency wrote:
One problem which might be making this worse than it needs to be, is the fact that mbox_lock_flock in mbox-lock.c is not using a blocking flock(); instead, it's polling for a non-blocking lock. This technique can cause lock starvation, if another process is dropping the lock and picking it back up again frequently: other processes will only see the lock as being available if they happen to poll for the lock at just the right instant.
A better technique to use here, if it's adequately cross-platform, would be to set an alarm() for the max_wait_time, and use a blocking flock(). If the alarm times out and you don't have a lock, it's a timeout. In the meantime, you're guaranteed to eventually get the lock, if it is dropped.
That's what Dovecot does elsewhere. I don't really know why I'm using non-blocking flock() calls. I guess I should fix that.
That said: I'm not sure whether this will solve our problem in practice.
Probably not.
Why doesn't this happen with imap? Why can't we make pop3 do what imap does? Even if it's inefficient, it's better than hanging all incoming mail delivery while deliver eats up our local concurrency limits.
IMAP unlocks mbox after each command is done. But POP3 clients typically just run RETR, RETR, RETR, .. so unlocking + locking again later is just extra work that slows things down. I guess there could be a timeout that if no RETR has been run for a few seconds it unlocks the mailbox.
But I've never before heard POP3 clients behaving that way, so I'd like to know what exactly are they doing. Are they not sending anything? Are they NOOPing? I don't see any reason for them to be doing either..
We see it (procmail waiting on pop clients doing nothing) when a connection slows - sometimes to a crawl - on rural U.S. phone lines. Dovecot usually disconnects them after 10 minutes if the connection stops, but sometimes that process can drag on for a while. I've watched these using tcpdump. Analog modems can be quite persistent.
Ken
Hello, I'm preparing to convert from qpopper + UW-IMAP to dovecot. So far testing has gone very well. One problem we haven't figured out is that long-running POP sessions keep the mailbox locked, so that the MDA times out while trying to deliver.
We see this as well with mbox and pop3 access, where some pop3 clients (iPhones are the worst offenders) do not log out for what seems to be 30-90 minutes.
I wrote:
We are having the same problem. It's not a problem with imap, only with pop3.
Timo,
Do you think the "idle process holds a lock open forever" problem that you recently patched for pop3 could also affect imap?
I've started to get some reports of customers' local mail delivery being hung up by imap processes now, and not just pop processes as we saw previously.
Thanks, Alan Ferrency pair Networks, Inc. alan@pair.com
On Fri, 2009-02-13 at 14:49 -0500, Alan Ferrency wrote:
Do you think the "idle process holds a lock open forever" problem that you recently patched for pop3 could also affect imap?
It shouldn't. The mailbox is unlocked after each command is finished. But of course if the client sends a command that takes a really long time that could be a problem. I don't think clients usually do that though.
On Fri, 13 Feb 2009, Timo Sirainen wrote:
On Fri, 2009-02-13 at 14:49 -0500, Alan Ferrency wrote:
Do you think the "idle process holds a lock open forever" problem that you recently patched for pop3 could also affect imap?
It shouldn't. The mailbox is unlocked after each command is finished. But of course if the client sends a command that takes a really long time that could be a problem. I don't think clients usually do that though.
In the only case I've looked into deeply, the imap processes all seem to be sitting in this state, idle:
#0 0x18290f0b in kevent () from /lib/libc.so.6
#1 0x080c97fc in io_loop_handler_run (ioloop=0x80f0160) at ioloop-kqueue.c:128
#2 0x080c8e59 in io_loop_run (ioloop=0x80f0160) at ioloop.c:326
#3 0x08065ed0 in main (argc=1, argv=0xbfbfea1c, envp=0xbfbfea24) at main.c:293
I don't see how that could be holding anything up. It feels a bit odd that clients have 30+ separate imap processes open, all sitting in the io loop.
I'll get back to you if I find any more useful information.
Thanks!
Alan Ferrency pair Networks, Inc.
On Fri, 2009-02-13 at 14:59 -0500, Alan Ferrency wrote:
On Fri, 13 Feb 2009, Timo Sirainen wrote:
On Fri, 2009-02-13 at 14:49 -0500, Alan Ferrency wrote:
Do you think the "idle process holds a lock open forever" problem that you recently patched for pop3 could also affect imap?
It shouldn't. The mailbox is unlocked after each command is finished. But of course if the client sends a command that takes a really long time that could be a problem. I don't think clients usually do that though.
In the only case I've looked into deeply, the imap processes all seem to be sitting in this state, idle:
#0 0x18290f0b in kevent () from /lib/libc.so.6
#1 0x080c97fc in io_loop_handler_run (ioloop=0x80f0160) at ioloop-kqueue.c:128
#2 0x080c8e59 in io_loop_run (ioloop=0x80f0160) at ioloop.c:326
#3 0x08065ed0 in main (argc=1, argv=0xbfbfea1c, envp=0xbfbfea24) at main.c:293
Yep, it shouldn't be locked at this state.
I don't see how that could be holding anything up. It feels a bit odd that clients have 30+ separate imap processes open, all sitting in the io loop.
30+ is a bit much. I think most clients max out at 5 by default.
In any case, you could check which locks the process holds and see whether any of them point to the mbox file. On Linux I would do that by grepping /proc/locks, but I have no idea how to do that on the BSDs.
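For the Linux case, the /proc/locks check could be scripted roughly like this. It is a sketch in Python with made-up function names; it matches on inode number only, so a file with the same inode number on a different filesystem would also match, and it ignores entries for processes that are merely waiting for a lock.

```python
import os


def locks_on_inode(proc_locks_text, inode):
    """Return (lock_type, access, pid) for each lock held on the inode."""
    found = []
    for line in proc_locks_text.splitlines():
        fields = line.split()
        # Lines containing "->" describe blocked waiters, not holders.
        if "->" in fields or len(fields) < 6:
            continue
        # Fields: id, type, mode, access, pid, MAJOR:MINOR:INODE, start, end
        _id, lock_type, _mode, access, pid, dev_inode = fields[:6]
        if int(dev_inode.rsplit(":", 1)[1]) == inode:
            found.append((lock_type, access, int(pid)))
    return found


def locks_on_file(path):
    """Linux only: list the locks currently held on path."""
    with open("/proc/locks") as f:
        return locks_on_inode(f.read(), os.stat(path).st_ino)
```

Calling locks_on_file() on the user's mbox path lists the PIDs holding locks on it, which can then be compared with the imap and pop3 process IDs.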
Folks, I welcome your insights into a strange issue that may involve a
memory leak, but certainly involves a repeatable server crash.
I have Dovecot installed as part of a cPanel installation. I think
they're using 1.1.6 rather than the latest version, because they claim
to need to test compatibility with their server software before
deploying these components.
In any case, I set up the latest cPanel on a 64-bit server running two
quad-core AMD Opteron 2352 processors and CentOS 5.2 (64-bit).
When I switch from the default Courier IMAP server to Dovecot, within
3 to 8 hours, IMAP resource usage will spike and ultimately the server
will freeze. Restarting and returning to Courier solves the problem.
The symptom I see in top is that IMAP consumes 100% of memory, and
the normal load (ranging from .05 to 1.50) soars to unheard of
heights, such as 25, 50. Nothing will work on the server, which forces
me to restart.
Our email load is reasonably light, and all the sites on the server
are mine.
I have done fresh installs of cPanel and fresh updates without a
solution. My server admin and cPanel support both appear stumped over
this issue.
While I am hesitant to install a newer version of Dovecot
independently, I wanted to know if you're aware of any such
difficulties. Since I'm back with Courier now, I presume logs aren't
available (if they are let me know). But I never have problems with
Courier, ever. I had Dovecot running fine with cPanel on a 32-bit
Core2Duo server I use strictly for backup.
I appreciate the lightweight footprint of Dovecot and its performance
advantages, but if I can't get it to work on my server, I can't use it.
Now maybe your newer versions solve this. Regardless, please provide
whatever insights you can that might assist in tracking down this issue.
Peace, Gene Steinberg
Hello
Does the debug mode give something understandable ?
On Feb 14, 2009, at 12:24 AM, Frank Bonnet wrote:
Hello
Does the debug mode give something understandable ?
I've been reluctant to try this again, because this is a production
server. But my admin and cPanel support say they can't find any reason
for this.
Peace, Gene
Words by Gene Steinberg [Sat, Feb 14, 2009 at 02:58:06AM -0700]:
On Feb 14, 2009, at 12:24 AM, Frank Bonnet wrote:
Hello
Does the debug mode give something understandable ?
I've been reluctant to try this again, because this is a production
server. But my admin and cPanel support say they can't find any reason
for this.
It's kind of difficult to reach a conclusion about what the problem may be without any further testing or any logs or traces. Do you have a lab where you can test a more recent version on the same hardware?
-- Jose Celestino | http://japc.uncovering.org/files/japc-pgpkey.asc
"One man’s theology is another man’s belly laugh." -- Robert A. Heinlein
I'm at a disadvantage here. I'm more of the end user than admin, but
I cope.
But if someone wanted to assist me in maybe trying this again --
perhaps with the latest stable version -- I'd be interested in another
test.
Peace, Gene
On Feb 14, 2009, at 11:30 AM, Jose Celestino <japc@co.sapo.pt> wrote:
Words by Gene Steinberg [Sat, Feb 14, 2009 at 02:58:06AM -0700]:
On Feb 14, 2009, at 12:24 AM, Frank Bonnet wrote:
Hello
Does the debug mode give something understandable ?
I've been reluctant to try this again, because this is a production server. But my admin and cPanel support say they can't find any reason for this.
It's kind of difficult to reach a conclusion about what the problem may be without any further testing or any logs or traces. Do you have a lab where you can test a more recent version on the same hardware?
-- Jose Celestino | http://japc.uncovering.org/files/japc-pgpkey.asc
"One man’s theology is another man’s belly laugh." -- Robert A. Heinlein
participants (11)
- Alan Ferrency
- Charles Marcus
- Frank Bonnet
- Gene Steinberg
- Jose Celestino
- Justin Krejci
- Ken A
- Mark Costlow
- Rob Mangiafico
- Seth Mattinen
- Timo Sirainen