Dear All,
I constantly see 'dovecot-auth', 'login-imap', 'login-pop3' and 'imap' processes consuming 100% CPU. I ran 'ps' and found that those hung processes had been running for a long time. I am wondering whether dovecot.conf has configuration parameters that time out those processes (or kill them after a certain period of time). I am using dovecot 1.2.11 on RHEL 5.3.
Many thanks ~~~
Yours Sincerely, Jacky Chan
On Tue, Apr 06, 2010 at 12:13:02PM +0800, JackyC@umac.mo wrote:
> I constantly see 'dovecot-auth', 'login-imap', 'login-pop3' and 'imap' processes consuming 100% CPU. I ran 'ps' and found that those hung processes had been running for a long time.
Have you tried attaching strace to them?
strace -p <pid>
will show you what system calls they are making, if any. That can be useful to determine how they're stuck.
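A minimal sketch of how one might choose which PID to attach to, assuming a procps-style `ps` that accepts `-eo pid,pcpu,etime,args`. The helper name `busy_pids` and the 90% threshold are illustrative and do not come from this thread. Note that the `pcpu` column reported by `ps` is an average over the lifetime of the process, so a process that has only recently started spinning may still show a low value.

```shell
# busy_pids THRESHOLD: read "PID %CPU ELAPSED COMMAND..." lines on stdin and
# print the lines whose %CPU column is at or above THRESHOLD.
# The header line is skipped because "%CPU" evaluates to 0 as a number.
busy_pids() {
    awk -v min="$1" '$2 + 0 >= min + 0 { print }'
}

# Typical use (not run here): list mail-related processes above 90% CPU,
# then attach strace to one of the printed PIDs.
#   ps -eo pid,pcpu,etime,args | grep -E 'dovecot|imap|pop3' | busy_pids 90
#   strace -p <pid>
```

The elapsed-time column helps confirm that a listed process really has been running for a long time before attaching to it.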
On 2010-04-06 12:13 AM, JackyC@umac.mo wrote:
> I am wondering whether dovecot.conf has configuration parameters that time out those processes (or kill them after a certain period of time). I am using dovecot 1.2.11 on RHEL 5.3.
Always provide output of dovecot -n whenever config is a potential issue...
--
Best regards,
Charles
Hi Charles,
Please find the output of "dovecot -n" below. I'm just wondering whether I can set a default timeout value for those executables ...
# 1.2.11: /usr/local/dovecot-1.2.11/etc/dovecot.conf
# OS: Linux 2.6.18-128.2.1.el5PAE i686 Red Hat Enterprise Linux Server release 5.3 (Tikanga)
log_path: /var/log/dovecot.log
protocols: imap pop3
ssl: no
disable_plaintext_auth: no
login_dir: /usr/local/dovecot-1.2.11/var/run/dovecot/login
login_executable(default): /usr/local/dovecot-1.2.11/libexec/dovecot/imap-login
login_executable(imap): /usr/local/dovecot-1.2.11/libexec/dovecot/imap-login
login_executable(pop3): /usr/local/dovecot-1.2.11/libexec/dovecot/pop3-login
mail_location: mbox:~/mail:INBOX=/home/mail/%u
mail_debug: yes
mail_executable(default): /usr/local/dovecot-1.2.11/libexec/dovecot/imap
mail_executable(imap): /usr/local/dovecot-1.2.11/libexec/dovecot/imap
mail_executable(pop3): /usr/local/dovecot-1.2.11/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/local/dovecot-1.2.11/lib/dovecot/imap
mail_plugin_dir(imap): /usr/local/dovecot-1.2.11/lib/dovecot/imap
mail_plugin_dir(pop3): /usr/local/dovecot-1.2.11/lib/dovecot/pop3
auth default:
  verbose: yes
  debug: yes
  passdb:
    driver: pam
  userdb:
    driver: passwd
Yours Sincerely, Jacky Chan
> From: Charles Marcus CMarcus@Media-Brokers.com
> To: dovecot@dovecot.org
> Date: 06/04/2010 06:57 PM
> Subject: Re: [Dovecot] Timeout Value
> Sent by: dovecot-bounces+jackyc=umac.mo@dovecot.org
On Tue, 2010-04-06 at 12:13 +0800, JackyC@umac.mo wrote:
> Dear All,
> I constantly see 'dovecot-auth', 'login-imap', 'login-pop3' and 'imap' processes consuming 100% CPU. I ran 'ps' and found that those hung processes had been running for a long time. I am wondering whether dovecot.conf has configuration parameters that time out those processes (or kill them after a certain period of time). I am using dovecot 1.2.11 on RHEL 5.3.
There is no timeout value, because those processes just shouldn't be eating 100% CPU. Strange that all of your processes are eating 100% CPU. You do mean all dovecot processes eat 100% CPU, right? And it happens immediately after they start up? I guess there's something really wrong in your installation. A few things that might show something useful:
- strace -tt -p <pid> output for 10 lines or so.
- dovecot --build-options output
Hi Timo,
> You do mean all dovecot processes eat 100% CPU, right? And it happens immediately after they start up?
Not all. For example, we have 2 "dovecot-auth -w" processes. Suddenly one of them will start eating 100% CPU. After a certain period of time (short or long, not constant), another will follow. They do not eat 100% CPU when they start up.
> - strace -tt -p <pid> output for 10 lines or so.
Once I attach 'strace' to a "dovecot-auth -w" process, it immediately eats 100% CPU. The following are the last few lines of the strace output. I do have an LDAP backend for authentication.
10:43:34.726512 time(NULL) = 1270608214
10:43:34.726561 poll([{fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=-1}, {fd=-1}, {fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 4, 10000) = 1 ([{fd=12, revents=POLLIN}])
10:43:34.728355 read(12, "0\204\0\0\0\21\2\2", 8) = 8
10:43:34.728421 read(12, "\30pa\204\0\0\0\7\n\1\0\4\0\4\0", 15) = 15
10:43:34.728491 time(NULL) = 1270608214
10:43:34.728542 time(NULL) = 1270608214
10:43:34.728594 write(12, "0\201\376\2\2\30oc\201\367\4'CN=Configuration,DC="..., 257) = 257
10:43:34.728673 time(NULL) = 1270608214
10:43:34.728723 poll([{fd=5, events=POLLIN|POLLPRI|POLLERR|POLLHUP}, {fd=-1}, {fd=-1}, {fd=12, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 4, 120000) = 1 ([{fd=12, revents=POLLIN}])
10:43:34.729401 read(12, "0\204\0\0\0R\2\2", 8) = 8
10:43:34.729465 read(12, "\30os\204\0\0\0H\4Fldap://xxx.xxx.xxx/C"..., 80) = 80
> - dovecot --build-options output
Build options: ioloop=epoll notify=inotify ipv6 openssl
Mail storages: cydir dbox maildir mbox raw shared
SQL drivers:
Passdb: checkpassword pam passwd passwd-file shadow
Userdb: nss passwd passwd-file prefetch static
Yours Sincerely, Jacky Chan
CHAN Hoi Kei, Jacky Technical and User Support Section Information and Communication Technology Office University of Macau DL: (853) 8397 8629 | FAX: (853) 2883 5606
> From: Timo Sirainen tss@iki.fi
> To: JackyC@umac.mo
> Cc: dovecot@dovecot.org
> Date: 07/04/2010 10:36 AM
> Subject: Re: [Dovecot] Timeout Value
On Wed, 2010-04-07 at 10:56 +0800, JackyC@umac.mo wrote:
> Hi Timo,
> > You do mean all dovecot processes eat 100% CPU, right? And it happens immediately after they start up?
> Not all. For example, we have 2 "dovecot-auth -w" processes. Suddenly one of them will start eating 100% CPU. After a certain period of time (short or long, not constant), another will follow. They do not eat 100% CPU when they start up.
What about other processes than dovecot-auth -w? When do they start to eat 100% CPU?
> 10:43:34.728594 write(12, "0\201\376\2\2\30oc\201\367\4'CN=Configuration,DC="..., 257) = 257
At least for dovecot-auth -w it looks like it's doing an LDAP lookup (or connect) in here. Maybe it keeps rapidly reconnecting to it all the time?.. Anything in Dovecot's error logs? http://wiki.dovecot.org/Logging
But you said other processes also eat 100% CPU, and LDAP doesn't explain that.
Hi Timo,
> What about other processes than dovecot-auth -w? When do they start to eat 100% CPU?
Not certain. Sometimes it happens when the system starts doing backups, and it affects processes such as 'imap', 'procmail' and even Postfix's 'proxymap' and 'local'. These processes also depend on LDAP lookups. Sometimes when both 'dovecot-auth -w' processes eat 100% CPU, Dovecot dies, and sometimes it does not.
> At least for dovecot-auth -w it looks like it's doing an LDAP lookup (or connect) in here. Maybe it keeps rapidly reconnecting to it all the time?.. Anything in Dovecot's error logs?
I don't think we have a large number of connections to our LDAP server, since we use NSCD to cache LDAP lookup results on the local server. I am not sure LDAP lookups can explain it, but 'dovecot-auth -w' does eat 100% CPU while doing an LDAP lookup in the last lines of the strace output.
I have been encountering this problem for a month, and I upgraded dovecot from 1.2.1 to 1.2.11 because of it. I suspect there may be a hardware problem, such as a disk. Could a clean reboot help?
Yours Sincerely, Jacky Chan
> From: Timo Sirainen tss@iki.fi
> To: JackyC@umac.mo
> Cc: dovecot@dovecot.org
> Date: 07/04/2010 11:03 AM
> Subject: Re: [Dovecot] Timeout Value
On Wed, 2010-04-07 at 11:20 +0800, JackyC@umac.mo wrote:
> > At least for dovecot-auth -w it looks like it's doing an LDAP lookup (or connect) in here. Maybe it keeps rapidly reconnecting to it all the time?.. Anything in Dovecot's error logs?
> I don't think we have a large number of connections to our LDAP server, since we use NSCD to cache LDAP lookup results on the local server. I am not sure LDAP lookups can explain it, but 'dovecot-auth -w' does eat 100% CPU while doing an LDAP lookup in the last lines of the strace output.
Oh, right, you're using NSS lookups instead of Dovecot's direct LDAP lookups. And the strace output shows it's calling poll(), but Dovecot was compiled to use epoll(). That means the problem is with your NSS configuration/libraries/whatever, nothing to do with Dovecot really.
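The reasoning above (a Dovecot built with ioloop=epoll would normally be seen waiting in epoll_wait(), whereas the posted trace shows poll()) can be checked on a longer capture by tallying which system calls appear. The following is a rough sketch that assumes `strace -tt` output with one call per line in the form "timestamp name(args) = result"; the function name `count_syscalls` is made up for illustration.

```shell
# count_syscalls: read `strace -tt` lines on stdin and print "count name"
# for each system call, most frequent first (ties sorted by name).
count_syscalls() {
    awk '{
        name = $2
        sub(/\(.*/, "", name)   # keep only the text before the first "("
        if (name != "") n[name]++
    }
    END { for (s in n) print n[s], s }' | sort -k1,1nr -k2,2
}

# Typical use (not run here), on a trace saved with `strace -tt -o trace.txt -p <pid>`:
#   count_syscalls < trace.txt
```

A trace dominated by poll() and by read()/write() on the LDAP socket, with no epoll_wait() at all, would be consistent with the time being spent inside the NSS or PAM LDAP library rather than in Dovecot's own event loop.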
> Oh, right, you're using NSS lookups instead of Dovecot's direct LDAP lookups. And the strace output shows it's calling poll(), but Dovecot was compiled to use epoll(). That means the problem is with your NSS configuration/libraries/whatever, nothing to do with Dovecot really.
How about configuring Dovecot to do LDAP lookups directly, to isolate the problem? May I ask what the difference is between poll() and epoll()?
Yours Sincerely, Jacky Chan
> From: Timo Sirainen tss@iki.fi
> To: JackyC@umac.mo
> Cc: dovecot@dovecot.org
> Date: 07/04/2010 11:28 AM
> Subject: Re: [Dovecot] Timeout Value
> Sent by: dovecot-bounces+jackyc=umac.mo@dovecot.org
On Wed, 2010-04-07 at 11:48 +0800, JackyC@umac.mo wrote:
> > Oh, right, you're using NSS lookups instead of Dovecot's direct LDAP lookups. And the strace output shows it's calling poll(), but Dovecot was compiled to use epoll(). That means the problem is with your NSS configuration/libraries/whatever, nothing to do with Dovecot really.
> How about configuring Dovecot to do LDAP lookups directly, to isolate the problem?
That would probably help.
> May I ask what the difference is between poll() and epoll()?
Nothing important. What I meant by that above is that strace shows that the code that's running and eating 100% CPU is either pam_ldap or nss_ldap, not Dovecot.
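Before switching to direct LDAP lookups, it may be worth checking that the installed binary was built with LDAP support at all: the build options posted earlier in this thread list no ldap entry under Passdb or Userdb, so a rebuild with LDAP support would presumably be needed first. A small sketch of that check follows, assuming `dovecot --build-options` prints each category on its own line beginning with its label; the helper name `has_ldap_passdb` is hypothetical.

```shell
# has_ldap_passdb: read `dovecot --build-options` output on stdin and
# succeed (exit status 0) only if the "Passdb:" line lists "ldap".
has_ldap_passdb() {
    awk '/^Passdb:/ { for (i = 2; i <= NF; i++) if ($i == "ldap") found = 1 }
         END { exit (found ? 0 : 1) }'
}

# Typical use (not run here):
#   if dovecot --build-options | has_ldap_passdb; then
#       echo "LDAP passdb available"
#   else
#       echo "no LDAP passdb in this build"
#   fi
```

The comparison is made on whole words, so an entry such as passwd-file is never mistaken for a match.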
Hi Timo,
Thank you very much for your help!! I will try Dovecot's direct LDAP lookups and check whether there are any patches for the nss_ldap package, since nss_ldap may be the root cause.
Yours Sincerely, Jacky Chan
> From: Timo Sirainen tss@iki.fi
> To: JackyC@umac.mo
> Cc: Dovecot Mailing List dovecot@dovecot.org
> Date: 07/04/2010 11:51 AM
> Subject: Re: [Dovecot] Timeout Value
participants (4)
- Brian Candler
- Charles Marcus
- JackyC@umac.mo
- Timo Sirainen