[Dovecot] dovecot-auth consumes 100% CPU time on Solaris 10

Bart Smaalders barts at smaalders.net
Tue Nov 27 18:48:58 EET 2007


Mark Heitmann wrote:
>> >> Did you compile with Solaris's own LDAP library or with OpenLDAP?
>> > I'm using iPlanet DS and Solaris's LDAP library.
>>
>> People have had different kinds of problems with Solaris LDAP
>> library. You could try if OpenLDAP works better.
> 
> I have the same problem with OpenLDAP as well (compiled with gcc 3.4.3 on
> Solaris 10 x86 (Update 4)). When I build dovecot with LDAP support, the
> dovecot-auth process takes 100% CPU time; without LDAP support, the
> problem doesn't exist.
> 
> The machine is a Sun Fire X2200 M2 with an AMD Opteron processor
> at the current patch level. Does anybody have a solution for this problem
> (pollsys) under Solaris 10?
> 
> Greets,
> Mark
> 

What does truss report as the verbose arguments to pollsys?

e.g.:

# truss -v pollsys -p `pgrep dovecot-auth`

I get something like this:

: root@otter[1]; truss -v pollsys -p `pgrep dovecot-auth`
pollsys(0x08094A48, 14, 0x08047B38, 0x00000000) (sleeping...)
         fd=5  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=7  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=0  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=3  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=9  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=11 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=14 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=10 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=12 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=15 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=13 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=16 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=17 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=18 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         timeout: 1.999000000 sec
pollsys(0x08094A48, 14, 0x08047B38, 0x00000000) = 0
         fd=5  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=7  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=0  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=3  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=9  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=11 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=14 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=10 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=12 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=15 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=13 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=16 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=17 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=18 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         timeout: 1.999000000 sec
pollsys(0x08094A48, 14, 0x08047B38, 0x00000000) = 0
         fd=5  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=7  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=0  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=3  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=9  ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=11 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=14 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=10 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=12 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=15 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=13 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=16 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=17 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         fd=18 ev=POLLIN|POLLPRI|POLLERR|POLLHUP|POLLNVAL rev=0
         timeout: 0.000000000 sec

Why the timeout is sometimes 0 seconds isn't clear, but it is.
I'm curious whether yours is always zero. If so, each poll would
return immediately and the loop would spin, which would fit the
100% CPU usage.
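
If the truss output runs long, counting the timeouts by eye gets tedious. Here is a minimal Python sketch that tallies how many pollsys calls had a zero timeout. The function name tally_timeouts and the SAMPLE text are my own for illustration; the sample only reproduces the three "timeout:" lines from the trace above, and the script assumes truss prints them in that same form.

```python
import re

# Matches lines such as "timeout: 1.999000000 sec" from truss -v pollsys.
TIMEOUT_RE = re.compile(r"timeout:\s*([0-9]+\.[0-9]+)\s*sec")

# Hypothetical sample: just the timeout lines from the trace shown above.
SAMPLE = """\
         timeout: 1.999000000 sec
         timeout: 1.999000000 sec
         timeout: 0.000000000 sec
"""


def tally_timeouts(truss_output):
    """Return (zero_count, total_count) for the pollsys timeouts found."""
    timeouts = [float(m.group(1)) for m in TIMEOUT_RE.finditer(truss_output)]
    zero_count = sum(1 for t in timeouts if t == 0.0)
    return zero_count, len(timeouts)


zero, total = tally_timeouts(SAMPLE)
print(f"{zero} of {total} pollsys calls had a zero timeout")
```

To use it on a real trace, save the truss output to a file and pass the file's contents to tally_timeouts instead of SAMPLE. If nearly every call shows a zero timeout, the process is most likely polling in a tight loop.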

You can find out where the pollsys calls are coming from with
dtrace:
: root@otter[12]; dtrace -n 'syscall::pollsys:entry/execname == "dovecot-auth"/{ustack(10)}'

   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80

   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80

   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80

   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80
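
On a busy process this output gets long quickly, so it can help to count how often each distinct stack shows up. DTrace's own aggregations can do that as well; the following is only a rough Python sketch for post-processing saved output. The name tally_stacks and the SAMPLE text are illustrative assumptions, and the parser assumes the layout shown above: frame lines containing a backtick, with stacks separated by blank or header lines.

```python
from collections import Counter

# Hypothetical sample: two of the stacks printed above.
SAMPLE = """\
   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80

   1   4595                    pollsys:entry
               libc.so.1`__pollsys+0x7
               libc.so.1`poll+0x52
               dovecot-auth`io_loop_handler_run+0x35
               dovecot-auth`io_loop_run+0x21
               dovecot-auth`main+0x3fe
               dovecot-auth`_start+0x80
"""


def tally_stacks(dtrace_output):
    """Count how often each distinct user stack appears in ustack() output."""
    counts = Counter()
    frames = []
    # The extra empty line flushes a stack that ends at the end of the input.
    for line in dtrace_output.splitlines() + [""]:
        text = line.strip()
        if "`" in text:
            frames.append(text)          # frame line: module`function+offset
        elif frames:
            counts[tuple(frames)] += 1   # any other line ends the current stack
            frames = []
    return counts


for stack, count in tally_stacks(SAMPLE).most_common():
    print(f"{count} call(s) from:")
    for frame in stack:
        print(f"    {frame}")
```

A single stack dominating the tally, as in the sample, would suggest that all the pollsys calls come from the same place in the main I/O loop.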


- Bart

