On Wed, 2006-10-11 at 12:39 -0500, Ejay Hire wrote:
Hello all.
I have experienced unusual behaviour from dovecot, which I believe is related to some interaction between dovecot and nss_ldap/PAM.
Centos/Postfix/mbox(es)/Openldap/dovecot. The box is using PAM with ldap, and is also running BIND, SAMBA, and DHCPD. Dovecot 1.0rc2 from source.
I left dovecot at the default, to use PAM for authentication and guess the mailbox. This worked well, except that during peak times dovecot would "hang", waiting an exceptionally long time after connecting before doing authentication. The users would report this as "Send/receive stuck on 64%" or something like that. During this time, I would note a number of <defunct> dovecot processes in ps. Increasing the number of idle login processes in the pool to ridiculously high values (30) did not affect the symptom. Restarting dovecot would immediately resolve the issue.
I believe this to be related somehow to dovecot's PAM interaction, because I was able to work around it by setting dovecot to talk directly to LDAP. Googling found isolated reports of similar behaviour in the Fedora Core 3 bug list at Redhat.
Well, two things:
"Send/receive stuck on 64%" would mean that it hung after logging in. Dovecot-auth's hangs can't cause that, unless your whole computer somehow hangs.
Second, I'm guessing this would have more to do with nss_ldap. Dovecot handles PAM lookups in separate processes, but nss_ldap lookups are done in the same dovecot-auth process, and since they're blocking calls they could hang the process. So what might help is raising the number of auth processes ("count" inside the auth section).
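For illustration, the change would look roughly like this in dovecot.conf. This is only a sketch: the value 4 is an arbitrary example rather than a tested recommendation, and the passdb/userdb lines assume the default PAM + passwd setup described above.

    auth default {
      mechanisms = plain

      passdb pam {
      }
      userdb passwd {
      }

      # Number of dovecot-auth processes to create (default is 1).
      # With more than one, a blocking nss_ldap lookup in one process
      # doesn't stall the lookups handled by the others.
      count = 4
    }

Note that each additional process can still block on its own lookup, so this only reduces the chance of a stall; it doesn't remove the cause.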
Anyway, I really wouldn't suggest using nss_ldap, since it's been known to give broken replies with Dovecot. E.g. see this thread: http://dovecot.org/list/dovecot/2006-September/016454.html
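The workaround you already found, having Dovecot talk to LDAP directly, avoids nss_ldap entirely. A rough sketch of what that looks like in the auth section, assuming the LDAP settings are kept in /etc/dovecot-ldap.conf (that path is just an example; use wherever your file actually lives):

    auth default {
      mechanisms = plain

      # Password and user lookups go through Dovecot's own LDAP code
      # instead of PAM / nss_ldap.
      passdb ldap {
        args = /etc/dovecot-ldap.conf
      }
      userdb ldap {
        args = /etc/dovecot-ldap.conf
      }
    }

The server address, base DN and filters go in that separate file and depend on your directory layout, so they are left out here.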