On Sep 2, 2008, at 4:48 PM, Adam McDougall wrote:
I would guess this is unlikely to be dovecot's fault, but I'm
wondering if anyone has any ideas of what might have happened based
on the evidence. My best guess is some kind of resource limit was
reached but I don't see any evidence in the logs, and the condition
is now gone.

Suddenly this morning, one (and only one) of my dovecot servers
started failing all logins at 08:25:04 and kept failing them until we
restarted dovecot, after which logins worked fine. The
number of imap-login processes was under the limit, but there were
some obvious PAM errors at the time. My account could still ssh to
the system so I don't think it was a problem with general authentication,
and NIS on other systems was working fine. No one was logged into
that server at the time the problems occurred, and I don't think
anything happened to the actual pam libraries to make them missing
since dovecot worked after a restart. I should have used other
means to prevent people from using that dovecot instance rather than
stopping it, and I'll do so if it happens again in hopes of further
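debugging.

Maybe your PAM plugins are leaking memory or file descriptors. Have you
set auth_worker_max_request_count to non-zero? That could help, since
each auth worker process then gets destroyed after handling that many
requests.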
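
A minimal sketch of that setting in dovecot.conf, assuming a 1.x-style
config where it lives at the top level. The value 100 is only an
illustration, so tune it for your load:

```
# Assumed placement: top level of dovecot.conf (1.x-style configuration).
# 0 means auth workers are never recycled; 100 is an arbitrary example value.
auth_worker_max_request_count = 100
```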