One of our backend pop/imap servers, running dovecot v1.2.16, ran into problems yesterday. It suddenly couldn't authenticate users anymore and flooded the logs with:
pop3-login: Disconnected (auth failed, 1 attempts): user=<user1@example.com>, method=PLAIN, rip=192.168.42.15, lip=192.168.42.28
The problem seemed to start with this log entry:
Mar 29 18:57:21 popimap1 dovecot: auth(default): ldap(userX@example.com,192.168.42.15): Connection appears to be hanging, reconnecting
and this suspicious entry a few seconds later:
Mar 29 18:57:41 popimap1 dovecot: auth(default): LDAP: Reply with unknown msgid 2
After 18:57:21 it was unable to authenticate any users until I restarted the dovecot service.
Some of our postfix servers also reported ldap lookup failures between 18:56 and 18:58, but they recovered on their own.
On the ldap server side we saw no problems logged, but netstat there shows 3 established ldap sessions from the dovecot server that failed yesterday, while the dovecot server itself only shows 1:
ldap1$ netstat -na |grep 192.168.42.28
tcp 0 0 192.168.43.2:389 192.168.42.28:44408 ESTABLISHED
tcp 0 0 192.168.43.2:389 192.168.42.28:36464 ESTABLISHED
tcp 0 0 192.168.43.2:389 192.168.42.28:60580 ESTABLISHED
popimap1$ netstat -na| grep 192.168.43.2
tcp 0 0 192.168.42.28:60580 192.168.43.2:389 ESTABLISHED
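The mismatch between the two listings can also be checked mechanically. Below is a minimal Python sketch, assuming the `netstat -na` output from each host has been pasted in as text and follows the six-column layout shown above. The function names `established_pairs` and `half_open` are made up for illustration and are not part of dovecot or any ldap tool. It reports sessions that the ldap server still considers ESTABLISHED but that no longer exist on the dovecot host.

```python
LDAP_SIDE = """\
tcp 0 0 192.168.43.2:389 192.168.42.28:44408 ESTABLISHED
tcp 0 0 192.168.43.2:389 192.168.42.28:36464 ESTABLISHED
tcp 0 0 192.168.43.2:389 192.168.42.28:60580 ESTABLISHED
"""

DOVECOT_SIDE = """\
tcp 0 0 192.168.42.28:60580 192.168.43.2:389 ESTABLISHED
"""


def established_pairs(netstat_output, local_is_client):
    """Return a set of (client_addr, server_addr) for ESTABLISHED TCP lines."""
    pairs = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        local, foreign = fields[3], fields[4]
        # Normalise so the client address always comes first.
        pairs.add((local, foreign) if local_is_client else (foreign, local))
    return pairs


def half_open(server_output, client_output):
    """Sessions known to the server but missing on the client, sorted."""
    server_side = established_pairs(server_output, local_is_client=False)
    client_side = established_pairs(client_output, local_is_client=True)
    return sorted(server_side - client_side)


stale = half_open(LDAP_SIDE, DOVECOT_SIDE)
for client_addr, server_addr in stale:
    print(f"stale on ldap server: {client_addr} -> {server_addr}")
```

With the listings above this flags the sessions from source ports 36464 and 44408, i.e. the two connections the ldap server never saw closed.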
So it looks to us like there was a problem with either the network or the ldap server for a short period, and that dovecot then wasn't able to recover from it.
-jf