On 5.2.2013, at 11.57, Jan-Frode Myklebust <janfrode@tanso.net> wrote:
I think there must be some bug I'm hitting here. One of my directors is still running with "client_limit = 1, process_limit = 100" for the lmtp service, and now it's logging:
master: Warning: service(lmtp): process_limit (100) reached, client connections are being dropped
Checking "sudo netstat -anp | grep ':24 '", I see 287 ports in TIME_WAIT, one in CLOSE_WAIT, and the listening "0.0.0.0:24". No active connections. There are 100 lmtp processes running.
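The netstat check above can be condensed into a per-state count, which makes it easier to compare snapshots over time. This is only a sketch: the function name count_states is made up for illustration, and it assumes the Linux net-tools "netstat -ant" column layout (local address in column 4, state in column 6).

```shell
#!/bin/sh
# Summarise TCP connection states for one local port.
# Reads `netstat -ant`-style lines on stdin; $1 is the port (24 = LMTP here).
# count_states is a hypothetical helper name, not part of Dovecot or net-tools.
count_states() {
    awk -v port="$1" '
        # Assumed layout: column 4 = local address, column 6 = state.
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Usage on a live system:
#   netstat -ant | count_states 24
```

Matching on the end of the local address keeps ports such as 2424 out of the count.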
Sounds like the LMTP processes are hanging for some reason. http://hg.dovecot.org/dovecot-2.1/rev/63117ab893dc might show something interesting, although I'm pretty sure it will just say that the processes are hanging in the DATA command.
Other interesting things to check:
gdb -p <pid of lmtp process>
bt full
strace -tt -p <pid of lmtp process> (for a few seconds to see if anything is happening)
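With 100 processes stuck, it may be easier to grab all the backtraces in one go. A rough sketch, assuming the processes are visible to pgrep under the exact name "lmtp" (the name may differ on your system); the function name dump_lmtp_backtraces and the output path in the usage line are made up for illustration:

```shell
#!/bin/sh
# Print a full backtrace for every running lmtp process.
# Assumption: the processes are named exactly "lmtp" (pgrep -x).
# dump_lmtp_backtraces is a hypothetical helper name.
dump_lmtp_backtraces() {
    for pid in $(pgrep -x lmtp); do
        echo "=== lmtp pid $pid ==="
        # -batch runs the -ex command non-interactively, then gdb exits.
        gdb -p "$pid" -batch -ex 'bt full' 2>&1
    done
}

# Usage (as root; the output path is only an example):
#   dump_lmtp_backtraces > /tmp/lmtp-backtraces.txt
```

Attaching gdb pauses each process briefly, which should not matter if they are already hanging.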
If lmtp proxy is hanging, it should have a timeout (default 30 secs) and it should log about it if it triggers. (Although maybe not to error log.)
When trying to connect to the lmtp-port I immediately get dropped:
$ telnet localhost 24
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Connection closed by foreign host.
This happens when the master process notices that all the service processes are full.
Is there maybe some counter that's getting out of sync, or some back-off penalty algorithm that kicks in when it first hits the process limit?
Shouldn't be, but the proctitle patch should make it clearer. Strange anyway, I haven't heard of anything like this happening before.