On Apr 17, 2008, at 7:36 PM, richs@whidbey.net wrote:
Timo Sirainen wrote:
On Apr 17, 2008, at 6:56 PM, richs@whidbey.net wrote:
We recently began seeing server crashes in our cluster related to
"pop3-login", which is causing "oom-killer" to be invoked. The
server only recovers after a reboot.

So oom-killer doesn't solve the issue? Then it's likely it has
nothing to do with pop3-login, OOM killer just selects a bad target
to kill (and Dovecot happily restarts a new pop3-login process)
while the real memory-eating process stays alive. Can you check
with ps what process(es) are eating all the memory?

That's a good point. Actually, oom-killer does solve the issue
initially, but in every case the server eventually locks up (around
30 minutes later).

Unfortunately at this point "ps" and "top" can't run, so we haven't
been able to collect much information. Here's a complete look at
the "oom-killer" events:

What do you have login_process_size set to? The default should be
64MB, so unless you changed that, a single login process shouldn't be
able to use up all memory (but all of them at once could, of course).
So maybe setting login_process_per_connection=no and
login_max_processes_count=(something small) could show something useful.
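
As a rough sketch of what that could look like in dovecot.conf (Dovecot 1.x syntax): the value 8 below is only an illustrative stand-in for "something small" and is an assumption, not a tested recommendation, and the 64 simply restates the default mentioned above.

```
# Illustrative values only; tune for your own cluster.

# Per-process memory limit for login processes, in megabytes
# (64 is the default mentioned above).
login_process_size = 64

# Let each login process serve multiple connections instead of
# starting one process per connection.
login_process_per_connection = no

# Cap on the number of login processes (hypothetical small value).
login_max_processes_count = 8
```

With a small, fixed number of login processes, their combined memory use is bounded, so if the machine still runs out of memory the cause is probably some other process.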