Timo Sirainen wrote:
On Apr 17, 2008, at 7:36 PM, richs@whidbey.net wrote:
Timo Sirainen wrote:
On Apr 17, 2008, at 6:56 PM, richs@whidbey.net wrote:
We recently began seeing server crashes in our cluster related to "pop3-login", which is causing "oom-killer" to be invoked. The server only recovers after a reboot.
So oom-killer doesn't solve the issue? Then it's likely it has nothing to do with pop3-login. The OOM killer just selects a bad target to kill (and Dovecot happily restarts a new pop3-login process) while the real memory-eating process stays alive. Can you check with ps what process(es) are eating all the memory?
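One way to act on the "check with ps" suggestion before the machine becomes unresponsive is to rank processes by resident memory. The sketch below is only an illustration, not something posted in the thread: it assumes a Linux procps-style `ps` that accepts `-eo pid,rss,comm` and reports RSS in KiB, and the helper names `read_ps` and `top_memory_processes` are made up for this example.

```python
import subprocess


def read_ps():
    """Run ps and return its raw text output (assumes Linux procps ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_memory_processes(ps_output, limit=5):
    """Return up to `limit` (rss_kib, pid, command) tuples, largest RSS first."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, rss, comm = parts
        # The header line ("PID RSS COMMAND") fails this check and is skipped.
        if not (pid.isdigit() and rss.isdigit()):
            continue
        rows.append((int(rss), int(pid), comm.strip()))
    rows.sort(reverse=True)
    return rows[:limit]
```

Calling `top_memory_processes(read_ps())` periodically and appending the result to a log file would leave a record of which processes were growing before "ps" and "top" stop responding.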
That's a good point. Actually, oom-killer does solve the issue initially, but in every case the server eventually locks up (around 30 minutes later).
Unfortunately at this point "ps" and "top" can't run, so we haven't been able to collect much information. Here's a complete look at the "oom-killer" events:
What do you have login_process_size set to? The default should be 64MB, so unless you changed that, a single login process shouldn't be able to use up all memory (but all of them at once could, of course). So maybe setting login_process_per_connection=no and login_max_processes_count=(something small) could show something useful.
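As a rough illustration of the settings suggested above, a dovecot.conf fragment might look like the following. The setting names are the ones mentioned in this thread; the value 8 is only an assumed stand-in for "something small" and is not a recommendation from the thread.

```
# Per-process memory limit in MB for login processes (64 is the default cited above)
login_process_size = 64
# Reuse login processes across connections instead of one process per connection
login_process_per_connection = no
# Assumed example value for "something small"
login_max_processes_count = 8
```

With the number of login processes capped, their combined memory use is bounded, so a later oom-killer event would point more clearly at some other process.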
The login_process_size was commented out, so it must've been 64MB. We'll try what you suggested and let you know what we see. Thanks Timo!
-Rich