[Dovecot] child (login) killed with signal 9
Hi, I'm running dovecot on an Ubuntu server (1:1.0.5-1ubuntu2). Dovecot provides pop3, imap, and sasl to postfix. The setup works quite nicely; however, I do have one error that shows up repeatedly in my dovecot.log.
<snip>
dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
dovecot: 2007-12-12 09:29:06 Error: child 18039 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 26088 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 5271 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 5273 (login) killed with signal 9
<snip>
I'm not aware that this is actually causing problems, but since it shows up several hundred times a day on a couple-user setup, I'm rather curious about what's causing it.
Thanks, Matt
Anyone? Bueller?
At 1:13 PM -0500 12/16/07, Matt LaPlante wrote:
> Anyone? Bueller?
Chances are that no one else is seeing this...
> On Dec 12, 2007 9:19 PM, Matt LaPlante <cyberdog3k@gmail.com> wrote:
> > Hi, I'm running dovecot on an Ubuntu server (1:1.0.5-1ubuntu2). Dovecot provides pop3, imap, and sasl to postfix. The setup works quite nicely, however I do have one error that shows up repeatedly in my dovecot.log.
> >
> > <snip>
> > dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
> > dovecot: 2007-12-12 09:29:06 Error: child 18039 (login) killed with signal 9
> > dovecot: 2007-12-12 10:05:07 Error: child 26088 (login) killed with signal 9
> > dovecot: 2007-12-12 10:05:07 Error: child 5271 (login) killed with signal 9
> > dovecot: 2007-12-12 10:05:07 Error: child 5273 (login) killed with signal 9
> > <snip>
> >
> > I'm not aware that this is actually causing problems, but since it shows up several hundred times a day on a couple-user setup, I'm rather curious about what's causing it.
It is rather odd. Signal 9 is "KILL" and that is a rather rude way to be knocking out dovecot login processes. It isn't a normal occurrence on the Dovecot machines I work with, and it seems unlikely that anything internal to Dovecot would be doing that sort of thing.
Since the 5 you show imply (weakly) that the slaughter is taking out login processes in simultaneous bunches a few seconds into a minute, I'd start looking at whether you have any sort of ill-considered system housekeeping running from cron.
-- Bill Cole bill@scconsult.com
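Bill's observation that the kills arrive in same-second bunches can be checked mechanically against a longer log. What follows is a minimal Python sketch, assuming the log line format shown in the original post. The names `KILL_RE` and `kills_per_timestamp` are made up for illustration, and the sample text is just the five lines quoted in the thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
# The pattern is an assumption based only on the excerpt in this thread.
KILL_RE = re.compile(
    r"dovecot: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Error: "
    r"child (\d+) \((\w+)\) killed with signal (\d+)"
)


def kills_per_timestamp(log_text):
    """Count 'killed with signal' entries for each one-second timestamp."""
    return Counter(match.group(1) for match in KILL_RE.finditer(log_text))


SAMPLE = """\
dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
dovecot: 2007-12-12 09:29:06 Error: child 18039 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 26088 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 5271 (login) killed with signal 9
dovecot: 2007-12-12 10:05:07 Error: child 5273 (login) killed with signal 9
"""

for stamp, count in sorted(kills_per_timestamp(SAMPLE).items()):
    print(stamp, count)
# 2007-12-12 09:29:06 2
# 2007-12-12 10:05:07 3
```

Run over a full day of dovecot.log, the same count would show whether the clusters keep landing at regular offsets, which is what a scheduled job would produce. The sketch only counts entries; it says nothing about who sent the signal.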
On 13.12.2007, at 4.19, Matt LaPlante wrote:
> dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
Well, your message inspired me to waste some time and money on creating a new wiki: http://www.unixcoding.org/ and add a page there: http://www.unixcoding.org/Signals:
    SIGKILL
    Number: 9
    Killed by kernel, for example Linux's Out of Memory killer. dmesg may show the reason.
Does setting login_process_size=64 fix this? If so, I'll update that text to mention setrlimit() address space limit.
On Dec 16, 2007 3:15 PM, Timo Sirainen <tss@iki.fi> wrote:
> On 13.12.2007, at 4.19, Matt LaPlante wrote:
> > dovecot: 2007-12-12 09:29:06 Error: child 32765 (login) killed with signal 9
> Well, your message inspired me to waste some time and money on creating a new wiki: http://www.unixcoding.org/ and add a page there: http://www.unixcoding.org/Signals:
>     SIGKILL
>     Number: 9
>     Killed by kernel, for example Linux's Out of Memory killer. dmesg may show the reason.
I'm familiar with sigkill. :) The mystery to me is who is sending the signals...
There is no evidence of anything in dmesg or syslog. free reports consistent amounts of space available, and none of my other daemons seem to be subject to any ill effects. Client connection rates are very minimal and consistent. To answer Bill, there are no cron jobs on the system doing any sort of babysitting that would be killing processes.
This being OpenVZ, I suppose there is room for shenanigans to be occurring in the background, but that doesn't really explain why only dovecot seems to be affected (unless it's a matter of perception, and only dovecot is logging such things).
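On the question of who is sending the signals: a parent process that logs a line like the one above normally learns only how its child died, not who killed it. The sketch below is not Dovecot's code; it is a small Unix-only Python illustration, with hypothetical function names, of a parent that forks a child, has it killed with SIGKILL, and decodes the wait status.

```python
import os
import signal


def run_and_kill():
    """Fork a child, kill it with SIGKILL, and return (pid, wait status)."""
    pid = os.fork()
    if pid == 0:
        # Child: sleep until a signal arrives. SIGKILL cannot be caught,
        # so the child never gets past this call.
        signal.pause()
        os._exit(0)
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return pid, status


def describe(pid, status):
    """Turn a wait status into a short log-style message."""
    if os.WIFSIGNALED(status):
        # The status carries the signal number, but no sender information.
        return f"child {pid} killed with signal {os.WTERMSIG(status)}"
    if os.WIFEXITED(status):
        return f"child {pid} exited with status {os.WEXITSTATUS(status)}"
    return f"child {pid} changed state (raw status {status})"


pid, status = run_and_kill()
print(describe(pid, status))
```

Because the wait status holds only the terminating signal number, a message of this kind looks the same whether the kernel, a supervisor outside the container, or another process sent the signal. Finding the sender has to come from elsewhere, such as kernel logs or the host's own accounting.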
> Does setting login_process_size=64 fix this? If so, I'll update that text to mention setrlimit() address space limit.
This is currently 32 (according to dovecot.conf). Wouldn't raising it to 64 exacerbate memory problems if they do indeed exist?
On Sun, 2007-12-16 at 15:56 -0500, Matt LaPlante wrote:
> > Does setting login_process_size=64 fix this? If so, I'll update that text to mention setrlimit() address space limit.
> This is currently 32 (according to dovecot.conf). Wouldn't raising it to 64 exacerbate memory problems if they do indeed exist?
No. The setting gives the maximum virtual size for the process. I've already increased it to 64MB because shared library mappings have caused 32MB not to be enough in some systems. Maybe in your system 32MB is just enough to get the process started, but not enough to always allocate enough memory for heap/stack.
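To make the "maximum virtual size" point concrete, here is a hedged Python sketch of an address-space cap set with setrlimit(). It assumes Linux and assumes that the setting behaves like an `RLIMIT_AS` limit, as the mention of setrlimit() above suggests. The helper `fits_under_limit` and the sizes used are illustrative only and are not taken from Dovecot.

```python
import subprocess
import sys

MIB = 1024 * 1024

# Program run in a fresh interpreter: cap its own address space, then try
# a single allocation. Exit code 0 = allocation worked, 3 = MemoryError.
CHILD = """
import resource, sys
limit, size = int(sys.argv[1]), int(sys.argv[2])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    buf = bytearray(size)
except MemoryError:
    sys.exit(3)
sys.exit(0)
"""


def fits_under_limit(limit_bytes, alloc_bytes):
    """Return True if alloc_bytes can be allocated under the given cap."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(alloc_bytes)]
    )
    if result.returncode == 0:
        return True
    if result.returncode == 3:
        return False
    raise RuntimeError(f"unexpected exit code {result.returncode}")


# A generous cap is used so the interpreter itself starts comfortably.
print(fits_under_limit(1024 * MIB, 16 * MIB))
print(fits_under_limit(1024 * MIB, 2048 * MIB))
```

The cap is a ceiling, not a reservation: raising it does not make the process use more memory, it only moves the point at which an allocation is refused. Note that in this sketch hitting the ceiling shows up as a failed allocation, not as signal 9, so it does not by itself explain the log entries in this thread.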
On Dec 16, 2007 4:40 PM, Timo Sirainen <tss@iki.fi> wrote:
> On Sun, 2007-12-16 at 15:56 -0500, Matt LaPlante wrote:
> > > Does setting login_process_size=64 fix this? If so, I'll update that text to mention setrlimit() address space limit.
> > This is currently 32 (according to dovecot.conf). Wouldn't raising it to 64 exacerbate memory problems if they do indeed exist?
> No. The setting gives the maximum virtual size for the process. I've already increased it to 64MB because shared library mappings have caused 32MB not to be enough in some systems. Maybe in your system 32MB is just enough to get the process started, but not enough to always allocate enough memory for heap/stack.
No joy with that fix, I'm afraid. Still the same clusters of kill signals with no other explanation. I'm not terribly worried about it, so I don't want to bother anyone, but I'm still open to more suggestions. :)
participants (3)
- Bill Cole
- Matt LaPlante
- Timo Sirainen