Hi Timo, thanks for replying :)
Timo Sirainen wrote:
On Tue, 2009-07-14 at 14:52 +0200, Sandro Tosi wrote:
dovecot: 2009-07-14 04:05:04 Warning: SIGHUP received - reloading configuration
dovecot: 2009-07-14 04:05:04 Error: Temporary failure in creating login processes, slowing down for now
Does it say elsewhere why this temporary failure happened?
There is nothing relevant in the other logs in /var/log - is there a logfile you're expecting to contain interesting messages (other than /var/log/dovecot of course ;) )?
dovecot: 2009-07-14 04:05:04 Fatal: listen(143) failed: Interrupted system call
we can reproduce this error with this tight loop:
while date ; do kill -HUP $(cat /var/run/dovecot/master.pid) ; sleep 15s ; done
Yeah. listen() can be made to fail by sending a signal at the same time it's running. But if a signal is sent only once every 15 seconds, that's a bit strange.. If you run
OK, so that's known behavior, but are there other situations, apart from that "stress" loop, that can result in a crash of dovecot?
strace -tt -p `pidof dovecot`
what does it say between the last two HUPs before failing?

Ok, so I was able to replicate the problem:

$ while date ; do kill -HUP $(cat /var/run/dovecot/master.pid) ; sleep 15s ; done
...
Wed Jul 15 09:04:55 CEST 2009
Wed Jul 15 09:05:10 CEST 2009
Wed Jul 15 09:05:25 CEST 2009
cat: /var/run/dovecot/master.pid: No such file or directory
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
so it's dead.
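For anyone else who wants to reproduce this, here is a minimal sketch of the same loop that stops once the pid file is gone, rather than ending up calling kill with no argument. The function name hup_until_dead and its two arguments are made up for illustration; it is not something shipped with dovecot.

```shell
#!/bin/sh
# Hypothetical helper (not part of dovecot): send SIGHUP to the process named
# in a pid file every N seconds until the pid file or the process disappears.
hup_until_dead() {
    pidfile=$1
    interval=${2:-15}
    while [ -r "$pidfile" ]; do
        pid=$(cat "$pidfile")
        date
        # kill fails if the process no longer exists; stop looping then
        kill -HUP "$pid" 2>/dev/null || break
        sleep "$interval"
    done
    echo "master gone at $(date)"
}
```

It could be called as: hup_until_dead /var/run/dovecot/master.pid 15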
Attached is the strace file covering 09:04 to 09:05, the time frame in which the issue happened.
If you want me to do something else, I'm happy to :)
Thanks,
Sandro