Understanding why Dovecot unexpectedly died
Luca Bertoncello
lucabert at lucabert.de
Sat Nov 15 19:00:13 UTC 2014
Hi list!
I use Dovecot 1.2.17 (I can't upgrade right now, due to many reasons),
controlled by Pacemaker (I have an HA-Cluster).
Now I see that Pacemaker often restarts Dovecot. I wrote my own script to
manage Dovecot, since Pacemaker does not have its own.
The "monitor" section of my script looks like this:
monitor)
    if [ ! -e $OCF_RESKEY_pid ]; then
        echo "stopped (no pidfile)"
        echo "DOVECOT STOPPED - NO PIDFILE" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
        exit $OCF_NOT_RUNNING
    else
        /bin/ps axuwf | /bin/grep `/bin/cat $OCF_RESKEY_pid` | /bin/grep -v grep > /dev/null 2>&1
        if [ $? -ne 0 ]; then
            echo "stopped"
            echo "DOVECOT STOPPED - NO PROCESS" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
            exit $OCF_NOT_RUNNING
        else
            if [ "`/bin/netstat -tupan | /bin/grep dovecot | /bin/grep $OCF_RESKEY_bindaddr | /usr/bin/wc -l`" -ne 0 ]; then
                exit $OCF_SUCCESS
            else
                echo "DOVECOT STOPPED - NO LISTEN [`/bin/netstat -tupan | /bin/grep dovecot`]" | /usr/bin/logger -p local0.info -t DOVECOT-MONITOR -i
                exit $OCF_ERR_GENERIC
            fi
        fi
    fi
    exit $OCF_SUCCESS
    ;;
The logger calls were added just now to try to understand why it dies...
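A side note on the process check in the monitor section: grepping the ps output for the number stored in the pidfile can also match unrelated lines that merely contain those digits (another PID, a port, a memory size). A minimal sketch of a stricter check follows. The function name pid_is_alive is hypothetical and not part of the script above, and the sketch assumes the agent runs as root, so that "kill -0" is not refused for permission reasons.

```shell
# Hypothetical helper (not in the original script): succeed only if the
# pidfile exists, holds a plain number, and that PID is a live process.
pid_is_alive() {
    pidfile=$1
    # No readable pidfile: treat as not running.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile") || return 1
    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only tests whether the PID can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

In the monitor section this could be called as: pid_is_alive "$OCF_RESKEY_pid" || exit $OCF_NOT_RUNNING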
Well, I can see in my syslog, when Pacemaker restarts Dovecot, these lines:
Nov 15 18:59:09 mail01 DOVECOT-MONITOR[530]: DOVECOT STOPPED - NO LISTEN [tcp 0 0 192.168.33.1:37545 192.168.33.3:3306 ESTABLISHED 637/dovecot-auth
Nov 15 18:59:09 mail01 DOVECOT-MONITOR[530]: tcp 0 0 192.168.33.1:37537 192.168.33.3:3306 ESTABLISHED 529/dovecot-auth]
So no "dovecot" process is listening anymore... Normally I have these:
tcp 0 0 0.0.0.0:110 0.0.0.0:* LISTEN 634/dovecot
tcp 0 0 0.0.0.0:143 0.0.0.0:* LISTEN 634/dovecot
tcp 0 0 0.0.0.0:993 0.0.0.0:* LISTEN 634/dovecot
tcp 0 0 0.0.0.0:995 0.0.0.0:* LISTEN 634/dovecot
tcp 0 0 192.168.33.1:40994 192.168.33.3:3306 VERBUNDEN 891/dovecot-auth
tcp 0 0 192.168.33.1:40984 192.168.33.3:3306 VERBUNDEN 638/dovecot-auth
tcp6 0 0 :::110 :::* LISTEN 634/dovecot
tcp6 0 0 :::143 :::* LISTEN 634/dovecot
tcp6 0 0 :::993 :::* LISTEN 634/dovecot
tcp6 0 0 :::995 :::* LISTEN 634/dovecot
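The listing above suggests a narrower test than "grep dovecot": that pattern also matches the dovecot-auth lines, which are outgoing MySQL connections and not listening sockets. A minimal sketch of counting only sockets in LISTEN state owned by a given program is shown here. The function name count_listeners is hypothetical, and the sketch assumes the net-tools netstat column layout shown above, with the state in column 6 and PID/program in column 7.

```shell
# Hypothetical helper: read netstat output on stdin and print how many
# TCP sockets in LISTEN state belong to the program named in $1.
count_listeners() {
    awk -v prog="$1" '
        # Column 6 is the state, column 7 is "PID/program".
        # Anchoring on "/prog$" keeps "dovecot" from matching "dovecot-auth".
        $6 == "LISTEN" && $7 ~ ("/" prog "$") { n++ }
        END { print n + 0 }
    '
}
```

It could be fed with: LC_ALL=C /bin/netstat -tlnp | count_listeners dovecot
Setting LC_ALL=C is only a precaution so that the output is not localized.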
In the mail.log and mail.err I can't see anything but:
Nov 15 18:59:13 mail01 dovecot: Dovecot v1.2.17 starting up
Nov 15 18:59:13 mail01 dovecot: auth-worker(default): mysql: Connected to 192.168.33.3 (exim)
And in the syslog there is nothing about Dovecot...
Any idea?
Thanks a lot!
Luca Bertoncello
(lucabert at lucabert.de)