On January 12, 2017 at 9:55 PM Matt Simpson <dclist@list.jmatt.net> wrote:
I’m running dovecot 2.2.27 and pigeonhole 0.4.16 on FreeBSD 11.
I’m using the pigeonhole/sieve external pipe plugin to run a Perl program to send a Pushover notification when certain messages are received.
The Perl script is executed, and the notification is sent. But then the script task seems to go zombie until it is killed after a timeout.
In the user’s sieve log, I get a message like
error: msgid=<20170112191921.66140.qmail@v1.redhorse.me>: pipe action: failed to pipe message to program `sievepush.pl': refer to server log for more information. [2017-01-12 14:19:36].
(even though the message really was piped to the program successfully)
In the dovecot server log, I see
Jan 12 14:19:21 v1 dovecot: lda(matt): Debug: sieve: Executing script from `/usr/home/matt/maildoms/.dovecot.svbin'
Jan 12 14:19:21 v1 dovecot: lda(matt): Debug: sieve: action pipe: running program: sievepush.pl
Jan 12 14:19:21 v1 dovecot: lda(matt): Debug: Mailbox stdin: Opened mail UID=1 because: mail stream
Jan 12 14:19:21 v1 dovecot: lda(matt): Debug: waiting for program `/usr/local/lib/dovecot/sieve-pipe/sievepush.pl' to finish after 0 msecs
Jan 12 14:19:31 v1 dovecot: lda(matt): Debug: program `/usr/local/lib/dovecot/sieve-pipe/sievepush.pl' (66145) execution timed out after 10000 milliseconds: sending TERM signal
Jan 12 14:19:36 v1 dovecot: lda(matt): Debug: program `/usr/local/lib/dovecot/sieve-pipe/sievepush.pl' (66145) did not die after 5000 milliseconds: sending KILL signal
In the process list during that 10-second interval, I see
matt 66142 29972 801 801 0 S - 0:00.00 bin/qmail-local -- matt /home/matt/maildoms jmn-matt - jmn-m
matt 66143 66142 801 801 0 S - 0:00.00 /var/qmail/bin/preline -f /usr/local/libexec/dovecot/dovecot
matt 66144 66143 801 801 0 S - 0:00.01 /usr/local/libexec/dovecot/dovecot-lda
matt 66145 66144 801 801 0 Z - 0:00.65 <defunct>
I’m not a Unix programming ace, but from what I’ve been able to find out, this seems to mean that the lda process is forking another process to run the pipe script, and not getting the proper notification when it finishes (not issuing a wait?). So after 10 seconds, it sends a TERM to the task which is no longer running, and when that doesn’t work, it sends a KILL. Anybody know what’s happening here?
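For readers unfamiliar with the fork/wait mechanism described above, here is a minimal Python sketch of it. It is not Dovecot code; the function name fork_and_reap and the exit code 7 are made up for illustration. A child that has exited stays in the process table (state Z, shown as <defunct>) until its parent collects the exit status with waitpid().

```python
import os


def fork_and_reap(exit_code: int = 7):
    """Fork a child that exits at once, then reap it with waitpid().

    Hypothetical helper for illustration only.
    Returns (child_exit_code, second_wait_failed).
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running the parent's cleanup code.
        os._exit(exit_code)

    # Parent: from the moment the child exits until this call returns,
    # the child is a zombie (ps shows state Z / <defunct>).
    reaped_pid, status = os.waitpid(pid, 0)
    assert reaped_pid == pid

    # The status can be collected only once. A second wait fails because
    # the process-table entry is gone after the first one.
    try:
        os.waitpid(pid, 0)
        second_wait_failed = False
    except ChildProcessError:
        second_wait_failed = True

    return os.WEXITSTATUS(status), second_wait_failed


if __name__ == "__main__":
    print(fork_and_reap())
```

If the parent never calls waitpid(), the entry stays in state Z for as long as the parent lives, which matches the <defunct> row in the process list.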
Seems that we are not doing waitpid() on your program when it's killed. Also, I guess we should wait longer than 0 msecs. I'll try and see if I can replicate this.
Aki
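
The sequence in the log (wait up to a timeout, send TERM, wait a grace period, send KILL) can be sketched in Python as follows. This is only an illustration of the general pattern, assuming a hypothetical helper named run_with_timeout; it is not how Dovecot or Pigeonhole implements it. The defaults of 10 and 5 seconds mirror the 10000 and 5000 millisecond values in the log. The point is that every path ends in a wait, so the child is always reaped.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout=10.0, grace=5.0):
    """Run argv and return its exit code (negative signal number if killed).

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(argv)
    try:
        # Normal case: the program finishes in time and wait() reaps it.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # sends SIGTERM

    try:
        # Give the program a grace period to exit after TERM.
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL, which cannot be caught or ignored

    # Final blocking wait so the killed child does not linger as a zombie.
    return proc.wait()


if __name__ == "__main__":
    # A program that exits quickly is reaped right away with its own code.
    print(run_with_timeout([sys.executable, "-c", "raise SystemExit(3)"]))
```

With this structure a program that exits immediately is collected as soon as it finishes, so the timeout path is reached only when the program is really still running.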