I found a work-around. I added an :output option to the execute command, and then it works like a charm.
I've run the whole dovecot-lda under strace (without the :output option, when it shows the problem):
6410 stat("/usr/local/lib/dovecot/sieve-execute/foofilter", {st_mode=S_IFREG|0755, st_size=57, ...}) = 0 6410 rt_sigaction(SIGCHLD, {sa_handler=0x7f65d5bdd010, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f65d57c3060}, NULL, 8) = 0 6410 epoll_ctl(6, EPOLL_CTL_DEL, 4, 0x7ffe88f53b2c) = 0 6410 epoll_create(128) = 10 6410 fcntl(10, F_GETFD) = 0 6410 fcntl(10, F_SETFD, FD_CLOEXEC) = 0 6410 epoll_ctl(10, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=2603604112, u64=94543423723664}}) = 0 6410 pipe([11, 12]) = 0 6410 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f65d65a84d0) = 6411 6410 close(11) = 0 6410 fcntl(12, F_GETFL <unfinished ...> 6411 close(12 <unfinished ...> 6410 <... fcntl resumed> ) = 0x1 (flags O_WRONLY) 6411 <... close resumed> ) = 0 6410 fcntl(12, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 6411 geteuid() = 1000 6410 lseek(12, 0, SEEK_CUR <unfinished ...> 6411 getuid( <unfinished ...> 6410 <... lseek resumed> ) = -1 ESPIPE (Illegal seek) 6411 <... getuid resumed> ) = 1000 6410 getsockname(12, <unfinished ...> 6411 getgid( <unfinished ...> 6410 <... getsockname resumed> 0x7ffe88f53a40, [28]) = -1 ENOTSOCK (Socket operation on non-socket) 6411 <... getgid resumed> ) = 1000 6411 getegid() = 1000 6411 setuid(1000 <unfinished ...> 6410 write(12, "Return-path: <master@blitz.conur"..., 706 <unfinished ...> 6411 <... setuid resumed> ) = 0 6410 <... write resumed> ) = 706 6411 setuid(0 <unfinished ...> 6410 read(0, <unfinished ...> 6411 <... setuid resumed> ) = -1 EPERM (Operation not permitted) 6410 <... read resumed> "", 3408) = 0 6411 getgid() = 1000 6410 close(12 <unfinished ...> 6411 getegid( <unfinished ...> 6410 <... close resumed> ) = 0 6411 <... getegid resumed> ) = 1000 6410 wait4(6411, <unfinished ...> 6411 setgid(0 <unfinished ...> 6410 <... wait4 resumed> 0x55fc9b30efe4, WNOHANG, NULL) = 0 6411 <... setgid resumed> ) = -1 EPERM (Operation not permitted) 6410 epoll_wait(10, <unfinished ...> (insert here the execution of the script as process 6411) 6411 exit_group(1) = ? 6411 +++ exited with 1 +++ 6410 <... epoll_wait resumed> [], 1, 300000) = 0 6410 kill(6411, SIGTERM) = 0 6410 epoll_wait(10, [], 1, 5000) = 0 6410 kill(6411, SIGKILL) = 0 6410 wait4(6411, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6411 6410 epoll_ctl(10, EPOLL_CTL_DEL, 4, 0x7ffe88f53b2c) = 0 6410 epoll_ctl(6, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=2603424576, u64=94543423544128}}) = 0 6410 close(10) = 0
then it does the delivery into the test-spam mailbox.
You see that dovecot-lda doesn't do a waitpid()/wait4() on the script (child process) before starting to kill it.
On Tue, Nov 13, 2018 at 05:52:27PM +0100, Lionel Elie Mamane wrote:
And the Sieve script:
------- start sieve script --------- require ["vnd.dovecot.execute", "fileinto", "mailbox", "vnd.dovecot.debug"];
debug_log "about to execute"; if not execute :pipe "foofilter" { debug_log "if condition fulfilled"; fileinto :create "test-spam"; debug_log "about to stop"; stop; } debug_log "after execute"; ------- end sieve script ---------
The user's sieve log looks like:
sieve: info: started log at Nov 13 16:56:32. test-foofilter: line 3: info: DEBUG: about to execute. test-foofilter: line 4: error: execute action: failed to execute to program `foofilter': refer to server log for more information. [2018-11-13 16:59:57]. test-foofilter: line 5: info: DEBUG: if condition fulfilled. test-foofilter: line 7: info: DEBUG: about to stop. info: msgid=<20181113155632.k6pybm45sdwwp2bs@blitz.conuropsis.org>: stored mail into mailbox 'test-spam'.
On Tue, Nov 13, 2018 at 05:00:21PM +0100, Lionel Elie Mamane wrote:
Hi,
Running dovecot 2.2.27 (c0f36b0) on Debian GNU/Linux stable (version 9.6), I'm trying to use the execute command in a sieve script, with direct execution (not service socket).
The script is executed, and returns (the process exits), but the dovecot-lda process seems to not notice, and thinks the script did not terminate. After sieve_execute_exec_timeout has passed, it sends a TERM, a bit later a KILL and then logs a failure.
The "ps -Af f" looks like: S 0:00 \_ /usr/sbin/exim4 -Mc 1gMazS-0007IV-RB S 0:00 \_ /usr/lib/dovecot/dovecot-lda -f master@blitz.conuropsis.org Z 0:00 | \_ [foofilter] <defunct> S 0:00 \_ /usr/sbin/exim4 -Mc 1gMazS-0007IV-RB
The dovecot debug log looks like:
lda(user): Debug: Loading modules from directory: /usr/lib/dovecot/modules lda(user): Debug: Module loaded: /usr/lib/dovecot/modules/lib90_sieve_plugin.so lda(user): Debug: Effective uid=1000, gid=1000, home=/home/user lda(user): Debug: maildir++: root=/home/user/Maildir, index=, indexpvt=, control=, inbox=/home/user/Maildir, alt= lda(user): Debug: userdb lookup skipped, username taken from USER environment lda(user): Debug: none: root=, index=, indexpvt=, control=, inbox=, alt= lda(user): Debug: Destination address: user@blitz.conuropsis.org (source: user@hostname) lda(user): Debug: sieve: Pigeonhole version 0.4.16 (fed8554) initializing lda(user): Debug: sieve: include: sieve_global is not set; it is currently not possible to include
:global' scripts. lda(user): Debug: sieve: Sieve imapsieve plugin for Pigeonhole version 0.4.16 (fed8554) loaded lda(user): Debug: sieve: Sieve Extprograms plugin for Pigeonhole version 0.4.16 (fed8554) loaded lda(user): Debug: sieve: file storage: Using active Sieve script path: /data/home/user/.dovecot.sieve lda(user): Debug: sieve: file storage: Using script storage path: /home/user/sieve lda(user): Debug: sieve: file storage: Relative path to sieve storage in active link: sieve/ lda(user): Debug: sieve: file storage: Using Sieve script path: /data/home/user/.dovecot.sieve lda(user): Debug: sieve: file script: Opened script
test-foofilter' from/data/home/user/.dovecot.sieve' lda(user): Debug: sieve: Using the following location for user's Sieve script: /data/home/user/.dovecot.sieve lda(user): Debug: Mailbox stdin: Opened mail UID=1 because: header Message-ID (Cache file is unusable) lda(user): Debug: sieve: Opening script 1 of 1 from
/data/home/user/.dovecot.sieve' lda(user): Debug: sieve: Loading script /data/home/user/.dovecot.sieve lda(user): Debug: sieve: Script binary /data/home/user/.dovecot.svbin successfully loaded lda(user): Debug: sieve: binary save: not saving binary /data/home/user/.dovecot.svbin, because it is already stored lda(user): Debug: sieve: Executing script from/data/home/user/.dovecot.svbin' lda(user): Debug: sieve: action execute: running program: foofilter lda(user): Debug: Mailbox stdin: Opened mail UID=1 because: mail stream lda(user): Debug: waiting for program
/usr/local/lib/dovecot/sieve-execute/foofilter' to finish after 0 msecs lda(user): Debug: program/usr/local/lib/dovecot/sieve-execute/foofilter'(24669) execution timed out after 2000 milliseconds: sending TERM signal lda(user): Debug: program
/usr/local/lib/dovecot/sieve-execute/foofilter' (24669) did not die after 5000 milliseconds: sending KILL signaland an strace on the dovecot-lda process shows it is in an "epoll_wait(9 ....".
The script is very simple:
#! /bin/sh
cat > /dev/null exit 1
Anyone has a clue? Thanks in advance.
Best Regards,
Lionel