Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)

Delian Krustev krustev at krustev.net
Sun Nov 3 19:01:44 EET 2019


On Saturday, November 2, 2019 10:40:45 PM EET Stephan Bosch <stephan at rename-it.nl> wrote:
> I think we already know about this issue (tracked as DOP-942). It
> apparently happens because qmail masks the SIGCHLD signal while Dovecot
> doesn't unmask it before waiting for children.

You're right.
Isn't this a simple fix - a call to sigprocmask ?

This should not be a problem in qmail-lspawn/qmail-local as their code
has not been changed for years.
So there should have been a breaking change in Dovecot/Pigeonhole.


What follows is some more debugging from me.

The chain of processes that lead to dovecot-lda and sendmail execution look like this:

  PID
32457   qmail-lspawn
 6923    qmail-local
 6924     sh -c vlelivermail
 6925      sh -c preline /usr/local/bin/vpop-lda
 6926       vpop-lda(perl script) -> deliver -> doveconf -> deliver
 6927         grep
 6928         pwd
 6929         sendmail -> qmail-inject
 6930           qmail-queue


Just before executing sendmail, dovecot-lda is registering a signal handler for SIGCHLD:

22:55:15.294647 epoll_create(128)       = 12
22:55:15.294668 fcntl(12, F_GETFD)      = 0
22:55:15.294687 fcntl(12, F_SETFD, FD_CLOEXEC) = 0
22:55:15.294708 epoll_ctl(12, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47455536, u64=94699781365040}}) = 0
22:55:15.294740 rt_sigaction(SIGCHLD, {sa_handler=0x7f15404b3640, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f1540229840}, NULL, 8) = 0
22:55:15.294763 pipe([13, 14])          = 0
22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929


Later dovecot-lda checks non-blockingly whether sendmail has exited and waits for epoll IO event:

22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0
22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0
22:55:45.325580 kill(6929, SIGTERM)     = 0
22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0
22:55:50.331108 kill(6929, SIGKILL)     = 0


I'm not sure why it is waiting for IO event on FD 5, which is a PIPE FD initialized much earlier,
with no actual reads/writes on it:

22:55:15.286845 pipe([5, 6])            = 0
22:55:15.286870 fcntl(5, F_GETFL)       = 0 (flags O_RDONLY)
22:55:15.286889 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
22:55:15.286909 fcntl(6, F_GETFL)       = 0x1 (flags O_WRONLY)
22:55:15.286928 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0

The pipe used for communication with the sendmail child is this:

22:55:15.294763 pipe([13, 14])          = 0
22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929
22:55:15.294928 close(13)               = 0
22:55:15.294969 fcntl(14, F_GETFL)      = 0x1 (flags O_WRONLY)
22:55:15.295006 fcntl(14, F_SETFL, O_WRONLY|O_NONBLOCK) = 0
22:55:15.295039 lseek(14, 0, SEEK_CUR)  = -1 ESPIPE (Illegal seek)
22:55:15.295065 getsockname(14, 0x7ffd59156a40, [28]) = -1 ENOTSOCK (Socket operation on non-socket)
22:55:15.295109 write(14, "X-Sieve: Pigeonhole Sieve 0.5.4 ()\nX-Sieve-Redirected-From: ...", 1461) = 1461
22:55:15.295150 close(14)               = 0
22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0
22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0
22:55:45.325580 kill(6929, SIGTERM)     = 0




This is the process creation/waiting and signal mask debugging:


qmail-lspawn (PID 32457)
22:55:15.227742 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
...
22:55:15.227970 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f8a117de7d0) = 6923
...
22:55:15.228237 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
22:55:15.228267 select(4, [0 3], NULL, NULL, NULL) = 1 (in [3])
22:55:50.334325 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
22:55:50.334367 read(3, "Execution of Sieve filters was aborted due to temporary failure\n", 128) = 64
22:55:50.334406 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
22:55:50.334440 select(4, [0 3], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
22:55:50.335625 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6923, si_uid=1002, si_status=111, si_utime=0, si_stime=0} ---
22:55:50.335649 wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 111}], WNOHANG, NULL) = 6923
22:55:50.335683 close(4)                = 0
22:55:50.335710 wait4(-1, 0x7fff05f68d8c, WNOHANG, NULL) = -1 ECHILD (No child processes)
22:55:50.335742 rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
22:55:50.335773 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
22:55:50.335801 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
22:55:50.335827 select(4, [0 3], NULL, NULL, NULL) = 1 (in [3])
22:55:50.335874 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
22:55:50.335899 read(3, "", 128)        = 0
22:55:50.335925 write(1, "\0ZExecution of Sieve filters was aborted due to temporary failure\n\0", 67) = 67
22:55:50.335974 close(3)                = 0
22:55:50.336006 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
22:55:50.336034 select(1, [0], NULL, NULL, NULL <detached ...>


qmail-local (PID 6923)
22:55:15.229053 execve("bin/qmail-local", ["bin/qmail-local", "--", "krustev.net", "/home/krustev/mail/domains/krustev.net", "krustev.net-tester", "-", "tester", "krustev.net", "krustev at krustev.net", "./Maildir/"], 0x7fff05f69560 /* 1 var */) = 0
...
22:55:15.230313 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fde0346a7d0) = 6924
22:55:15.230420 wait4(6924, [{WIFEXITED(s) && WEXITSTATUS(s) == 111}], 0, NULL) = 6924
22:55:50.335462 exit_group(111)         = ?
22:55:50.335572 +++ exited with 111 +++


sh -> vlelivermail (PID 6924)
22:55:15.230511 execve("/bin/sh", ["/bin/sh", "-c", " /var/lib/vpopmail/bin/vdelivermail '' bounce-no-mailbox"], 0x564c0be3ac30 /* 19 vars */) = 0
..
22:55:15.233866 rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0
...
22:55:15.233997 execve("/var/lib/vpopmail/bin/vdelivermail", ["/var/lib/vpopmail/bin/vdelivermail", "", "bounce-no-mailbox"], 0x55e1e61bcde0 /* 22 vars */) = 0
...
22:55:15.237461 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7bea310a10) = 6925
22:55:15.237581 wait4(6925, [{WIFEXITED(s) && WEXITSTATUS(s) == 75}], 0, NULL) = 6925
22:55:50.335255 lseek(0, -1337, SEEK_CUR) = 0
22:55:50.335283 exit_group(111)         = ?
22:55:50.335385 +++ exited with 111 +++



sh -> preline (PID 6925)
22:55:15.237676 execve("/bin/sh", ["/bin/sh", "-c", "preline /usr/local/bin/vpop-lda"], 0x20a8b30 /* 23 vars */) = 0
...
22:55:15.240248 rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0
...
22:55:15.241235 execve("/var/lib/qmail/bin/preline", ["preline", "/usr/local/bin/vpop-lda"], 0x564b8bbbbc50 /* 23 vars */) = 0
...
22:55:15.242411 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f68e199c7d0) = 6926
...
22:55:15.243005 wait4(6926, [{WIFEXITED(s) && WEXITSTATUS(s) == 75}], 0, NULL) = 6926
22:55:50.335131 exit_group(75)          = ?
22:55:50.335203 +++ exited with 75 +++




vpop-lda->deliver->doveconf->deliver (PID 6926
22:55:15.242860 execve("/usr/local/bin/vpop-lda", ["/usr/local/bin/vpop-lda"], 0x7ffe55a89810 /* 23 vars */) = 0
...
22:55:15.257985 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0
22:55:15.258010 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd5e8098490) = 6927
...
22:55:15.258305 wait4(6927, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 6927
...
22:55:15.260774 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
...
22:55:15.261905 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd5e8098490) = 6928
...
22:55:15.263580 wait4(6928, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6928
...
22:55:15.263650 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x563ae02171e0 /* 23 vars */) = 0
...
22:55:15.267914 execve("/usr/bin/doveconf", ["/usr/bin/doveconf", "-f", "service=lda", "-c", "/etc/dovecot/dovecot.conf", "-m", "lda", "-e", "/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x55ad578460b0 /* 25 vars */) = 0
...
22:55:15.283482 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x5620b4330c40 /* 61 vars */) = 0
...
22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929
...
22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0
22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0
22:55:45.325580 kill(6929, SIGTERM)     = 0
22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0
22:55:50.331108 kill(6929, SIGKILL)     = 0
22:55:50.332867 wait4(6929, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6929


grep (PID 6927)
22:55:15.258651 execve("/bin/grep", ["grep", "-ql", "Spam", ".dovecot.sieve"], 0x563ae02171e0 /* 23 vars */) = 0
...
22:55:15.260681 exit_group(1)           = ?                
22:55:15.260751 +++ exited with 1 +++


pwd (PID 6928)
22:55:15.262492 execve("/bin/pwd", ["pwd"], 0x563ae02171e0 /* 23 vars */) = 0
...
22:55:15.263532 exit_group(0)           = ?
22:55:15.263587 +++ exited with 0 +++


22:55:15.295268 execve("/usr/sbin/sendmail", ["/usr/sbin/sendmail", "-i", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], NULL) = 0
...
22:55:15.296055 execve("bin/qmail-inject", ["bin/qmail-inject", "-a", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], 0x55741b1890e0 /* 2 vars */) = 0
...
22:55:15.297264 vfork()                 = 6930             
...
22:55:15.297930 wait4(6930, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6930
22:55:15.305391 exit_group(0)           = ?
22:55:15.305472 +++ exited with 0 +++







Best regards
--
Delian




More information about the dovecot mailing list