mail_crypt crashing randomly during LMTP
Hello everybody,
hope you’re all well!
We have a really strange bug with our dovecot setup. After weeks and month of debugging we’re finally reached the point when we ask you for help.
Our basic setup is quite simple: a single dovecot 2.3.10 (0da0eff44) install on Debian 10 with LDAP as user/passdb and sieve, imap_sieve and mail_crypt as mail_plugins.
For the mail_crypt we have a special rule which allows us to enable crypting for each user individually.
See this line from dovecot-ldap.conf.ext:
pass_attrs = userPassword=password,=userdb_mail_plugins=%{if;%{real_lport};eq;993;imap_sieve;} %{if;%{real_lport};eq;143;imap_sieve;} notify mail_log %{if;%{ldap:encryptMail};eq;TRUE;mail_crypt; },=userdb_mail_crypt_private_password=%w
As you can see for IMAP connections we enable imap_sieve and if the LDAP attribute “encryptMail” is set to TRUE we enable “mail_crypt” as mail_plugin.
This works fine except that around 5-10 times a day the dovecot server crashes during a LMTP connection. It crashes in the last steps of LMTP which result that the mail get delivered to Maildir but Postfix doesn’t get a positive response and delivers the mail again after a few minutes. So all of our users get some mails twice or even more times.
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy) dovecot: lmtp(53852): Fatal: master: service(lmtp): child 53852 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
Here I’ve pasted the LTMP session in mail.log: https://pastebin.com/A4ughNUF
Obviously it looks like it has something to do with our mail_crypt setup. But the thing is that although we’re monitoring this for months we cannot find a condition which cause this. It seems to happen more frequent for mails with lot of people in To or CC (large headers), but definitively not all of them…
Some more things we’ve already thought on: First my guess was that the mail_crypt plugin isn’t loaded in some case but in the backtrace are things from “lib10_mail_crypt_plugin.so”, so it should be loaded, right? On the other hand a non-crashing LTMP session looks exactly the same in log except it has three additional lines in mail.log: Debug: Loading modules from directory: /usr/lib/dovecot/modules Debug: Module loaded: /usr/lib/dovecot/modules/lib15_notify_plugin.so Debug: Module loaded: /usr/lib/dovecot/modules/lib20_mail_log_plugin.so
My college already had a look in the source code, but this didn’t make us much smarter. Here his references: mail context is defined here: https://github.com/dovecot/core/blob/81b5b188c478ec36bea8bda8fcad1e5f32ac612... and removed again here: https://github.com/dovecot/core/blob/81b5b188c478ec36bea8bda8fcad1e5f32ac612...
Does any of the experts on this list have an idea what could cause these crashes and how to fix them?
Thanks in advance!
Best, Andi
On 29. Mar 2020, at 15.15, Andi Krischer <ak@michaelpelzer.de> wrote:
We have a really strange bug with our dovecot setup. After weeks and month of debugging we’re finally reached the point when we ask you for help.
Our basic setup is quite simple: a single dovecot 2.3.10 (0da0eff44) install on Debian 10 with LDAP as user/passdb and sieve, imap_sieve and mail_crypt as mail_plugins.
Was this happening with old Dovecot versions, or did it start after v2.3.10 upgrade?
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy)
It shows mailbox_free() in the backtrace, so it's probably crashing in mail_crypt_mailbox_close()'s MAIL_CRYPT_CONTEXT(box). But looking at the code, I can't see how that could be possible. Maybe there is some kind of memory corruption or something.
dovecot: lmtp(53852): Fatal: master: service(lmtp): child 53852 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
Could you get core dumps enabled? See the link. And from the core dump send at least:
bt full fr 6 (or maybe 5 or 7, whichever makes the following command work) p *box
Also please send your doveconf -n output.
On 29. Mar 2020, at 16:07, Timo Sirainen <timo@sirainen.com> wrote:
On 29. Mar 2020, at 15.15, Andi Krischer <ak@michaelpelzer.de> wrote:
We have a really strange bug with our dovecot setup. After weeks and month of debugging we’re finally reached the point when we ask you for help.
Our basic setup is quite simple: a single dovecot 2.3.10 (0da0eff44) install on Debian 10 with LDAP as user/passdb and sieve, imap_sieve and mail_crypt as mail_plugins.
Was this happening with old Dovecot versions, or did it start after v2.3.10 upgrade?
Yes, we implemented this setup with v2.3.5.1 on Debian 9 and had the issue right from the beginning.
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy)
It shows mailbox_free() in the backtrace, so it's probably crashing in mail_crypt_mailbox_close()'s MAIL_CRYPT_CONTEXT(box). But looking at the code, I can't see how that could be possible. Maybe there is some kind of memory corruption or something.
dovecot: lmtp(53852): Fatal: master: service(lmtp): child 53852 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
Could you get core dumps enabled? See the link. And from the core dump send at least:
bt full fr 6 (or maybe 5 or 7, whichever makes the following command work) p *box
Also please send your doveconf -n output.
Core dump: https://pastebin.com/g4VkMXsT (hope this is everything - I’m not that familiar with core dumps…)
Doveconf: https://pastebin.com/JnhXkYUA
Hope this helps!
On 30. Mar 2020, at 14.36, Andi Krischer <ak@michaelpelzer.de> wrote:
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy)
It shows mailbox_free() in the backtrace, so it's probably crashing in mail_crypt_mailbox_close()'s MAIL_CRYPT_CONTEXT(box). But looking at the code, I can't see how that could be possible. Maybe there is some kind of memory corruption or something.
dovecot: lmtp(53852): Fatal: master: service(lmtp): child 53852 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
Could you get core dumps enabled? See the link. And from the core dump send at least:
bt full fr 6 (or maybe 5 or 7, whichever makes the following command work) p *box
Also please send your doveconf -n output.
Core dump: https://pastebin.com/g4VkMXsT (hope this is everything - I’m not that familiar with core dumps…)
It's missing debug symbols. Are you using binaries from repo.dovecot.org <http://repo.dovecot.org/>? You'd need to install dovecot-dbg package to get more useful output.
And/or assuming it's installed from repo.dovecot.org <http://repo.dovecot.org/>, it would help if you could send me privately the core file and related libs (the core could contain some sensitive info like pieces of emails): dovecot-sysreport --core /usr/lib/dovecot/lmtp /path/to/core
On 29. Mar 2020, at 17.07, Timo Sirainen <timo@sirainen.com> wrote:
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy)
It shows mailbox_free() in the backtrace, so it's probably crashing in mail_crypt_mailbox_close()'s MAIL_CRYPT_CONTEXT(box). But looking at the code, I can't see how that could be possible. Maybe there is some kind of memory corruption or something.
Looks like there's a bug when some users have mail_crypt plugin loaded and other users don't. If LMTP delivers a mail to both users within the same session, it crashes at deinit. A workaround could be to tell MTA to deliver to max recipient at a time. We'll get this fixed also to some future release. You could also apply a patch: diff --git a/src/plugins/mail-crypt/mail-crypt-plugin.c b/src/plugins/mail-crypt/mail-crypt-plugin.c index 43ece3d3b..a37fb4a35 100644 --- a/src/plugins/mail-crypt/mail-crypt-plugin.c +++ b/src/plugins/mail-crypt/mail-crypt-plugin.c @@ -345,7 +345,7 @@ static void mail_crypt_mailbox_close(struct mailbox *box) { struct mail_crypt_mailbox *mbox = MAIL_CRYPT_CONTEXT(box); struct mail_crypt_user *muser = - MAIL_CRYPT_USER_CONTEXT_REQUIRE(box->storage->user); + MAIL_CRYPT_USER_CONTEXT(box->storage->user); if (mbox->pub_key != NULL) dcrypt_key_unref_public(&mbox->pub_key);
On 31. Mar 2020, at 13.47, Timo Sirainen <timo@sirainen.com> wrote:
On 29. Mar 2020, at 17.07, Timo Sirainen <timo@sirainen.com> wrote:
This is what the log says in the moment of crashing: dovecot: lmtp(53852): Panic: Module context mail_crypt_user_module missing *** backtrace *** (see: https://pastebin.com/YCiFtxmy)
It shows mailbox_free() in the backtrace, so it's probably crashing in mail_crypt_mailbox_close()'s MAIL_CRYPT_CONTEXT(box). But looking at the code, I can't see how that could be possible. Maybe there is some kind of memory corruption or something.
Looks like there's a bug when some users have mail_crypt plugin loaded and other users don't. If LMTP delivers a mail to both users within the same session, it crashes at deinit. A workaround could be to tell MTA to deliver to max recipient at a time. We'll get this fixed also to some future release. You could also apply a patch:
Looks like there is also a better way to do this. Enable mail_crypt plugin always in dovecot.conf, but return mail_crypt_save_version differently for users. So mail_crypt_save_version=0 for users who don't want encryption, mail_crypt_save_version=2 for users who do want encryption.
participants (2)
-
Andi Krischer
-
Timo Sirainen