Occasional crash in db-auth.c (Valgrind: Invalid read of size 4 et al.), Dovecot 2.2.27+
Dear list,
We are running into trouble with dovecot using LDAP userdbs on Debian stretch (though if I’m reading the valgrind output correctly, the problem also reproduces with vanilla dovecot master). A minimal reproducer is below.
While testing an upgrade to Debian stretch (dovecot-core=1:2.2.27-3+deb9u1), auth-worker has stopped working. We are using two LDAP user databases: one which is iterable, and one which is not (I don’t think the reasons are relevant; if they are, we’re happy to elaborate).
The issue seems to be present whenever an LDAP userdb is iterated, but it only leads to a crash under certain conditions, which our setup triggers reproducibly for reasons we have not identified.
We’re seeing one of two errors.
Variant A:
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: *** Error in `dovecot/auth': free(): corrupted unsorted chunks: 0x000056553f0fcfa0 ***
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: ======= Backtrace: =========
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7f78426adbcb]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x7f78426b3f96]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x777de)[0x7f78426b47de]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x5bde)[0x7f7842008bde]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x52)[0x7f78430cfdd2]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x109)[0x7f78430d1409]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x3c)[0x7f78430cfe6c]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)[0x7f78430d0018]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13)[0x7f7843057e93]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: dovecot/auth(main+0x398)[0x56553da45f98]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f784265d2b1]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: dovecot/auth(_start+0x2a)[0x56553da461aa]
Variant B:
Jan 27 13:06:56 up2 dovecot: auth-worker(27495): Panic: file db-ldap.c: line 840 (db_ldap_result_unref): assertion failed: (res->refcount > 0)
Jan 27 13:06:56 up2 dovecot: auth-worker(27495): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(+0x95272) [0x7f027a4fc272] -> /usr/lib/dovecot/libdovecot.so.0(+0x9536d) [0x7f027a4fc36d] -> /usr/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f027a492951] -> /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x3868) [0x7f0279447868] -> /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x5d7c) [0x7f0279449d7c] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x52) [0x7f027a510dd2] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x109) [0x7f027a512409] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x3c) [0x7f027a510e6c] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38) [0x7f027a511018] -> /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13) [0x7f027a498e93] -> dovecot/auth(main+0x398) [0x55b981cd2f98] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f0279a9e2b1] -> dovecot/auth(_start+0x2a) [0x55b981cd31aa]
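For anyone who wants to resolve the unnamed frames (the "+0x..." entries in libauthdb_ldap.so), here is a minimal Python sketch that splits a "Raw backtrace:" line like the one above into its frames. It is only an illustration: the names Frame and parse_raw_backtrace are made up for this example and are not part of Dovecot, and the pattern is derived solely from the frame layout visible in the log above. It tolerates stray whitespace that mail line wrapping may have introduced inside a frame.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    """One backtrace frame (hypothetical helper type, not a Dovecot structure)."""
    module: str    # e.g. /usr/lib/dovecot/modules/auth/libauthdb_ldap.so
    location: str  # "symbol+0xoff", or just "+0xoff" when no symbol is known
    address: int   # absolute address printed in square brackets


# module(location)[0xaddress], matched after all whitespace has been removed
FRAME_RE = re.compile(
    r"(?P<module>\S+?)\((?P<location>[^)]*)\)\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_raw_backtrace(line: str) -> List[Frame]:
    """Split a 'Raw backtrace:' log line into frames.

    Assumption: frames are separated by '->' and each frame looks like
    'module(location) [0xaddress]', as in the log excerpt above.
    """
    _, found, tail = line.partition("Raw backtrace:")
    text = tail if found else line
    frames = []
    for part in text.split("->"):
        # Drop whitespace, including breaks inserted by mail line wrapping.
        compact = "".join(part.split())
        match = FRAME_RE.fullmatch(compact)
        if match:
            frames.append(
                Frame(match["module"], match["location"], int(match["address"], 16))
            )
    return frames
```

Frames whose location starts with "+" carry only an offset into the module; with the matching debug symbols installed, such an offset could then be passed to a tool like addr2line (addr2line -e MODULE OFFSET) to obtain a source line.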
When running auth-worker under valgrind, we get more information. The relevant mail.log extracts from the valgrind runs against two versions of dovecot are attached.
This seems to affect 2.2.27 and later. We can reproduce the crash with our production data 100% of the time. The valgrind error is present even with very minimal setups (see below). Unfortunately, I have not yet been able to create an LDAP database which produces the crash as reliably as our production database does (which we cannot share for obvious reasons).
Minimal Reproducer of Valgrind error:
1. Install dovecot 2.2.27 from Debian stretch, or compile from master.
2. Install valgrind.
3. In conf.d/10-auth.conf, disable auth-system.conf.ext.
4. In conf.d/10-auth.conf, enable auth-ldap.conf.ext.
5. Set the contents of dovecot-ldap.conf.ext to:

       hosts = localhost
       base = dc=nodomain

6. In conf.d/10-master.conf, in section "service auth-worker", set

       executable = /usr/bin/valgrind /usr/lib/dovecot/auth -w

   (path may differ on your system)
7. Install an LDAP server with a database for dc=nodomain; this is trivial to do on Debian: simply install slapd and run systemctl start slapd.
8. Start dovecot.
9. Run doveadm user '*'.

You should find the error in the mail.log.
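To make the last step less of a manual search, a small Python sketch like the following could pull the valgrind messages out of the log. It rests on two assumptions: that valgrind's usual "==PID==" line prefix survives in the syslog lines, and that the interesting reports start with "Invalid read" or "Invalid write" (as in the subject of this mail). The function names are invented for this example.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Valgrind prefixes every line it prints with ==PID==.
VALGRIND_RE = re.compile(r"==(\d+)==\s?(.*)$")


def valgrind_messages(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (pid, message) for every log line that carries a valgrind prefix."""
    for line in lines:
        match = VALGRIND_RE.search(line.rstrip("\n"))
        if match:
            yield int(match.group(1)), match.group(2)


def invalid_accesses(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Keep only the headline lines of invalid read/write reports."""
    return [
        (pid, text)
        for pid, text in valgrind_messages(lines)
        if text.startswith(("Invalid read", "Invalid write"))
    ]
```

It could be used as invalid_accesses(open(path)) with path pointing at the mail log; the location of that file (for example /var/log/mail.log) depends on the syslog configuration and is an assumption here.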
I hope this is somehow useful to fix our crash issue. We’ll be happy to provide more information as needed.
kind regards, Jonas
Hi, thank you for these. Can you send doveconf -n for your minimal reproducer?
Aki
On January 27, 2018 at 4:37 PM Jonas Wielicki <jonas@wielicki.name> wrote:
[...]
On Saturday, 27 January 2018 21:33:51 CET you wrote:
Hi thank you for these, can you send doveconf -n for your minimal reproducer?
Ah darn, I was so caught up getting the valgrind traces that I forgot about that. Here you go:
# 2.4.devel (54d0a5a30): /usr/local/etc/dovecot/dovecot.conf
# OS: Linux 4.14.0-2-amd64 x86_64 Debian buster/sid
# Hostname: sinistra.sotecware.net
auth_debug = yes
mail_debug = yes
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
}
passdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
service auth-worker {
  executable = /usr/bin/valgrind /usr/local/libexec/dovecot/auth -w
}
ssl = no
userdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
(This is from the compile-from-source setup on my Debian buster/testing machine. The stretch one looks essentially identical, except that I didn’t have to disable SSL there and the paths differ.)
kind regards, Jonas
On Sunday, 28 January 2018 13:59:24 CEST Jonas Wielicki wrote:
On Saturday, 27 January 2018 21:33:51 CET you wrote:
Hi thank you for these, can you send doveconf -n for your minimal reproducer?
Has this been fixed in any release? I’m not sure how to figure this out, unfortunately.
kind regards, Jonas
On 14 Sep 2018, at 12.22, Jonas Schäfer <jonas@wielicki.name> wrote:
On Sunday, 28 January 2018 13:59:24 CEST Jonas Wielicki wrote:
On Saturday, 27 January 2018 21:33:51 CET you wrote:
Hi thank you for these, can you send doveconf -n for your minimal reproducer?
Has this been fixed in any release? I’m not sure how to figure this out, unfortunately.
Not in a release yet, but it is in git master: https://github.com/dovecot/core/commit/90bd9600a0e38e55c02c6266c1270fdd4138c...
Participants (4): Aki Tuomi, Jonas Schäfer, Jonas Wielicki, Timo Sirainen