Hi, thank you for these. Can you send the output of doveconf -n for your minimal reproducer?
Aki
On January 27, 2018 at 4:37 PM Jonas Wielicki jonas@wielicki.name wrote:
Dear list,
We are encountering troubles with dovecot using LDAP userdbs on Debian stretch (but if I’m reading valgrind correctly, we can reproduce this with vanilla dovecot master). Minimal reproducer below.
While testing an upgrade to Debian stretch (dovecot-core=1:2.2.27-3+deb9u1), auth-worker has stopped working. We are using two LDAP user databases; one which is iterable, and one which is not (for reasons; I think this is not relevant; if it is, we’re happy to elaborate).
The issue appears to be present in any LDAP userdb iteration handling, but it only causes a crash under certain conditions, which our setup triggers reproducibly for some reason.
We’re seeing one of two errors.
Variant A:
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: *** Error in `dovecot/auth': free(): corrupted unsorted chunks: 0x000056553f0fcfa0 ***
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: ======= Backtrace: =========
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7f78426adbcb]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x7f78426b3f96]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(+0x777de)[0x7f78426b47de]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x5bde)[0x7f7842008bde]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x52)[0x7f78430cfdd2]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x109)[0x7f78430d1409]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x3c)[0x7f78430cfe6c]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)[0x7f78430d0018]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13)[0x7f7843057e93]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: dovecot/auth(main+0x398)[0x56553da45f98]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f784265d2b1]
Jan 27 13:40:05 up2 dovecot: auth-worker: Error: dovecot/auth(_start+0x2a)[0x56553da461aa]
Variant B:
Jan 27 13:06:56 up2 dovecot: auth-worker(27495): Panic: file db-ldap.c: line 840 (db_ldap_result_unref): assertion failed: (res->refcount > 0)
Jan 27 13:06:56 up2 dovecot: auth-worker(27495): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(+0x95272) [0x7f027a4fc272] -> /usr/lib/dovecot/libdovecot.so.0(+0x9536d) [0x7f027a4fc36d] -> /usr/lib/dovecot/libdovecot.so.0(i_fatal+0) [0x7f027a492951] -> /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x3868) [0x7f0279447868] -> /usr/lib/dovecot/modules/auth/libauthdb_ldap.so(+0x5d7c) [0x7f0279449d7c] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x52) [0x7f027a510dd2] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x109) [0x7f027a512409] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x3c) [0x7f027a510e6c] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x38) [0x7f027a511018] -> /usr/lib/dovecot/libdovecot.so.0(master_service_run+0x13) [0x7f027a498e93] -> dovecot/auth(main+0x398) [0x55b981cd2f98] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f0279a9e2b1] -> dovecot/auth(_start+0x2a) [0x55b981cd31aa]
When running auth-worker under valgrind, we get more information. Extracts from mail.log for the respective valgrind runs are attached for two versions of dovecot.
This seems to affect at least 2.2.27 onwards. We can reproduce the crash with our production data 100% of the time. The valgrind error is present even with very minimal setups (see below). Unfortunately, I have not yet been able to create an LDAP database which produces the crash as reliably as our production database does (which we cannot share for obvious reasons).
Minimal Reproducer of Valgrind error:
Install dovecot 2.2.27 from Debian stretch, or compile from master.
Install valgrind
In conf.d/10-auth.conf, disable auth-system.conf.ext
In conf.d/10-auth.conf, enable auth-ldap.conf.ext
Set contents of dovecot-ldap.conf.ext to:
hosts = localhost
base = dc=nodomain
In conf.d/10-master.conf, in section "service auth-worker", set
executable = /usr/bin/valgrind /usr/lib/dovecot/auth -w
(path may differ on your system)
Install an LDAP server with a database for dc=nodomain; this is trivial to do on Debian: simply install slapd and run systemctl start slapd.
Start dovecot
Run doveadm user '*'
You should find the error in the mail.log.
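For reference, the configuration changes from the steps above could look roughly like the sketch below. This assumes the stock Debian conf.d layout, where 10-auth.conf pulls in the auth backends through !include lines and "disabling" or "enabling" one means commenting or uncommenting that line; the valgrind and auth binary paths are the ones given above and may differ on your system.

```
# conf.d/10-auth.conf (assumed stock layout: one !include line per backend)
#!include auth-system.conf.ext
!include auth-ldap.conf.ext

# dovecot-ldap.conf.ext (complete contents)
hosts = localhost
base = dc=nodomain

# conf.d/10-master.conf: run the auth worker under valgrind
service auth-worker {
  executable = /usr/bin/valgrind /usr/lib/dovecot/auth -w
}
```

Any other settings already present in the "service auth-worker" section can stay as they are; only the executable line is added.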
I hope this is somehow useful to fix our crash issue. We’ll be happy to provide more information as needed.
kind regards, Jonas