Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
Dear list,
We've been having issues where the indexer-worker crashes. This happens on production servers handling only a modest amount of mail, and it is also reproducible by moving messages. On top of that, users on my server are complaining about "Trashed" items coming back, etc.
Some details:
Dovecot 2.3.2.1 in an alpine:3.8-based Docker container, as part of the Mailu distribution (https://github.com/Mailu/Mailu).
I first saw this issue on my production server, which stores mail on GlusterFS.
I've been able to reproduce it by running the Docker container on a virtual machine, using local storage.
There is another Mailu user reporting the same problem on a different VM provider / disk infrastructure: https://github.com/Mailu/Mailu/issues/751
Libc: musl-1.1.19-r10
Output of dovecot -n:
# dovecot -n
# 2.3.2.1 (0719df592): /etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.2 (7704de5e)
# OS: Linux 4.16.3-301.fc28.x86_64 x86_64 ext4
# Hostname: 98a2726271d3
auth_mechanisms = plain login
disable_plaintext_auth = no
first_valid_gid = 8
first_valid_uid = 8
hostname = mail.usrpro.io
log_path = /dev/stderr
mail_access_groups = mail
mail_gid = mail
mail_home = /mail/%u
mail_location = maildir:/mail/%u
mail_plugins = " fts fts_lucene quota quota_clone zlib"
mail_privileged_group = mail
mail_uid = mail
mail_vsize_bg_after_count = 100
maildir_stat_dirs = yes
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext spamtest spamtestplus editheader imapsieve vnd.dovecot.imapsieve
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Junk {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox Trash {
    auto = subscribe
    special_use = \Trash
  }
  prefix =
}
passdb {
  args = /etc/dovecot/auth.conf
  driver = dict
}
plugin {
  fts = lucene
  fts_autoindex = yes
  fts_autoindex_exclude = \Junk
  fts_lucene = whitespace_chars=@.
  imapsieve_mailbox1_before = file:/conf/report-spam.sieve
  imapsieve_mailbox1_causes = COPY
  imapsieve_mailbox1_name = Junk
  imapsieve_mailbox2_before = file:/conf/report-ham.sieve
  imapsieve_mailbox2_causes = COPY
  imapsieve_mailbox2_from = Junk
  imapsieve_mailbox2_name = *
  quota = count:User quota
  quota_clone_dict = proxy:/tmp/podop.socket:quota
  quota_vsizes = yes
  sieve = file:~/sieve;active=~/.dovecot.sieve
  sieve_before = dict:proxy:/tmp/podop.socket:sieve
  sieve_execute_bin_dir = /conf/bin
  sieve_extensions = +spamtest +spamtestplus +editheader
  sieve_global_extensions = +vnd.dovecot.execute
  sieve_plugins = sieve_imapsieve sieve_extprograms
  sieve_spamtest_max_value = 15
  sieve_spamtest_status_header = X-Spam-Level
  sieve_spamtest_status_type = strlen
  sieve_vacation_dont_check_recipient = yes
  sieve_vacation_send_from_recipient = yes
}
postmaster_address = admin@usrpro.io
protocols = imap pop3 lmtp sieve
service auth-worker {
  unix_listener auth-worker {
    group = mail
    mode = 0660
    user = dovecot
  }
  user = mail
}
service auth {
  user = dovecot
}
service imap-login {
  inet_listener imap {
    port = 143
  }
}
service lmtp {
  inet_listener lmtp {
    port = 2525
  }
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
}
submission_host = 192.168.203.6
userdb {
  args = /etc/dovecot/auth.conf
  driver = dict
}
protocol imap {
  mail_plugins = " fts fts_lucene quota quota_clone zlib imap_quota imap_sieve"
}
protocol lmtp {
  mail_plugins = " fts fts_lucene quota quota_clone zlib sieve"
  recipient_delimiter = +
}
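Since fts_autoindex = yes, indexing is triggered by deliveries and message moves; the crash can also be provoked on demand with doveadm, which goes through the same indexer-worker path (shown here against my own setup):

# Force (re)indexing of a single mailbox for one user.
doveadm index -u admin@usrpro.io INBOX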
And the actual error log:
imap_1 | Dec 19 16:31:08 indexer-worker(admin@usrpro.io)<490><m+t5VmJ93K7AqMsG:grc9HUxyGlzqAQAANEhNiw>: Fatal: master: service(indexer-worker): child 490 killed with signal 11 (core dumped)
imap_1 | Dec 19 16:31:09 indexer-worker(admin@usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: lucene index /mail/admin@usrpro.io/lucene-indexes: IndexWriter() failed (#1): Lock obtain timed out
imap_1 | Dec 19 16:31:09 indexer-worker(admin@usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: Mailbox INBOX: Mail search failed: Internal error occurred. Refer to server log for more information. [2018-12-19 16:31:08]
imap_1 | Dec 19 16:31:09 indexer-worker(admin@usrpro.io)<493><m+t5VmJ93K7AqMsG:HRLEK0xyGlztAQAANEhNiw>: Error: Mailbox INBOX: Transaction commit failed: FTS transaction commit failed: backend deinit (attempted to index 1 messages (UIDs 1299..1299))
imap_1 | Dec 19 16:31:10 indexer-worker(admin@usrpro.io)<493><m+t5VmJ93K7AqMsG:GKWdMU1yGlztAQAANEhNiw>: Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
imap_1 | Dec 19 16:31:10 indexer: Error: Indexer worker disconnected, discarding 1 requests for admin@usrpro.io
imap_1 | Dec 19 16:31:11 indexer-worker(admin@usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: lucene index /mail/admin@usrpro.io/lucene-indexes: IndexWriter() failed (#1): Lock obtain timed out
imap_1 | Dec 19 16:31:11 indexer-worker(admin@usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: Mailbox INBOX: Mail search failed: Internal error occurred. Refer to server log for more information. [2018-12-19 16:31:10]
imap_1 | Dec 19 16:31:11 indexer-worker(admin@usrpro.io)<494><m+t5VmJ93K7AqMsG:MRCzBE5yGlzuAQAANEhNiw>: Error: Mailbox INBOX: Transaction commit failed: FTS transaction commit failed: backend deinit (attempted to index 1 messages (UIDs 1310..1310))
imap_1 | Dec 19 16:31:11 indexer: Error: Indexer worker disconnected, discarding 1 requests for admin@usrpro.io
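Note the pattern: each "Lock obtain timed out" directly follows a segfault, which would fit a crashed worker leaving a stale Lucene write lock behind, blocking the next IndexWriter() until it goes away. Stale locks can be hunted down with something like the following (the *.lock name pattern and locations are my guess; CLucene's lock handling is version-specific):

# A worker killed by SIGSEGV never gets to release its index write lock.
find /mail/admin@usrpro.io/lucene-indexes /tmp -maxdepth 1 -name '*.lock' 2>/dev/null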
I managed to find a core dump file, which had appeared outside of the container. So I copied it back in, installed gdb, and ran it.
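Roughly, the steps were as follows (a sketch from memory; the container name here is made up):

# Cores from a container land wherever the *host* kernel's core_pattern
# points, since the kernel is shared between host and container.
cat /proc/sys/kernel/core_pattern

# Copy the core back into the container, so gdb resolves the exact
# musl/Dovecot binaries that produced the crash, and debug in place.
docker cp ./core imap_container:/tmp/core
docker exec -it imap_container sh -c \
    'apk add gdb && gdb /usr/libexec/dovecot/indexer-worker /tmp/core'

Which gave: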
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/dovecot/indexer-worker...(no debugging symbols found)...done.
[New LWP 1075]
warning: Can't read pathname for load map: No error information.
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fbd9a31c11a in free () from /lib/ld-musl-x86_64.so.1
So this seems musl-related. I installed musl-dbg and ran it again:
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/dovecot/indexer-worker...(no debugging symbols found)...done.
[New LWP 1075]
warning: Can't read pathname for load map: No error information.
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
108     ./arch/x86_64/atomic_arch.h: No such file or directory.
Now I'm kind of lost in space. I don't know where that header file is; I tried running "find" on the filesystem and a Google search, but nothing specific showed up.
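For reference, that header only exists in musl's source tree (the ./arch/... path is relative to musl's build directory), which is why find turned up nothing on the installed system. From my reading of the musl 1.1.x source, the x86_64 version is essentially:

/* arch/x86_64/atomic_arch.h (musl source tree, quoted from memory, so
 * treat as approximate). a_crash() is musl's deliberate hard stop:
 * free() calls it when it detects an inconsistent heap, so the fault is
 * raised where corruption is detected, not necessarily where it was
 * caused. */
static inline void a_crash()
{
	__asm__ __volatile__( "hlt" : : : "memory" );
}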
I am starting to feel this bug is more musl-related than Dovecot-related. Since musl has bitten our project before, I'm considering moving the project to Debian-based images. But I want to be 100% sure this is not a Dovecot bug.
Note: earlier I created an image with alpine:edge, with musl 1.1.20-r2 and Dovecot 2.3.3. Running that image I saw the same error symptoms (it was just hopeful trial and error). I did not do any debugging on that one.
Thanks in advance! Tim
Hope this helps. There are no debugging symbols available for Dovecot in the Alpine repository.
(gdb) bt full
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
No locals.
#1  free (p=0x55d949250660) at src/malloc/malloc.c:467
        extra = 65
        base = 0x55d94925060f ""
        len = 94391723427649
        self = 0x55d949250650
        next = <optimized out>
        final_size = <optimized out>
        new_size = <optimized out>
        size = <optimized out>
        reclaim = 0
        i = <optimized out>
#2  0x00007fbd98ae7e29 in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#3  0x00007fbd98ae87ee in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#4  0x00007fbd98b3639d in lucene::index::IndexWriter::init(lucene::store::Directory*, lucene::analysis::Analyzer*, bool, bool, lucene::index::IndexDeletionPolicy*, bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#5  0x00007fbd98b36883 in lucene::index::IndexWriter::IndexWriter(char const*, lucene::analysis::Analyzer*, bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#6  0x00007fbd98df7cea in lucene_index_build_init () from /usr/lib/dovecot/lib21_fts_lucene_plugin.so
No symbol table info available.
#7  0x00007fbd98df677b in ?? () from /usr/lib/dovecot/lib21_fts_lucene_plugin.so
No symbol table info available.
#8  0x00007fbd99836706 in fts_backend_update_set_build_key () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#9  0x00007fbd99837c25 in fts_build_mail () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#10 0x00007fbd9983c63a in ?? () from /usr/lib/dovecot/lib20_fts_plugin.so
No symbol table info available.
#11 0x00007fbd9a008ea7 in mail_precache () from /usr/lib/dovecot/libdovecot-storage.so.0
No symbol table info available.
#12 0x000055d9471ce6c0 in ?? ()
No symbol table info available.
#13 0x00007fbd99d41beb in io_loop_call_io () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#14 0x00007fbd99d430ab in io_loop_handler_run_internal () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#15 0x00007fbd99d41d56 in io_loop_handler_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#16 0x00007fbd99d41e69 in io_loop_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#17 0x00007fbd99cd5d32 in master_service_run () from /usr/lib/dovecot/libdovecot.so.0
No symbol table info available.
#18 0x000055d9471cdfa9 in main ()
No symbol table info available.
On 12/19/18 8:08 PM, Aki Tuomi wrote:
On 19 December 2018 at 20:00 Tim Mohlmann via dovecot <dovecot@dovecot.org> wrote:
<Deleted due to message size limit>
Can you run bt full on gdb and post that?
Aki Tuomi
I've managed to recompile the Alpine packages inside the container, generating debug symbols and disabling compile-time optimizations, since those also disturbed the backtrace. The following output looks much more meaningful, but unfortunately some symbols are still missing for lucene-core. Let me know if this is enough to work with. (It doesn't seem to be musl after all?)
Reading symbols from /usr/libexec/dovecot/indexer-worker...Reading symbols from /usr/lib/debug//usr/libexec/dovecot/indexer-worker.debug...done.
done.
[New LWP 30417]

warning: Can't read pathname for load map: No error information.
Core was generated by `dovecot/indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
(gdb) bt full
#0  0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#1  0x00007f1653edeba2 in lucene::index::KeepOnlyLastCommitDeletionPolicy::onCommit(std::vector<lucene::index::IndexCommitPoint*, std::allocator<lucene::index::IndexCommitPoint*> >&) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#2  0x00007f1653f15312 in ?? () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#3  0x00007f1653f325e3 in lucene::index::IndexWriter::doFlush(bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#4  0x00007f1653f32f9c in lucene::index::IndexWriter::flush(bool, bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#5  0x00007f1653f33052 in lucene::index::IndexWriter::closeInternal(bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#6  0x00007f1653f337c4 in lucene::index::IndexWriter::close(bool) () from /usr/lib/libclucene-core.so.1
No symbol table info available.
#7  0x00007f16541fc8c6 in lucene_index_build_deinit (index=0x55945fe7a860) at lucene-wrapper.cc:646
        ret = 0
#8  0x00007f16541f9ba4 in fts_backend_lucene_update_deinit (_ctx=0x55945fe910a0) at fts-backend-lucene.c:259
        ctx = 0x55945fe910a0
        backend = 0x55945fe7a760
        ret = 0
        __func__ = "fts_backend_lucene_update_deinit"
#9  0x00007f1654649baa in fts_backend_update_deinit (_ctx=0x55945fe7a480) at fts-api.c:144
        ctx = 0x55945fe910a0
        backend = 0x55945fe7a760
        ret = 0
#10 0x00007f1654655d83 in fts_transaction_end (t=0x55945fe88820, error_r=0x7fff1b28d1b8) at fts-storage.c:613
        ft = 0x55945fe84400
        flist = 0x55945fe7a3a0
        ret = 0
        __func__ = "fts_transaction_end"
#11 0x00007f16546562e8 in fts_transaction_commit (t=0x55945fe88820, changes_r=0x7fff1b28d2c0) at fts-storage.c:696
        ft = 0x55945fe84400
--Type <RET> for more, q to quit, c to continue without paging--c
        fbox = 0x55945fe7f798
        box = 0x55945fe7efa8
        autoindex = false
        ret = 0
        error = 0x7f1654813665 <p_strdup_vprintf+90> "H\213E\350\353\067\213E\344\211\302H\213E\330H\211\326H\211\307\350\301\367\377\377H\211E\360\213E\344\203\350\001\211\302H\213M\350H\213E\360H\211\316H\211\307\350\302\357\354\377H\213E\360H\213M\370dH3\f%("
#12 0x00007f165469b83e in quota_mailbox_transaction_commit (ctx=0x55945fe88820, changes_r=0x7fff1b28d2c0) at quota-storage.c:174
        qbox = 0x55945fe7f600
        qt = 0x55945fe8f160
        __func__ = "quota_mailbox_transaction_commit"
#13 0x00007f16548fefad in mailbox_transaction_commit_get_changes (_t=0x7fff1b28d348, changes_r=0x7fff1b28d2c0) at mail-storage.c:2215
        _data_stack_cur_id = 4
        t = 0x55945fe88820
        box = 0x55945fe7efa8
        save_count = 0
        ret = 21908
        __func__ = "mailbox_transaction_commit_get_changes"
#14 0x00007f16548feefc in mailbox_transaction_commit (t=0x7fff1b28d348) at mail-storage.c:2197
        changes = {pool = 0x0, uid_validity = 1418718114, saved_uids = {arr = {buffer = 0x7f16548e5d9e <__x86_return_thunk+5>, element_size = 140733649048408}, v = 0x7f16548e5d9e <__x86_return_thunk+5>, v_modifiable = 0x7f16548e5d9e <__x86_return_thunk+5>}, ignored_modseq_changes = 455660384, changes_mask = (unknown: 0), no_read_perm = false}
        ret = 21908
#15 0x000055945e4abdcb in index_mailbox_precache (conn=0x55945fe57440, box=0x55945fe7efa8) at master-connection.c:126
        storage = 0x55945fe77f28
        username = 0x55945fe73ee0 "admin@usrpro.io"
        box_vname = 0x55945fe7f310 "Sent"
        errstr = 0x7f165469cb53 <quota_mailbox_sync_deinit+164> "\213E\344H\203\304([]\351\320f\377\377UH\211\345H\203\354\060H\211}\330dH\213\004%("
        error = MAIL_ERROR_TEMP
        status = {messages = 10, recent = 0, unseen = 0, uidvalidity = 1545140227, uidnext = 11, first_unseen_seq = 0, first_recent_uid = 3, last_cached_seq = 1, highest_modseq = 0, highest_pvt_modseq = 0, keywords = 0x0, permanent_flags = 0, flags = 0, permanent_keywords = false, allow_new_keywords = false, nonpermanent_modseqs = false, no_modseq_tracking = false, have_guids = true, have_save_guids = true, have_only_guid128 = false}
        trans = 0x0
        search_args = 0x0
        ctx = 0x0
        mail = 0x0
        metadata = {guid = '\000' <repeats 15 times>, virtual_size = 0, physical_size = 0, first_save_date = 0, cache_fields = 0x0, precache_fields = (MAIL_FETCH_STREAM_HEADER | MAIL_FETCH_STREAM_BODY | MAIL_FETCH_RECEIVED_DATE | MAIL_FETCH_SAVE_DATE | MAIL_FETCH_PHYSICAL_SIZE | MAIL_FETCH_VIRTUAL_SIZE | MAIL_FETCH_UIDL_BACKEND | MAIL_FETCH_GUID | MAIL_FETCH_POP3_ORDER), backend_ns_prefix = 0x0, backend_ns_type = (unknown: 0)}
        seq = 2
        first_uid = 2
        last_uid = 10
        percentage_str = "\026\177\000"
        counter = 9
        max = 9
        percentage = 32534
        percentage_sent = 0
        ret = 0
        uids = 0x55945fe48938 " (UIDs 2..10)"
#16 0x000055945e4ac1c1 in index_mailbox (conn=0x55945fe57440, user=0x55945fe73dc8, mailbox=0x55945fe46058 "Sent", max_recent_msgs=0, what=0x55945fe46070 "i") at master-connection.c:205
        ns = 0x55945fe77da0
        box = 0x55945fe7efa8
        status = {messages = 1608873868, recent = 0, unseen = 1581972288, uidvalidity = 21908, uidnext = 455660880, first_unseen_seq = 32767, first_recent_uid = 1608810272, last_cached_seq = 21908, highest_modseq = 139733882519610, highest_pvt_modseq = 94095725476155, keywords = 0x7fff1b28d540, permanent_flags = 0, flags = 0, permanent_keywords = false, allow_new_keywords = false, nonpermanent_modseqs = false, no_modseq_tracking = true, have_guids = true, have_save_guids = false, have_only_guid128 = true}
        path = 0x55945fe47f38 "/mail/admin@usrpro.io/.Sent"
        errstr = 0x7f16546e403a <__x86_return_thunk+5> "\363\220\017\256\350\353\371H\215d$\b\303\350\a"
        error = 32534
        sync_flags = MAILBOX_SYNC_FLAG_FULL_READ
        ret = 0
#17 0x000055945e4ac3d8 in master_connection_input_line (conn=0x55945fe57440, line=0x55945fe5778c "admin@usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti") at master-connection.c:247
        args = 0x55945fe46078
        input = {parent_event = 0x0, module = 0x55945e4ad2ec "mail", service = 0x55945e4ad2f1 "indexer-worker", username = 0x55945fe46048 "admin@usrpro.io", session_id = 0x0, session_id_prefix = 0x55945fe4605d "2HvXT2h9epPAqMsG", session_create_time = 0, local_ip = {family = 0, u = {ip6 = {__in6_union = {__s6_addr = '\000' <repeats 15 times>, __s6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, ip4 = {s_addr = 0}}}, remote_ip = {family = 0, u = {ip6 = {__in6_union = {__s6_addr = '\000' <repeats 15 times>, __s6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, ip4 = {s_addr = 0}}}, local_port = 0, remote_port = 0, userdb_fields = 0x0, flags_override_add = (unknown: 0), flags_override_remove = (unknown: 0), no_userdb_lookup = false, debug = false, conn_secured = false, conn_ssl_secured = false}
        service_user = 0x55945fe59958
        user = 0x55945fe73dc8
        str = 0x7f16547f6231 <pool_system_free+28> "\220\311\351\375\335\356\377UH\211\345H\203\354 H\211}\370H\211u\360H\211U\350H\211M\340H\203", <incomplete sequence \340>
        error = 0x0
        max_recent_msgs = 0
        ret = 32767
#18 0x000055945e4ac58f in master_connection_input (conn=0x55945fe57440) at master-connection.c:287
        _data_stack_cur_id = 3
        line = 0x55945fe5778c "admin@usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti"
        ret = 32534
#19 0x00007f16547e73c8 in io_loop_call_io (io=0x55945fe57480) at ioloop.c:698
        ioloop = 0x55945e4afca0
        t_id = 2
        __func__ = "io_loop_call_io"
#20 0x00007f16547ea409 in io_loop_handler_run_internal (ioloop=0x55945e4afca0) at ioloop-epoll.c:221
        ctx = 0x7f1654b0dda0
        events = 0x55945fe548c0
        event = 0x55945fe548c0
        list = 0x55945fe57500
        io = 0x55945fe57480
        tv = {tv_sec = 2147483, tv_usec = 0}
        events_count = 5
        msecs = -1
        ret = 1
        i = 0
        j = 0
        call = true
        __func__ = "io_loop_handler_run_internal"
#21 0x00007f16547e7657 in io_loop_handler_run (ioloop=0x55945e4afca0) at ioloop.c:750
        __func__ = "io_loop_handler_run"
#22 0x00007f16547e752e in io_loop_run (ioloop=0x55945e4afca0) at ioloop.c:723
        __func__ = "io_loop_run"
#23 0x00007f165471b3d5 in master_service_run (service=0x55945e4afb40, callback=0x55945e4ab5ed <client_connected>) at master-service.c:775
No locals.
#24 0x000055945e4ab7ce in main (argc=1, argv=0x55945e4af8e0) at indexer-worker.c:78
        service_flags = (MASTER_SERVICE_FLAG_KEEP_CONFIG_OPEN | MASTER_SERVICE_FLAG_SEND_STATS)
        storage_service_flags = (MAIL_STORAGE_SERVICE_FLAG_USERDB_LOOKUP | MAIL_STORAGE_SERVICE_FLAG_TEMP_PRIV_DROP | MAIL_STORAGE_SERVICE_FLAG_NO_IDLE_TIMEOUT)
        c = -1
On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
I'm really sorry, but it was a long day yesterday: I forgot to mention that I used the alpine:edge image (based on the Alpine master branch) to set up the build environment, so there were some version changes.
dovecot --version: 2.3.3 (dcead646b)
apk info musl:
musl-1.1.20-r2 description:
the musl c library (libc) implementation
Configuration was unchanged with respect to my first mail. Maybe it's helpful to share the build flags:
export CFLAGS="-O0"
export CXXFLAGS="$CFLAGS"
export CPPFLAGS="$CFLAGS"
export LDFLAGS="-Wl,--as-needed"
The APKBUILD file can be found here: https://git.alpinelinux.org/cgit/aports/tree/main/dovecot/APKBUILD. I've modified the following (the rebuild steps themselves are sketched after this snippet):
options="libtool !strip"
subpackages="
....
$pkgname-dbg
"
And in case it's of any interest, I've done a docker commit on the container and pushed the resulting image as muhlemmer/dovecot:debug: https://cloud.docker.com/repository/docker/muhlemmer/dovecot/tags. I don't know if it's of much use by itself, without the Mailu ecosystem around it. Just in case.
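To poke around in it directly, something like this should drop you into a shell in that image (hypothetical invocation; the real entrypoint and configuration are Mailu-specific):

docker pull muhlemmer/dovecot:debug
docker run --rm -it --entrypoint sh muhlemmer/dovecot:debug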
Cheers!
On 12/20/18 2:04 AM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
Hi, I know this is kind of impolite to ask, but I would like some feedback on this issue. I'm representing the Mailu user group here, plus my own production server is affected and has started to crash due to file corruption resulting from the indexer-worker segfaults. Last night my server became unavailable, with logs like:
Error: imap(tim@usrpro.com)<131><2RGyoYd9Jq/AqMsO>: Corrupted transaction log file /mail/tim@usrpro.com/dovecot.index.log seq 6: record size too small (type=0x2272560, offset=26340, size=0) (sync_offset=26592)
Deleting the corrupted files brought it back, but I don't really like the prospect of cleaning up corrupted files over Christmas. ;)
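For the record, the "cleaning" just means removing Dovecot's regenerable index data for the affected user, along these lines (paths match my mail_location; stop Dovecot first; this sketch assumes the corruption is limited to the index files):

# dovecot.index* files are rebuilt automatically on next access;
# the mail itself (maildir cur/ new/ tmp/) is not touched.
rm -f /mail/tim@usrpro.com/dovecot.index \
      /mail/tim@usrpro.com/dovecot.index.log \
      /mail/tim@usrpro.com/dovecot.index.cache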
I would like to know: do you consider this a real Dovecot bug? Or is it a result of the way Alpine builds the package? Should I cross-post the report to Alpine? Or would it be worth switching the image to Debian instead?
Or maybe you don't consider the trace good enough to work with? I would like to know that too.
Many thanks in advance!
Tim
On 12/20/18 9:49 AM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
On 21.12.2018 15.01, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
<Deleted due to message size limit>
This unfortunately looks a lot like a Lucene problem.
Aki
Ok, thanks a lot for the reply. I'll keep digging then :)
On 12/21/18 4:19 PM, Aki Tuomi wrote:
On 21.12.2018 15.01, Tim Mohlmann via dovecot wrote:
Hi, I know this is kind of impolite to ask, but I would like some feedback on this issue. I'm representing the Mailu user group here, and on top of that my own production server is affected and has started to crash due to file corruption caused by the indexer-worker segfaults. Last night my server became unavailable with logs like:
Error: imap(tim@usrpro.com)<131><2RGyoYd9Jq/AqMsO>: Corrupted transaction log file /mail/tim@usrpro.com/dovecot.index.log seq 6: record size too small (type=0x2272560, offset=26340, size=0) (sync_offset=26592)
Deleting the corrupted files brought it back, but I don't really like the prospect of cleaning up corrupted files during Christmas. ;)
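For the record, the cleanup amounted to something along these lines (the path is the one from the error above; doveadm force-resync is probably the cleaner way to achieve the same):

rm /mail/tim@usrpro.com/dovecot.index.log    # Dovecot recreates the file on next access
doveadm force-resync -u tim@usrpro.com '*'   # or rebuild the indexes this way instead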
I would like to know: do you consider this a real Dovecot bug, or is it a result of the way Alpine configures the package? Should I cross-post the report to Alpine? Or would it be worth changing the image to Debian instead?
Or maybe you don't consider the trace good enough to work with? I would like to know that as well.
Many thanks in advance!
Tim
On 12/20/18 9:49 AM, Tim Mohlmann via dovecot wrote:
I'm really sorry, but it was a long day yesterday. I forgot to mention that I used the alpine:edge image (based on the Alpine master branch) to set up the build environment, so there were some version changes.
dovecot --version: 2.3.3 (dcead646b)
apk info musl:
musl-1.1.20-r2 description:
the musl c library (libc) implementation
Configuration was unchanged with respect to my first mail. Maybe it's helpful to share the build flags:
export CFLAGS="-O0"
export CXXFLAGS="$CFLAGS"
export CPPFLAGS="$CFLAGS"
export LDFLAGS="-Wl,--as-needed"
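Apart from the flags, the build itself was just the standard abuild cycle, together with the APKBUILD changes shown below. Roughly, as a sketch (exact user and key setup may differ):

apk add alpine-sdk                               # brings in abuild, gcc, make, etc.
adduser -D builder && addgroup builder abuild    # abuild refuses to run as root
su builder
git clone https://git.alpinelinux.org/aports
cd aports/main/dovecot                           # apply the APKBUILD changes below
# export the flags above, then:
abuild-keygen -a -i                              # one-time package signing setup
abuild -r                                        # builds dovecot plus the -dbg subpackage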
The APKBUILD file can be found here: https://git.alpinelinux.org/cgit/aports/tree/main/dovecot/APKBUILD and I've modified the following:
options="libtool !strip"
subpackages="
....
$pkgname-dbg
"
And if it is of any interest, I've done a
docker commit
on the container and pushed the resulting image as muhlemmer/dovecot:debug: https://cloud.docker.com/repository/docker/muhlemmer/dovecot/tags. Don't know if it's of much use by itself, without the Mailu ecosystem around it. Just in case. Cheers!
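PS: if anyone wants to poke at the image, something along these lines should get a gdb session going (the core file path is only an example; where cores end up depends on the host's kernel.core_pattern):

docker pull muhlemmer/dovecot:debug
docker run --rm -it --ulimit core=-1 --entrypoint sh muhlemmer/dovecot:debug
# reproduce the crash, then, still inside the container:
gdb /usr/libexec/dovecot/indexer-worker /core.indexer-worker   # example core path
(gdb) bt full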
On 12/20/18 2:04 AM, Tim Mohlmann via dovecot wrote:
I've managed to re-compile the Alpine packages inside the container, creating the debug symbols and disabling compile-time optimizations, since the optimizations also disturbed the backtrace. The following output looks much more meaningful, but unfortunately there are still some symbols missing for lucene-core. Let me know if this is enough to work with. (It doesn't seem to be musl after all?)
<Trimmed: full "bt full" output and quoted replies identical to the copies earlier in this thread.>
Hello, have you managed to solve your issue? I have a Mailu/Dovecot server in production which is giving me the same error. Any help would be appreciated.
-- Sent from: http://dovecot.2317879.n4.nabble.com/
do you use both fts-solr and tika by chance?
-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 (c) E-Mail: larryrtx@gmail.com US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
On 11/01/2019 at 21:07, Larry Rosenman wrote:
do you use both fts-solr and tika by chance?
I doubt it. The original post is about Lucene and it fails in the Lucene code. The FTS Solr + Tika problem originates in Dovecot's lib-http.
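(For reference, that combination means having something like this in the configuration; the host names are placeholders:)

plugin {
  fts = solr
  fts_solr = url=http://solr.example.com:8983/solr/dovecot/
  fts_tika = http://tika.example.com:9998/tika/
}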
Regards,
Stephan.
Ok, thanks for the info.
Hadn't read the whole thread.
-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 (c) E-Mail: larryrtx@gmail.com US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106
participants (5):
- Aki Tuomi
- Fírvida
- Larry Rosenman
- Stephan Bosch
- Tim Mohlmann