Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)

Aki Tuomi aki.tuomi at open-xchange.com
Fri Dec 21 16:19:13 EET 2018


On 21.12.2018 15.01, Tim Mohlmann via dovecot wrote:
> Hi, I now this is kinda impolite to ask, but I would like some feedback
> on this issue. I'm representing the Mailu user group here, plus the fact
> that my own production server is affected and starts to crash due to
> file corruption as a result of the indexer-worker segfaults. Last night
> my server became unavailable with logs like:
>
> Error: imap(tim at usrpro.com)<131><2RGyoYd9Jq/AqMsO>: Corrupted
> transaction log file /mail/tim at usrpro.com/dovecot.index.log seq 6:
> record size too small (type=0x2272560, offset=26340, size=0)
> (sync_offset=26592)
>
> Deleting the corrupted files brought it back, but I don't really like
> the prospect of cleaning corrupted files during christmas. ;)
>
> I would like to know: do you consider this a real dovecot bug? Or is
> this something as a result of the way  Alpine configures the package?
> Should I cross post the report to Alpine? Or is it worth changing the
> image to Debian instead?
>
> Or maybe you don't consider the trace good enough to work on? Also that
> I would like to know.
>
> Many thanks in advance!
>
> Tim
>
> On 12/20/18 9:49 AM, Tim Mohlmann via dovecot wrote:
>> I'm really sorry, but it was a long day yesterday. I forgot to mention
>> that I used alpine:edge (based on the alpine master branch) image to set
>> up the build environment. So there were some version changes.
>>
>> dovecot --version: 2.3.3 (dcead646b)
>>
>> apk info musl:
>>
>> musl-1.1.20-r2 description:
>> the musl c library (libc) implementation
>>
>> Configuration was unchanged wrt to my first mail. Maybe it's helpful to
>> share the build flags?
>>
>> export CFLAGS="-O0"
>> export CXXFLAGS="$CFLAGS"
>>
>> export CPPFLAGS="$CFLAGS"
>>
>> export LDFLAGS="-Wl,--as-needed"
>>
>> In the APKBUILD file can be found here:
>> https://git.alpinelinux.org/cgit/aports/tree/main/dovecot/APKBUILD and
>> I've modified the following:
>>
>> options="libtool !strip"
>>
>> subpackages="
>>
>>      ....
>>
>>      $pkgname-dbg
>>
>> "
>>
>> And if it is of any interest, I've done a `docker commit` on the
>> container and pushed the resulting image as muhlemmer/dovecot:debug.
>> https://cloud.docker.com/repository/docker/muhlemmer/dovecot/tags. Don't
>> know if its of much use by itself, without the Mailu ecosystem around.
>> Just in case.
>>
>> Cheers!
>>
>> On 12/20/18 2:04 AM, Tim Mohlmann via dovecot wrote:
>>> I've managed to re-compile the alpine packages inside the container,
>>> creating the debug symbols and disabling compile time optimizations, as
>>> they also disturbed the backtrace. The following output looks much more
>>> meaningful. But unfortunately there are still some symbols missing for
>>> lucene-core. Let me know if this is enough to work with. (It doesn't
>>> seem to be musl after all?)
>>>
>>> Reading symbols from /usr/libexec/dovecot/indexer-worker...Reading
>>> symbols from
>>> /usr/lib/debug//usr/libexec/dovecot/indexer-worker.debug...done.
>>> done.
>>> [New LWP 30417]
>>>
>>> warning: Can't read pathname for load map: No error information.
>>> Core was generated by `dovecot/indexer-worker'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
>>> (gdb) bt full
>>> #0  0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #1  0x00007f1653edeba2 in
>>> lucene::index::KeepOnlyLastCommitDeletionPolicy::onCommit(std::vector<lucene::index::IndexCommitPoint*,
>>> std::allocator<lucene::index::IndexCommitPoint*> >&) ()
>>>    from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #2  0x00007f1653f15312 in ?? () from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #3  0x00007f1653f325e3 in lucene::index::IndexWriter::doFlush(bool) ()
>>> from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #4  0x00007f1653f32f9c in lucene::index::IndexWriter::flush(bool, bool)
>>> () from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #5  0x00007f1653f33052 in
>>> lucene::index::IndexWriter::closeInternal(bool) () from
>>> /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #6  0x00007f1653f337c4 in lucene::index::IndexWriter::close(bool) ()
>>> from /usr/lib/libclucene-core.so.1
>>> No symbol table info available.
>>> #7  0x00007f16541fc8c6 in lucene_index_build_deinit
>>> (index=0x55945fe7a860) at lucene-wrapper.cc:646
>>>         ret = 0
>>> #8  0x00007f16541f9ba4 in fts_backend_lucene_update_deinit
>>> (_ctx=0x55945fe910a0) at fts-backend-lucene.c:259
>>>         ctx = 0x55945fe910a0
>>>         backend = 0x55945fe7a760
>>>         ret = 0
>>>         __func__ = "fts_backend_lucene_update_deinit"
>>> #9  0x00007f1654649baa in fts_backend_update_deinit
>>> (_ctx=0x55945fe7a480) at fts-api.c:144
>>>         ctx = 0x55945fe910a0
>>>         backend = 0x55945fe7a760
>>>         ret = 0
>>> #10 0x00007f1654655d83 in fts_transaction_end (t=0x55945fe88820,
>>> error_r=0x7fff1b28d1b8) at fts-storage.c:613
>>>         ft = 0x55945fe84400
>>>         flist = 0x55945fe7a3a0
>>>         ret = 0
>>>         __func__ = "fts_transaction_end"
>>> #11 0x00007f16546562e8 in fts_transaction_commit (t=0x55945fe88820,
>>> changes_r=0x7fff1b28d2c0) at fts-storage.c:696
>>>         ft = 0x55945fe84400
>>> --Type <RET> for more, q to quit, c to continue without paging--c
>>>         fbox = 0x55945fe7f798
>>>         box = 0x55945fe7efa8
>>>         autoindex = false
>>>         ret = 0
>>>         error = 0x7f1654813665 <p_strdup_vprintf+90>
>>> "H\213E\350\353\067\213E\344\211\302H\213E\330H\211\326H\211\307\350\301\367\377\377H\211E\360\213E\344\203\350\001\211\302H\213M\350H\213E\360H\211\316H\211\307\350\302\357\354\377H\213E\360H\213M\370dH3\f%("
>>> #12 0x00007f165469b83e in quota_mailbox_transaction_commit
>>> (ctx=0x55945fe88820, changes_r=0x7fff1b28d2c0) at quota-storage.c:174
>>>         qbox = 0x55945fe7f600
>>>         qt = 0x55945fe8f160
>>>         __func__ = "quota_mailbox_transaction_commit"
>>> #13 0x00007f16548fefad in mailbox_transaction_commit_get_changes
>>> (_t=0x7fff1b28d348, changes_r=0x7fff1b28d2c0) at mail-storage.c:2215
>>>         _data_stack_cur_id = 4
>>>         t = 0x55945fe88820
>>>         box = 0x55945fe7efa8
>>>         save_count = 0
>>>         ret = 21908
>>>         __func__ = "mailbox_transaction_commit_get_changes"
>>> #14 0x00007f16548feefc in mailbox_transaction_commit (t=0x7fff1b28d348)
>>> at mail-storage.c:2197
>>>         changes = {pool = 0x0, uid_validity = 1418718114, saved_uids =
>>> {arr = {buffer = 0x7f16548e5d9e <__x86_return_thunk+5>, element_size =
>>> 140733649048408}, v = 0x7f16548e5d9e <__x86_return_thunk+5>,
>>> v_modifiable = 0x7f16548e5d9e <__x86_return_thunk+5>},
>>> ignored_modseq_changes = 455660384, changes_mask = (unknown: 0),
>>> no_read_perm = false}
>>>         ret = 21908
>>> #15 0x000055945e4abdcb in index_mailbox_precache (conn=0x55945fe57440,
>>> box=0x55945fe7efa8) at master-connection.c:126
>>>         storage = 0x55945fe77f28
>>>         username = 0x55945fe73ee0 "admin at usrpro.io"
>>>         box_vname = 0x55945fe7f310 "Sent"
>>>         errstr = 0x7f165469cb53 <quota_mailbox_sync_deinit+164>
>>> "\213E\344H\203\304([]\351\320f\377\377UH\211\345H\203\354\060H\211}\330dH\213\004%("
>>>         error = MAIL_ERROR_TEMP
>>>         status = {messages = 10, recent = 0, unseen = 0, uidvalidity =
>>> 1545140227, uidnext = 11, first_unseen_seq = 0, first_recent_uid = 3,
>>> last_cached_seq = 1, highest_modseq = 0, highest_pvt_modseq = 0,
>>> keywords = 0x0, permanent_flags = 0, flags = 0, permanent_keywords =
>>> false, allow_new_keywords = false, nonpermanent_modseqs = false,
>>> no_modseq_tracking = false, have_guids = true, have_save_guids = true,
>>> have_only_guid128 = false}
>>>         trans = 0x0
>>>         search_args = 0x0
>>>         ctx = 0x0
>>>         mail = 0x0
>>>         metadata = {guid = '\000' <repeats 15 times>, virtual_size = 0,
>>> physical_size = 0, first_save_date = 0, cache_fields = 0x0,
>>> precache_fields = (MAIL_FETCH_STREAM_HEADER | MAIL_FETCH_STREAM_BODY |
>>> MAIL_FETCH_RECEIVED_DATE | MAIL_FETCH_SAVE_DATE |
>>> MAIL_FETCH_PHYSICAL_SIZE | MAIL_FETCH_VIRTUAL_SIZE |
>>> MAIL_FETCH_UIDL_BACKEND | MAIL_FETCH_GUID | MAIL_FETCH_POP3_ORDER),
>>> backend_ns_prefix = 0x0, backend_ns_type = (unknown: 0)}
>>>         seq = 2
>>>         first_uid = 2
>>>         last_uid = 10
>>>         percentage_str = "\026\177\000"
>>>         counter = 9
>>>         max = 9
>>>         percentage = 32534
>>>         percentage_sent = 0
>>>         ret = 0
>>>         uids = 0x55945fe48938 " (UIDs 2..10)"
>>> #16 0x000055945e4ac1c1 in index_mailbox (conn=0x55945fe57440,
>>> user=0x55945fe73dc8, mailbox=0x55945fe46058 "Sent", max_recent_msgs=0,
>>> what=0x55945fe46070 "i") at master-connection.c:205
>>>         ns = 0x55945fe77da0
>>>         box = 0x55945fe7efa8
>>>         status = {messages = 1608873868, recent = 0, unseen =
>>> 1581972288, uidvalidity = 21908, uidnext = 455660880, first_unseen_seq =
>>> 32767, first_recent_uid = 1608810272, last_cached_seq = 21908,
>>> highest_modseq = 139733882519610, highest_pvt_modseq = 94095725476155,
>>> keywords = 0x7fff1b28d540, permanent_flags = 0, flags = 0,
>>> permanent_keywords = false, allow_new_keywords = false,
>>> nonpermanent_modseqs = false, no_modseq_tracking = true, have_guids =
>>> true, have_save_guids = false, have_only_guid128 = true}
>>>         path = 0x55945fe47f38 "/mail/admin at usrpro.io/.Sent"
>>>         errstr = 0x7f16546e403a <__x86_return_thunk+5>
>>> "\363\220\017\256\350\353\371H\215d$\b\303\350\a"
>>>         error = 32534
>>>         sync_flags = MAILBOX_SYNC_FLAG_FULL_READ
>>>         ret = 0
>>> #17 0x000055945e4ac3d8 in master_connection_input_line
>>> (conn=0x55945fe57440, line=0x55945fe5778c
>>> "admin at usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti") at master-connection.c:247
>>>         args = 0x55945fe46078
>>>         input = {parent_event = 0x0, module = 0x55945e4ad2ec "mail",
>>> service = 0x55945e4ad2f1 "indexer-worker", username = 0x55945fe46048
>>> "admin at usrpro.io", session_id = 0x0, session_id_prefix = 0x55945fe4605d
>>> "2HvXT2h9epPAqMsG", session_create_time = 0, local_ip = {family = 0, u =
>>> {ip6 = {__in6_union = {__s6_addr = '\000' <repeats 15 times>,
>>> __s6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __s6_addr32 = {0, 0, 0, 0}}},
>>> ip4 = {s_addr = 0}}}, remote_ip = {family = 0, u = {ip6 = {__in6_union =
>>> {__s6_addr = '\000' <repeats 15 times>, __s6_addr16 = {0, 0, 0, 0, 0, 0,
>>> 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, ip4 = {s_addr = 0}}}, local_port =
>>> 0, remote_port = 0, userdb_fields = 0x0, flags_override_add = (unknown:
>>> 0), flags_override_remove = (unknown: 0), no_userdb_lookup = false,
>>> debug = false, conn_secured = false, conn_ssl_secured = false}
>>>         service_user = 0x55945fe59958
>>>         user = 0x55945fe73dc8
>>>         str = 0x7f16547f6231 <pool_system_free+28>
>>> "\220\311\351\375\335\356\377UH\211\345H\203\354
>>> H\211}\370H\211u\360H\211U\350H\211M\340H\203", <incomplete sequence \340>
>>>         error = 0x0
>>>         max_recent_msgs = 0
>>>         ret = 32767
>>> #18 0x000055945e4ac58f in master_connection_input (conn=0x55945fe57440)
>>> at master-connection.c:287
>>>         _data_stack_cur_id = 3
>>>         line = 0x55945fe5778c
>>> "admin at usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti"
>>>         ret = 32534
>>> #19 0x00007f16547e73c8 in io_loop_call_io (io=0x55945fe57480) at
>>> ioloop.c:698
>>>         ioloop = 0x55945e4afca0
>>>         t_id = 2
>>>         __func__ = "io_loop_call_io"
>>> #20 0x00007f16547ea409 in io_loop_handler_run_internal
>>> (ioloop=0x55945e4afca0) at ioloop-epoll.c:221
>>>         ctx = 0x7f1654b0dda0
>>>         events = 0x55945fe548c0
>>>         event = 0x55945fe548c0
>>>         list = 0x55945fe57500
>>>         io = 0x55945fe57480
>>>         tv = {tv_sec = 2147483, tv_usec = 0}
>>>         events_count = 5
>>>         msecs = -1
>>>         ret = 1
>>>         i = 0
>>>         j = 0
>>>         call = true
>>>         __func__ = "io_loop_handler_run_internal"
>>> #21 0x00007f16547e7657 in io_loop_handler_run (ioloop=0x55945e4afca0) at
>>> ioloop.c:750
>>>         __func__ = "io_loop_handler_run"
>>> #22 0x00007f16547e752e in io_loop_run (ioloop=0x55945e4afca0) at
>>> ioloop.c:723
>>>         __func__ = "io_loop_run"
>>> #23 0x00007f165471b3d5 in master_service_run (service=0x55945e4afb40,
>>> callback=0x55945e4ab5ed <client_connected>) at master-service.c:775
>>> No locals.
>>> #24 0x000055945e4ab7ce in main (argc=1, argv=0x55945e4af8e0) at
>>> indexer-worker.c:78
>>>         service_flags = (MASTER_SERVICE_FLAG_KEEP_CONFIG_OPEN |
>>> MASTER_SERVICE_FLAG_SEND_STATS)
>>>         storage_service_flags = (MAIL_STORAGE_SERVICE_FLAG_USERDB_LOOKUP
>>> | MAIL_STORAGE_SERVICE_FLAG_TEMP_PRIV_DROP |
>>> MAIL_STORAGE_SERVICE_FLAG_NO_IDLE_TIMEOUT)
>>>         c = -1
>>>
>>>
>>> On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
>>>
>>> <Deleted due to message size limit>
>>>
>>>
This unfortunately looks a lot like lucene problem.

Aki



More information about the dovecot mailing list