Fatal: master: service(indexer-worker): child 493 killed with signal 11 (core dumped)
Tim Mohlmann
tim at usrpro.com
Sat Dec 22 13:13:21 EET 2018
Ok, thanks a lot for the reply. I'll keep digging then :)
On 12/21/18 4:19 PM, Aki Tuomi wrote:
> On 21.12.2018 15.01, Tim Mohlmann via dovecot wrote:
>> Hi, I now this is kinda impolite to ask, but I would like some feedback
>> on this issue. I'm representing the Mailu user group here, plus the fact
>> that my own production server is affected and starts to crash due to
>> file corruption as a result of the indexer-worker segfaults. Last night
>> my server became unavailable with logs like:
>>
>> Error: imap(tim at usrpro.com)<131><2RGyoYd9Jq/AqMsO>: Corrupted
>> transaction log file /mail/tim at usrpro.com/dovecot.index.log seq 6:
>> record size too small (type=0x2272560, offset=26340, size=0)
>> (sync_offset=26592)
>>
>> Deleting the corrupted files brought it back, but I don't really like
>> the prospect of cleaning corrupted files during christmas. ;)
>>
>> I would like to know: do you consider this a real dovecot bug? Or is
>> this something as a result of the way Alpine configures the package?
>> Should I cross post the report to Alpine? Or is it worth changing the
>> image to Debian instead?
>>
>> Or maybe you don't consider the trace good enough to work on? Also that
>> I would like to know.
>>
>> Many thanks in advance!
>>
>> Tim
>>
>> On 12/20/18 9:49 AM, Tim Mohlmann via dovecot wrote:
>>> I'm really sorry, but it was a long day yesterday. I forgot to mention
>>> that I used alpine:edge (based on the alpine master branch) image to set
>>> up the build environment. So there were some version changes.
>>>
>>> dovecot --version: 2.3.3 (dcead646b)
>>>
>>> apk info musl:
>>>
>>> musl-1.1.20-r2 description:
>>> the musl c library (libc) implementation
>>>
>>> Configuration was unchanged wrt to my first mail. Maybe it's helpful to
>>> share the build flags?
>>>
>>> export CFLAGS="-O0"
>>> export CXXFLAGS="$CFLAGS"
>>>
>>> export CPPFLAGS="$CFLAGS"
>>>
>>> export LDFLAGS="-Wl,--as-needed"
>>>
>>> In the APKBUILD file can be found here:
>>> https://git.alpinelinux.org/cgit/aports/tree/main/dovecot/APKBUILD and
>>> I've modified the following:
>>>
>>> options="libtool !strip"
>>>
>>> subpackages="
>>>
>>> ....
>>>
>>> $pkgname-dbg
>>>
>>> "
>>>
>>> And if it is of any interest, I've done a `docker commit` on the
>>> container and pushed the resulting image as muhlemmer/dovecot:debug.
>>> https://cloud.docker.com/repository/docker/muhlemmer/dovecot/tags. Don't
>>> know if its of much use by itself, without the Mailu ecosystem around.
>>> Just in case.
>>>
>>> Cheers!
>>>
>>> On 12/20/18 2:04 AM, Tim Mohlmann via dovecot wrote:
>>>> I've managed to re-compile the alpine packages inside the container,
>>>> creating the debug symbols and disabling compile time optimizations, as
>>>> they also disturbed the backtrace. The following output looks much more
>>>> meaningful. But unfortunately there are still some symbols missing for
>>>> lucene-core. Let me know if this is enough to work with. (It doesn't
>>>> seem to be musl after all?)
>>>>
>>>> Reading symbols from /usr/libexec/dovecot/indexer-worker...Reading
>>>> symbols from
>>>> /usr/lib/debug//usr/libexec/dovecot/indexer-worker.debug...done.
>>>> done.
>>>> [New LWP 30417]
>>>>
>>>> warning: Can't read pathname for load map: No error information.
>>>> Core was generated by `dovecot/indexer-worker'.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> #0 0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
>>>> (gdb) bt full
>>>> #0 0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #1 0x00007f1653edeba2 in
>>>> lucene::index::KeepOnlyLastCommitDeletionPolicy::onCommit(std::vector<lucene::index::IndexCommitPoint*,
>>>> std::allocator<lucene::index::IndexCommitPoint*> >&) ()
>>>> from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #2 0x00007f1653f15312 in ?? () from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #3 0x00007f1653f325e3 in lucene::index::IndexWriter::doFlush(bool) ()
>>>> from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #4 0x00007f1653f32f9c in lucene::index::IndexWriter::flush(bool, bool)
>>>> () from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #5 0x00007f1653f33052 in
>>>> lucene::index::IndexWriter::closeInternal(bool) () from
>>>> /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #6 0x00007f1653f337c4 in lucene::index::IndexWriter::close(bool) ()
>>>> from /usr/lib/libclucene-core.so.1
>>>> No symbol table info available.
>>>> #7 0x00007f16541fc8c6 in lucene_index_build_deinit
>>>> (index=0x55945fe7a860) at lucene-wrapper.cc:646
>>>> ret = 0
>>>> #8 0x00007f16541f9ba4 in fts_backend_lucene_update_deinit
>>>> (_ctx=0x55945fe910a0) at fts-backend-lucene.c:259
>>>> ctx = 0x55945fe910a0
>>>> backend = 0x55945fe7a760
>>>> ret = 0
>>>> __func__ = "fts_backend_lucene_update_deinit"
>>>> #9 0x00007f1654649baa in fts_backend_update_deinit
>>>> (_ctx=0x55945fe7a480) at fts-api.c:144
>>>> ctx = 0x55945fe910a0
>>>> backend = 0x55945fe7a760
>>>> ret = 0
>>>> #10 0x00007f1654655d83 in fts_transaction_end (t=0x55945fe88820,
>>>> error_r=0x7fff1b28d1b8) at fts-storage.c:613
>>>> ft = 0x55945fe84400
>>>> flist = 0x55945fe7a3a0
>>>> ret = 0
>>>> __func__ = "fts_transaction_end"
>>>> #11 0x00007f16546562e8 in fts_transaction_commit (t=0x55945fe88820,
>>>> changes_r=0x7fff1b28d2c0) at fts-storage.c:696
>>>> ft = 0x55945fe84400
>>>> --Type <RET> for more, q to quit, c to continue without paging--c
>>>> fbox = 0x55945fe7f798
>>>> box = 0x55945fe7efa8
>>>> autoindex = false
>>>> ret = 0
>>>> error = 0x7f1654813665 <p_strdup_vprintf+90>
>>>> "H\213E\350\353\067\213E\344\211\302H\213E\330H\211\326H\211\307\350\301\367\377\377H\211E\360\213E\344\203\350\001\211\302H\213M\350H\213E\360H\211\316H\211\307\350\302\357\354\377H\213E\360H\213M\370dH3\f%("
>>>> #12 0x00007f165469b83e in quota_mailbox_transaction_commit
>>>> (ctx=0x55945fe88820, changes_r=0x7fff1b28d2c0) at quota-storage.c:174
>>>> qbox = 0x55945fe7f600
>>>> qt = 0x55945fe8f160
>>>> __func__ = "quota_mailbox_transaction_commit"
>>>> #13 0x00007f16548fefad in mailbox_transaction_commit_get_changes
>>>> (_t=0x7fff1b28d348, changes_r=0x7fff1b28d2c0) at mail-storage.c:2215
>>>> _data_stack_cur_id = 4
>>>> t = 0x55945fe88820
>>>> box = 0x55945fe7efa8
>>>> save_count = 0
>>>> ret = 21908
>>>> __func__ = "mailbox_transaction_commit_get_changes"
>>>> #14 0x00007f16548feefc in mailbox_transaction_commit (t=0x7fff1b28d348)
>>>> at mail-storage.c:2197
>>>> changes = {pool = 0x0, uid_validity = 1418718114, saved_uids =
>>>> {arr = {buffer = 0x7f16548e5d9e <__x86_return_thunk+5>, element_size =
>>>> 140733649048408}, v = 0x7f16548e5d9e <__x86_return_thunk+5>,
>>>> v_modifiable = 0x7f16548e5d9e <__x86_return_thunk+5>},
>>>> ignored_modseq_changes = 455660384, changes_mask = (unknown: 0),
>>>> no_read_perm = false}
>>>> ret = 21908
>>>> #15 0x000055945e4abdcb in index_mailbox_precache (conn=0x55945fe57440,
>>>> box=0x55945fe7efa8) at master-connection.c:126
>>>> storage = 0x55945fe77f28
>>>> username = 0x55945fe73ee0 "admin at usrpro.io"
>>>> box_vname = 0x55945fe7f310 "Sent"
>>>> errstr = 0x7f165469cb53 <quota_mailbox_sync_deinit+164>
>>>> "\213E\344H\203\304([]\351\320f\377\377UH\211\345H\203\354\060H\211}\330dH\213\004%("
>>>> error = MAIL_ERROR_TEMP
>>>> status = {messages = 10, recent = 0, unseen = 0, uidvalidity =
>>>> 1545140227, uidnext = 11, first_unseen_seq = 0, first_recent_uid = 3,
>>>> last_cached_seq = 1, highest_modseq = 0, highest_pvt_modseq = 0,
>>>> keywords = 0x0, permanent_flags = 0, flags = 0, permanent_keywords =
>>>> false, allow_new_keywords = false, nonpermanent_modseqs = false,
>>>> no_modseq_tracking = false, have_guids = true, have_save_guids = true,
>>>> have_only_guid128 = false}
>>>> trans = 0x0
>>>> search_args = 0x0
>>>> ctx = 0x0
>>>> mail = 0x0
>>>> metadata = {guid = '\000' <repeats 15 times>, virtual_size = 0,
>>>> physical_size = 0, first_save_date = 0, cache_fields = 0x0,
>>>> precache_fields = (MAIL_FETCH_STREAM_HEADER | MAIL_FETCH_STREAM_BODY |
>>>> MAIL_FETCH_RECEIVED_DATE | MAIL_FETCH_SAVE_DATE |
>>>> MAIL_FETCH_PHYSICAL_SIZE | MAIL_FETCH_VIRTUAL_SIZE |
>>>> MAIL_FETCH_UIDL_BACKEND | MAIL_FETCH_GUID | MAIL_FETCH_POP3_ORDER),
>>>> backend_ns_prefix = 0x0, backend_ns_type = (unknown: 0)}
>>>> seq = 2
>>>> first_uid = 2
>>>> last_uid = 10
>>>> percentage_str = "\026\177\000"
>>>> counter = 9
>>>> max = 9
>>>> percentage = 32534
>>>> percentage_sent = 0
>>>> ret = 0
>>>> uids = 0x55945fe48938 " (UIDs 2..10)"
>>>> #16 0x000055945e4ac1c1 in index_mailbox (conn=0x55945fe57440,
>>>> user=0x55945fe73dc8, mailbox=0x55945fe46058 "Sent", max_recent_msgs=0,
>>>> what=0x55945fe46070 "i") at master-connection.c:205
>>>> ns = 0x55945fe77da0
>>>> box = 0x55945fe7efa8
>>>> status = {messages = 1608873868, recent = 0, unseen =
>>>> 1581972288, uidvalidity = 21908, uidnext = 455660880, first_unseen_seq =
>>>> 32767, first_recent_uid = 1608810272, last_cached_seq = 21908,
>>>> highest_modseq = 139733882519610, highest_pvt_modseq = 94095725476155,
>>>> keywords = 0x7fff1b28d540, permanent_flags = 0, flags = 0,
>>>> permanent_keywords = false, allow_new_keywords = false,
>>>> nonpermanent_modseqs = false, no_modseq_tracking = true, have_guids =
>>>> true, have_save_guids = false, have_only_guid128 = true}
>>>> path = 0x55945fe47f38 "/mail/admin at usrpro.io/.Sent"
>>>> errstr = 0x7f16546e403a <__x86_return_thunk+5>
>>>> "\363\220\017\256\350\353\371H\215d$\b\303\350\a"
>>>> error = 32534
>>>> sync_flags = MAILBOX_SYNC_FLAG_FULL_READ
>>>> ret = 0
>>>> #17 0x000055945e4ac3d8 in master_connection_input_line
>>>> (conn=0x55945fe57440, line=0x55945fe5778c
>>>> "admin at usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti") at master-connection.c:247
>>>> args = 0x55945fe46078
>>>> input = {parent_event = 0x0, module = 0x55945e4ad2ec "mail",
>>>> service = 0x55945e4ad2f1 "indexer-worker", username = 0x55945fe46048
>>>> "admin at usrpro.io", session_id = 0x0, session_id_prefix = 0x55945fe4605d
>>>> "2HvXT2h9epPAqMsG", session_create_time = 0, local_ip = {family = 0, u =
>>>> {ip6 = {__in6_union = {__s6_addr = '\000' <repeats 15 times>,
>>>> __s6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __s6_addr32 = {0, 0, 0, 0}}},
>>>> ip4 = {s_addr = 0}}}, remote_ip = {family = 0, u = {ip6 = {__in6_union =
>>>> {__s6_addr = '\000' <repeats 15 times>, __s6_addr16 = {0, 0, 0, 0, 0, 0,
>>>> 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, ip4 = {s_addr = 0}}}, local_port =
>>>> 0, remote_port = 0, userdb_fields = 0x0, flags_override_add = (unknown:
>>>> 0), flags_override_remove = (unknown: 0), no_userdb_lookup = false,
>>>> debug = false, conn_secured = false, conn_ssl_secured = false}
>>>> service_user = 0x55945fe59958
>>>> user = 0x55945fe73dc8
>>>> str = 0x7f16547f6231 <pool_system_free+28>
>>>> "\220\311\351\375\335\356\377UH\211\345H\203\354
>>>> H\211}\370H\211u\360H\211U\350H\211M\340H\203", <incomplete sequence \340>
>>>> error = 0x0
>>>> max_recent_msgs = 0
>>>> ret = 32767
>>>> #18 0x000055945e4ac58f in master_connection_input (conn=0x55945fe57440)
>>>> at master-connection.c:287
>>>> _data_stack_cur_id = 3
>>>> line = 0x55945fe5778c
>>>> "admin at usrpro.io\tSent\t2HvXT2h9epPAqMsG\t0\ti"
>>>> ret = 32534
>>>> #19 0x00007f16547e73c8 in io_loop_call_io (io=0x55945fe57480) at
>>>> ioloop.c:698
>>>> ioloop = 0x55945e4afca0
>>>> t_id = 2
>>>> __func__ = "io_loop_call_io"
>>>> #20 0x00007f16547ea409 in io_loop_handler_run_internal
>>>> (ioloop=0x55945e4afca0) at ioloop-epoll.c:221
>>>> ctx = 0x7f1654b0dda0
>>>> events = 0x55945fe548c0
>>>> event = 0x55945fe548c0
>>>> list = 0x55945fe57500
>>>> io = 0x55945fe57480
>>>> tv = {tv_sec = 2147483, tv_usec = 0}
>>>> events_count = 5
>>>> msecs = -1
>>>> ret = 1
>>>> i = 0
>>>> j = 0
>>>> call = true
>>>> __func__ = "io_loop_handler_run_internal"
>>>> #21 0x00007f16547e7657 in io_loop_handler_run (ioloop=0x55945e4afca0) at
>>>> ioloop.c:750
>>>> __func__ = "io_loop_handler_run"
>>>> #22 0x00007f16547e752e in io_loop_run (ioloop=0x55945e4afca0) at
>>>> ioloop.c:723
>>>> __func__ = "io_loop_run"
>>>> #23 0x00007f165471b3d5 in master_service_run (service=0x55945e4afb40,
>>>> callback=0x55945e4ab5ed <client_connected>) at master-service.c:775
>>>> No locals.
>>>> #24 0x000055945e4ab7ce in main (argc=1, argv=0x55945e4af8e0) at
>>>> indexer-worker.c:78
>>>> service_flags = (MASTER_SERVICE_FLAG_KEEP_CONFIG_OPEN |
>>>> MASTER_SERVICE_FLAG_SEND_STATS)
>>>> storage_service_flags = (MAIL_STORAGE_SERVICE_FLAG_USERDB_LOOKUP
>>>> | MAIL_STORAGE_SERVICE_FLAG_TEMP_PRIV_DROP |
>>>> MAIL_STORAGE_SERVICE_FLAG_NO_IDLE_TIMEOUT)
>>>> c = -1
>>>>
>>>>
>>>> On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote:
>>>>
>>>> <Deleted due to message size limit>
>>>>
>>>>
> This unfortunately looks a lot like lucene problem.
>
> Aki
>
More information about the dovecot
mailing list