indexer-worker crashes handling mails with big attachments (dovecot 2.2.16/2.2.18 + FTS Apache Solr + Tika)

Robinson Bomze itsec at bomze.de
Wed Jun 3 15:43:31 UTC 2015


Hi,

Yesterday I tried to set up Dovecot with Solr (3.6.2) + Tika (1.8) for
FTS. I started with a fresh Debian 8.0 system and Dovecot 2.2.13 from
the Debian repository. I ran into some issues with Tika/Dovecot, and
since I read on the mailing list that these problems were fixed in
2.2.14+, I tried 2.2.18.

With 2.2.18 I get crashes with big (OK... huge) attachments. Most
mailboxes (and their attachments) get indexed fine, but on some the
indexer-worker crashes. I was able to isolate the problem: it seems
that when Tika (which works flawlessly) sends a big reply to Dovecot
and Dovecot forwards this data to Solr, the communication between
Dovecot and Solr crashes.

E.g. indexing an email with a 200k-character Word file crashes the
indexer-worker:

Jun 02 23:50:57 indexer-worker(username): Warning: I/O leak:
0x7ff65f39f540 (line 120, fd 20)
Jun 02 23:50:57 indexer-worker(username): Warning: Timeout leak:
0x7ff65f39f2e0 (line 325)
Jun 02 23:50:57 indexer: Error: Indexer worker disconnected, discarding
1 requests for username
Jun 02 23:50:57 imap(username): Error: indexer failed to index mailbox
INBOX.username
Jun 02 23:50:57 indexer-worker(username): Fatal: master:
service(indexer-worker): child 11429 killed with signal 11 (core dumped)

I got similar results with 2.2.16:
Jun 02 23:21:12 indexer-worker(username): Warning: I/O leak:
0x7ffff7811cc0 (line 127, fd 20)
Jun 02 23:21:12 indexer-worker(username): Panic: file ioloop.c: line 39
(io_add_file): assertion failed: (callback != NULL)
Jun 02 23:21:12 indexer-worker(username): Error: Raw backtrace:
/usr/local/lib/dovecot/libdovecot.so.0(+0x77130) [0x7ffff7842130] ->
/usr/local/lib/dovecot/libdovecot.so.0(+0x7
Jun 02 23:21:12 indexer: Error: Indexer worker disconnected, discarding
1 requests for username
Jun 02 23:21:12 imap(username): Error: indexer failed to index mailbox
INBOX.username
Jun 02 23:21:12 indexer-worker(username): Fatal: master:
service(indexer-worker): child 7909 killed with signal 6 (core dumps
disabled)

The problem has already been reported:
http://dovecot.org/pipermail/dovecot/2015-May/100901.html
I could trigger the same panic by running the indexer via
'doveadm index -u username MAILBOX'.

Here is a backtrace (bt) of the 2.2.18 crash (in frame #8 you can see a
fragment of the text sent to Solr):

#0  array_count_i (array=0x8) at array.h:155
#1  array_get_modifiable_i (count_r=<synthetic pointer>, array=0x8) at
array.h:228
#2  priorityq_remove_idx (pq=0x0, idx=0) at priorityq.c:121
#3  0x00007ff65f3ef5eb in priorityq_remove (pq=<optimized out>,
item=item@entry=0xa26920) at priorityq.c:138
#4  0x00007ff65f3e1e70 in timeout_remove (_timeout=<optimized out>) at
ioloop.c:288
#5  0x00007ff65f3e2781 in io_loop_move_timeout
(_timeout=_timeout@entry=0xa27f98) at ioloop.c:861
#6  0x00007ff65f39ff37 in http_client_connection_switch_ioloop
(conn=conn@entry=0xa27ea0) at http-client-connection.c:1357
#7  0x00007ff65f3a3d68 in http_client_switch_ioloop
(client=client@entry=0xa0bf20) at http-client.c:211
#8  0x00007ff65f39c005 in http_client_request_continue_payload
(_req=_req@entry=0xa0ee88,
    data=0xa42fa0 "k for evidence of fluid spill.\nIf the device is
mounted on a stand, examine the condition of the mount.\nIf the device
moves on casters, check the condition of the casters. Check operation of
brakes, i"..., size=55453) at http-client-request.c:566
#9  0x00007ff65f39c22a in http_client_request_send_payload
(_req=_req@entry=0xa0ee88, data=<optimized out>, size=<optimized out>)
at http-client-request.c:625
#10 0x00007ff65e972429 in solr_connection_post_more (post=0xa0ee80,
data=<optimized out>, size=size@entry=55453) at solr-connection.c:504
#11 0x00007ff65e96ea09 in fts_backed_solr_build_commit (ctx=0xa1a880) at
fts-backend-solr.c:341
#12 0x00007ff65e96eaad in fts_backend_solr_update_set_mailbox
(_ctx=0xa1a880, box=0x0) at fts-backend-solr.c:407
#13 0x00007ff65eb7cfac in fts_backend_set_cur_mailbox
(ctx=ctx@entry=0xa1a880) at fts-api.c:129
#14 0x00007ff65eb7cfe3 in fts_backend_update_deinit (_ctx=<optimized
out>) at fts-api.c:143
#15 0x00007ff65eb8303c in fts_transaction_end (t=t@entry=0xa11ed0) at
fts-storage.c:550
#16 0x00007ff65eb83e91 in fts_transaction_commit (t=0xa11ed0,
changes_r=0x7ffdcdca5e30) at fts-storage.c:615
#17 0x00007ff65f688a82 in mailbox_transaction_commit_get_changes
(_t=_t@entry=0x7ffdcdca5ee0, changes_r=changes_r@entry=0x7ffdcdca5e30)
at mail-storage.c:1837
#18 0x00007ff65f688b2e in mailbox_transaction_commit
(t=t@entry=0x7ffdcdca5ee0) at mail-storage.c:1818

"bt full" looks like this:
#0  array_count_i (array=0x8) at array.h:155
No locals.
#1  array_get_modifiable_i (count_r=<synthetic pointer>, array=0x8) at
array.h:228
No locals.
#2  priorityq_remove_idx (pq=0x0, idx=0) at priorityq.c:121
        count = <optimized out>
#3  0x00007ff65f3ef5eb in priorityq_remove (pq=<optimized out>,
item=item@entry=0xa26920) at priorityq.c:138
No locals.
#4  0x00007ff65f3e1e70 in timeout_remove (_timeout=<optimized out>) at
ioloop.c:288
        timeout = 0xa26920
#5  0x00007ff65f3e2781 in io_loop_move_timeout
(_timeout=_timeout@entry=0xa27f98) at ioloop.c:861
        new_to = 0xa1adf0
        old_to = <optimized out>
#6  0x00007ff65f39ff37 in http_client_connection_switch_ioloop
(conn=conn@entry=0xa27ea0) at http-client-connection.c:1357
No locals.
#7  0x00007ff65f3a3d68 in http_client_switch_ioloop
(client=client@entry=0xa0bf20) at http-client.c:211
        conn = 0xa27ea0
        _conn = 0xa27ea0
        host = <optimized out>
        peer = <optimized out>
#8  0x00007ff65f39c005 in http_client_request_continue_payload
(_req=_req@entry=0xa0ee88,
    data=0xa42fa0 "k for evidence of fluid spill.\nIf the device is
mounted on a stand, examine the condition of the mount.\nIf the device
moves on casters, check the condition of the casters. Check operation of
brakes, i"..., size=55453) at http-client-request.c:566
        prev_ioloop = 0x9f4730
        req = 0xa36970
        conn = 0xa27ea0
        client = 0xa0bf20
        ret = <optimized out>
        __FUNCTION__ = "http_client_request_continue_payload"
#9  0x00007ff65f39c22a in http_client_request_send_payload
(_req=_req@entry=0xa0ee88, data=<optimized out>, size=<optimized out>)
at http-client-request.c:625
        __FUNCTION__ = "http_client_request_send_payload"
#10 0x00007ff65e972429 in solr_connection_post_more (post=0xa0ee80,
data=<optimized out>, size=size@entry=55453) at solr-connection.c:504
        conn = 0xa0be50
        __FUNCTION__ = "solr_connection_post_more"

I hope someone can fix the code... I need this feature :)
Thanks a lot in advance!

