Hi,
Yesterday I tried to set up Dovecot with Solr (3.6.2) + Tika (1.8) for FTS. I started with a fresh Debian 8.0 system and Dovecot 2.2.13 from the Debian repository. I then ran into some issues with Tika/Dovecot and read on the mailing list that these problems were fixed in 2.2.14+, so I tried 2.2.18.
With 2.2.18 I get panics with big (ok... huge) attachments. Most mailboxes (and their attachments) get indexed fine, but on some I got panics from the indexer-worker. I was able to isolate the problem: it seems that when Tika (which works flawlessly) sends a big reply to Dovecot and Dovecot sends this data to Solr, the communication between Dovecot and Solr crashes.
E.g. indexing an email with a 200k-character Word file results in a crash of the indexer-worker:
Jun 02 23:50:57 indexer-worker(username): Warning: I/O leak: 0x7ff65f39f540 (line 120, fd 20)
Jun 02 23:50:57 indexer-worker(username): Warning: Timeout leak: 0x7ff65f39f2e0 (line 325)
Jun 02 23:50:57 indexer: Error: Indexer worker disconnected, discarding 1 requests for username
Jun 02 23:50:57 imap(username): Error: indexer failed to index mailbox INBOX.username
Jun 02 23:50:57 indexer-worker(username): Fatal: master: service(indexer-worker): child 11429 killed with signal 11 (core dumped)
I got similar results with 2.2.16:
Jun 02 23:21:12 indexer-worker(username): Warning: I/O leak: 0x7ffff7811cc0 (line 127, fd 20)
Jun 02 23:21:12 indexer-worker(username): Panic: file ioloop.c: line 39 (io_add_file): assertion failed: (callback != NULL)
Jun 02 23:21:12 indexer-worker(username): Error: Raw backtrace: /usr/local/lib/dovecot/libdovecot.so.0(+0x77130) [0x7ffff7842130] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x7
Jun 02 23:21:12 indexer: Error: Indexer worker disconnected, discarding 1 requests for username
Jun 02 23:21:12 imap(username): Error: indexer failed to index mailbox INBOX.username
Jun 02 23:21:12 indexer-worker(username): Fatal: master: service(indexer-worker): child 7909 killed with signal 6 (core dumps disabled)
The problem was already posted: http://dovecot.org/pipermail/dovecot/2015-May/100901.html
I could trigger the same panic by running the indexer via 'doveadm index -u username MAILBOX'.
Here is a backtrace (bt) of the 2.2.18 crash (in frame #8 you can see a fragment of the text sent to Solr):
#0 array_count_i (array=0x8) at array.h:155
#1 array_get_modifiable_i (count_r=<synthetic pointer>, array=0x8) at array.h:228
#2 priorityq_remove_idx (pq=0x0, idx=0) at priorityq.c:121
#3 0x00007ff65f3ef5eb in priorityq_remove (pq=<optimized out>, item=item@entry=0xa26920) at priorityq.c:138
#4 0x00007ff65f3e1e70 in timeout_remove (_timeout=<optimized out>) at ioloop.c:288
#5 0x00007ff65f3e2781 in io_loop_move_timeout (_timeout=_timeout@entry=0xa27f98) at ioloop.c:861
#6 0x00007ff65f39ff37 in http_client_connection_switch_ioloop (conn=conn@entry=0xa27ea0) at http-client-connection.c:1357
#7 0x00007ff65f3a3d68 in http_client_switch_ioloop (client=client@entry=0xa0bf20) at http-client.c:211
#8 0x00007ff65f39c005 in http_client_request_continue_payload (_req=_req@entry=0xa0ee88, data=0xa42fa0 "k for evidence of fluid spill.\nIf the device is mounted on a stand, examine the condition of the mount.\nIf the device moves on casters, check the condition of the casters. Check operation of brakes, i"..., size=55453) at http-client-request.c:566
#9 0x00007ff65f39c22a in http_client_request_send_payload (_req=_req@entry=0xa0ee88, data=<optimized out>, size=<optimized out>) at http-client-request.c:625
#10 0x00007ff65e972429 in solr_connection_post_more (post=0xa0ee80, data=<optimized out>, size=size@entry=55453) at solr-connection.c:504
#11 0x00007ff65e96ea09 in fts_backed_solr_build_commit (ctx=0xa1a880) at fts-backend-solr.c:341
#12 0x00007ff65e96eaad in fts_backend_solr_update_set_mailbox (_ctx=0xa1a880, box=0x0) at fts-backend-solr.c:407
#13 0x00007ff65eb7cfac in fts_backend_set_cur_mailbox (ctx=ctx@entry=0xa1a880) at fts-api.c:129
#14 0x00007ff65eb7cfe3 in fts_backend_update_deinit (_ctx=<optimized out>) at fts-api.c:143
#15 0x00007ff65eb8303c in fts_transaction_end (t=t@entry=0xa11ed0) at fts-storage.c:550
#16 0x00007ff65eb83e91 in fts_transaction_commit (t=0xa11ed0, changes_r=0x7ffdcdca5e30) at fts-storage.c:615
#17 0x00007ff65f688a82 in mailbox_transaction_commit_get_changes (_t=_t@entry=0x7ffdcdca5ee0, changes_r=changes_r@entry=0x7ffdcdca5e30) at mail-storage.c:1837
#18 0x00007ff65f688b2e in mailbox_transaction_commit (t=t@entry=0x7ffdcdca5ee0) at mail-storage.c:1818
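A note on frames #0 to #2 of the backtrace above: frame #2 is called with pq=0x0, and frames #0 and #1 receive array=0x8. That pattern is what one would expect if the code takes the address of a struct member that sits 8 bytes into a NULL struct pointer and then reads through it. The sketch below only illustrates that arithmetic with Python's ctypes. The struct names and field layout are hypothetical stand-ins chosen for illustration (a pointer-sized callback followed by an array member), not Dovecot's actual definitions.

```python
import ctypes


class DemoArray(ctypes.Structure):
    """Hypothetical array header: a buffer pointer plus an element size."""
    _fields_ = [
        ("buffer", ctypes.c_void_p),
        ("element_size", ctypes.c_size_t),
    ]


class DemoPriorityq(ctypes.Structure):
    """Hypothetical queue: a pointer-sized callback, then the item array."""
    _fields_ = [
        ("cmp_callback", ctypes.c_void_p),
        ("items", DemoArray),
    ]


def items_address(pq_address: int) -> int:
    """Address that '&pq->items' would have for a queue located at pq_address.

    Only integer arithmetic is done here; no memory is dereferenced.
    """
    return pq_address + DemoPriorityq.items.offset


if __name__ == "__main__":
    # With pq == NULL the member address is just the member's offset.
    print(hex(items_address(0)))
```

On a typical 64-bit system the member offset equals the pointer size, so a NULL queue pointer would yield a small non-NULL array address like the one in frames #0 and #1. If that reading is right, the thing to look for is why the queue pointer is NULL when timeout_remove runs in frame #4.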
"bt full" looks like this: #0 array_count_i (array=0x8) at array.h:155 No locals. #1 array_get_modifiable_i (count_r=<synthetic pointer>, array=0x8) at array.h:228 No locals. #2 priorityq_remove_idx (pq=0x0, idx=0) at priorityq.c:121 count = <optimized out> #3 0x00007ff65f3ef5eb in priorityq_remove (pq=<optimized out>, item=item@entry=0xa26920) at priorityq.c:138 No locals. #4 0x00007ff65f3e1e70 in timeout_remove (_timeout=<optimized out>) at ioloop.c:288 timeout = 0xa26920 #5 0x00007ff65f3e2781 in io_loop_move_timeout (_timeout=_timeout@entry=0xa27f98) at ioloop.c:861 new_to = 0xa1adf0 old_to = <optimized out> #6 0x00007ff65f39ff37 in http_client_connection_switch_ioloop (conn=conn@entry=0xa27ea0) at http-client-connection.c:1357 No locals. #7 0x00007ff65f3a3d68 in http_client_switch_ioloop (client=client@entry=0xa0bf20) at http-client.c:211 conn = 0xa27ea0 _conn = 0xa27ea0 host = <optimized out> peer = <optimized out> #8 0x00007ff65f39c005 in http_client_request_continue_payload (_req=_req@entry=0xa0ee88, data=0xa42fa0 "k for evidence of fluid spill.\nIf the device is mounted on a stand, examine the condition of the mount.\nIf the device moves on casters, check the condition of the casters. Check operation of brakes, i"..., size=55453) at http-client-request.c:566 prev_ioloop = 0x9f4730 req = 0xa36970 conn = 0xa27ea0 client = 0xa0bf20 ret = <optimized out> __FUNCTION__ = "http_client_request_continue_payload" #9 0x00007ff65f39c22a in http_client_request_send_payload (_req=_req@entry=0xa0ee88, data=<optimized out>, size=<optimized out>) at http-client-request.c:625 __FUNCTION__ = "http_client_request_send_payload" #10 0x00007ff65e972429 in solr_connection_post_more (post=0xa0ee80, data=<optimized out>, size=size@entry=55453) at solr-connection.c:504 conn = 0xa0be50 __FUNCTION__ = "solr_connection_post_more"
I hope someone can fix the code... I need this feature :) Thanks a lot in advance!