On 05/02/2021 15:04, Dean Carpenter wrote:

Is there anything I can do here? This makes Tika unusable :( That really sucks because we have a *lot* of attachments.

Thanks -

On 2021-02-03 2:07 pm, Dean Carpenter wrote:

Just noticed this in the logs

doveadm(harryb@example.com): Debug: http-client[2]: request [Req2: PUT http://localhost/tika/]: Submitted (requests left=1)
doveadm(harryb@example.com): Debug: http-client[2]: request [Req2: PUT http://localhost/tika/]: Waiting for request to finish

The request URL doesn't include the port (9998), even though the connection itself goes to [::1]:9998 ...
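One way to check this observation across a whole captured debug log is sketched below in Python. It only compares the port shown in the `request [...]` lines with the port shown in the `peer` lines. The regular expressions are modelled on the lines quoted in this thread, and the function name `summarize_ports` is hypothetical. Whether the missing port is just how the request is displayed or is related to the panic is not established here.

```python
import re
from urllib.parse import urlsplit

# Patterns modelled on the debug lines quoted in this thread; other
# Dovecot versions may format these lines differently (assumption).
REQUEST_RE = re.compile(r"request \[Req\d+: [A-Z]+ (\S+)\]")
PEER_RE = re.compile(r"peer (\[[^\]]+\]|[^\s:]+):(\d+)")


def summarize_ports(log_lines):
    """Collect the ports seen in request URLs and in peer addresses."""
    request_ports = set()
    peer_ports = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            # urlsplit(...).port is None when the URL has no explicit port
            request_ports.add(urlsplit(match.group(1)).port)
        match = PEER_RE.search(line)
        if match:
            peer_ports.add(int(match.group(2)))
    return {"request_ports": request_ports, "peer_ports": peer_ports}
```

Feeding it the output of `doveadm -D index ...` line by line gives one set of ports per kind of line; a `None` entry in `request_ports` means at least one request URL was logged without a port.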

 

On 2021-02-03 1:59 pm, Dean Carpenter wrote:

Getting panic in http-client-request.c: line 1240 during indexing

I'm testing the installation and setup of Dovecot on an Ubuntu 20.04 (Focal) system, using the Dovecot repo. The installed packages are:

harryb@dove1:~$ dpkg -l dovecot\* solr\* jetty\* tesser\* | fgrep ii
ii dovecot-core 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - core files
ii dovecot-imapd 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - IMAP daemon
ii dovecot-lmtpd 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - LMTP server
ii dovecot-managesieved 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - ManageSieve server
ii dovecot-mysql 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - MySQL support
ii dovecot-sieve 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - Sieve filters support
ii dovecot-solr 2:2.3.13-2+ubuntu20.04 amd64 secure POP3/IMAP server - Solr support
ii jetty9 9.4.26-1 all Java servlet engine and webserver
ii solr-common 3.6.2+dfsg-22 all Enterprise search server based on Lucene3 - common files
ii solr-jetty 3.6.2+dfsg-22 all Enterprise search server based on Lucene3 - Jetty integration
ii tesseract-ocr 4.1.1-2build2 amd64 Tesseract command line OCR tool
ii tesseract-ocr-eng 1:4.00~git30-7274cfa-1 all tesseract-ocr language files for English
ii tesseract-ocr-osd 1:4.00~git30-7274cfa-1 all tesseract-ocr language files for script and orientation

Apache Tika v1.25 is also installed

harryb@dove1:~$ l /usr/local/bin/tika*
-rw-r--r-- 1 root root 79337717 Feb 3 12:40 /usr/local/bin/tika-server-1.25.jar

and is run with fairly standard options:

/usr/bin/java -jar /usr/local/bin/tika-server-1.25.jar --port 9998 --host localhost -enableUnsecureFeatures -enableFileUrl --log info

I use mbsync to pull in sample mailboxes for testing

The Solr config in /etc/dovecot/conf.d/ is as follows:

plugin {
  fts = solr
  fts_solr = url=http://localhost:8693/solr/
  fts_autoindex = yes
  fts_filters = lowercase snowball stopwords
  fts_tokenizers = generic email-address
  fts_tokenizer_generic = algorithm=simple
  fts_autoindex_exclude = \Junk
  fts_autoindex_exclude2 = \Trash
  fts_autoindex_exclude3 = \Spam
}
plugin {
  fts_tika = http://localhost:9998/tika/
}
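
To rule out the Tika server itself, one option is to send a document to the fts_tika URL from outside Dovecot. The sketch below uses only the Python standard library and relies on Tika server accepting a PUT with the document as the request body on its /tika endpoint. The function names `build_tika_request` and `extract_text` are hypothetical, and the `Accept: text/plain` header is an assumption about the desired output format.

```python
import urllib.request


def build_tika_request(url, payload, content_type="application/octet-stream"):
    """Build (but do not send) a PUT request for a Tika server endpoint."""
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type, "Accept": "text/plain"},
    )


def extract_text(url, payload, content_type="application/octet-stream", timeout=30):
    """Send the document bytes to Tika and return the response body as text."""
    req = build_tika_request(url, payload, content_type)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, `extract_text("http://localhost:9998/tika/", b"hello", "text/plain")` sends a small made-up payload to the URL from the config. If that call returns text, Tika is reachable and responding on port 9998, which would point back at the Dovecot side of the exchange.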

Unfortunately, indexing is panicking. Whether I index a test folder manually or just look in /var/log/dovecot.log, I see the following very consistently:

harryb@dove1:~$ sudo doveadm -D index -u harryb@example.com Receipts

Debug: Loading modules from directory: /usr/lib/dovecot/modules
Debug: Module loaded: /usr/lib/dovecot/modules/lib10_mail_crypt_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib15_notify_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib20_fts_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib20_replication_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib20_zlib_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/lib21_fts_solr_plugin.so
Debug: Loading modules from directory: /usr/lib/dovecot/modules/doveadm
Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_plugin.so
Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/libdoveadm_mail_crypt_plugin.so
doveadm(harryb@example.com): Debug: auth-master: passdb lookup(harryb@example.com): Started passdb lookup
doveadm(harryb@example.com): Debug: auth-master: conn unix:/var/run/dovecot/auth-userdb: Connecting
doveadm(harryb@example.com): Debug: auth-master: conn unix:/var/run/dovecot/auth-userdb (pid=165148,uid=0): Client connected (fd=9)
doveadm(harryb@example.com): Debug: auth-master: passdb lookup(harryb@example.com): auth PASS input: user=harryb@example.com
doveadm(harryb@example.com): Debug: auth-master: passdb lookup(harryb@example.com): Finished passdb lookup (user=harryb@example.com )
doveadm(harryb@example.com)<180749><>: Debug: auth-master: userdb lookup(harryb@example.com): Started userdb lookup
doveadm(harryb@example.com)<180749><>: Debug: auth-master: userdb lookup(harryb@example.com): auth USER input: harryb@example.com home=/var/mail/example.com/harryb uid=1001 gid=1001
doveadm(harryb@example.com)<180749><>: Debug: auth-master: userdb lookup(harryb@example.com): Finished userdb lookup (username=harryb@example.com home=/var/mail/example.com/harryb uid=1001 gid=1001)
doveadm(harryb@example.com): Debug: Effective uid=1001, gid=1001, home=/var/mail/example.com/harryb
doveadm(harryb@example.com): Debug: Namespace inbox: type=private, prefix=, sep=, inbox=yes, hidden=no, list=yes, subscriptions=yes location=maildir:~/Maildir:LAYOUT=fs:INBOX=~/Maildir/INBOX
doveadm(harryb@example.com): Debug: fs: root=/var/mail/example.com/harryb/Maildir, index=, indexpvt=, control=, inbox=/var/mail/example.com/harryb/Maildir/INBOX, alt=
doveadm(harryb@example.com): Debug: Mailbox Receipts: Mailbox opened because: index
doveadm(harryb@example.com): Info: Receipts: Caching mails seq=1..72
doveadm(harryb@example.com): Debug: Mailbox Receipts: UID 1: Opened mail because: prefetch
doveadm(harryb@example.com): Debug: Mailbox Receipts: UID 1: Opened mail because: fts indexing
doveadm(harryb@example.com): Debug: http-client: peer [::1]:9998 (shared): Peer created
doveadm(harryb@example.com): Debug: http-client: peer [::1]:9998: Peer pool created
doveadm(harryb@example.com): Debug: http-client[2]: peer [::1]:9998: Peer created
doveadm(harryb@example.com): Debug: http-client[2]: queue http://localhost:9998: Setting up connection to [::1]:9998 (1 requests pending)
doveadm(harryb@example.com): Debug: http-client[2]: peer [::1]:9998: Linked queue http://localhost:9998 (1 queues linked)
doveadm(harryb@example.com): Debug: http-client[2]: queue http://localhost:9998: Started new connection to [::1]:9998
doveadm(harryb@example.com): Debug: http-client[2]: request [Req2: PUT http://localhost/tika/]: Submitted (requests left=1)
doveadm(harryb@example.com): Debug: http-client[2]: request [Req2: PUT http://localhost/tika/]: Waiting for request to finish
doveadm(harryb@example.com): Debug: http-client[2]: peer [::1]:9998: Creating 1 new connections to handle requests (already 0 usable, connecting to 0, closing 0)
doveadm(harryb@example.com): Debug: http-client[2]: peer [::1]:9998: Making new connection 1 of 1 (0 connections exist, 0 pending)
doveadm(harryb@example.com): Debug: http-client: conn [::1]:9998 [2]: Connecting
doveadm(harryb@example.com): Debug: http-client: conn [::1]:9998 [2]: Waiting for connect (fd=20) to finish for max 0 msecs
doveadm(harryb@example.com): Debug: http-client: conn [::1]:9998 [2]: HTTP connection created (1 parallel connections exist)
doveadm(harryb@example.com): Panic: file http-client-request.c: line 1240 (http_client_request_send_more): assertion failed: (req->payload_input != NULL)
doveadm(harryb@example.com): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(backtrace_append+0x41) [0x7ffb60d697a1] -> /usr/lib/dovecot/libdovecot.so.0(backtrace_get+0x22) [0x7ffb60d698c2] -> /usr/lib/dovecot/libdovecot.so.0(+0x104b3b) [0x7ffb60d75b3b] -> /usr/lib/dovecot/libdovecot.so.0(+0x104b77) [0x7ffb60d75b77] -> /usr/lib/dovecot/libdovecot.so.0(+0x5b4d5) [0x7ffb60ccc4d5] -> /usr/lib/dovecot/libdovecot.so.0(+0x5291e) [0x7ffb60cc391e] -> /usr/lib/dovecot/libdovecot.so.0(http_client_connection_output+0xf2) [0x7ffb60d19f82] -> /usr/lib/dovecot/libdovecot.so.0(+0x12b1c5) [0x7ffb60d9c1c5] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x6d) [0x7ffb60d8bd4d] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x145) [0x7ffb60d8d395] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x54) [0x7ffb60d8bdf4] -> /usr/lib/dovecot/libdovecot.so.0(io_loop_run+0x48) [0x7ffb60d8bf68] -> /usr/lib/dovecot/libdovecot.so.0(+0xa482d) [0x7ffb60d1582d] -> /usr/lib/dovecot/libdovecot.so.0(http_client_request_send_payload+0x34) [0x7ffb60d159c4] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0x1068d) [0x7ffb604ef68d] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(fts_parser_more+0x2b) [0x7ffb604ee4ab] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0xd77f) [0x7ffb604ec77f] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(fts_build_mail+0x52) [0x7ffb604ecec2] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0x13573) [0x7ffb604f2573] -> /usr/lib/dovecot/libdovecot-storage.so.0(mail_precache+0x32) [0x7ffb60e92ed2] -> doveadm(+0x3982f) [0x55d3519ed82f] -> doveadm(+0x33e35) [0x55d3519e7e35] -> doveadm(+0x34a16) [0x55d3519e8a16] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x225) [0x55d3519e97b5] -> doveadm(doveadm_cmd_run_ver2+0x500) [0x55d3519f9e50] -> doveadm(doveadm_cmd_try_run_ver2+0x3e) [0x55d3519f9eae] -> doveadm(main+0x1d4) [0x55d3519d8dc4] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7ffb608fd0b3] -> doveadm(_start+0x2e) [0x55d3519d92ae]
Aborted

The key line above is

doveadm(harryb@example.com): Panic: file http-client-request.c: line 1240 (http_client_request_send_more): assertion failed: (req->payload_input != NULL)

It's the same each time. In /var/log/syslog there are usually a couple of extra lines

Feb 03 13:25:17 indexer: Error: Indexer worker disconnected, discarding 1 requests for harryb@example.com
Feb 03 13:25:17 indexer-worker(harryb@example.com)<202106><p+xgBonqGmAIFQMATj4HXg:IvE7Eo3qGmB6FQMATj4HXg>: Fatal: master: service(indexer-worker): child 202106 killed with signal 6 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)

-- 
Dean Carpenter
deano is at areyes dot com
203 six oh four 6644

Hi Dean

I guess you need the patch that Jeff Sipek posted on this list on 19.08.20 at 17:37.

John