[patch] enhancement for tika server protected by user/password basic auth

John Fawcett john at voipsupport.it
Sun Nov 15 21:13:27 EET 2020


On 15/11/2020 18:10, John Fawcett wrote:
> On 15/11/2020 15:49, PGNet Dev wrote:
>> On 11/15/20 6:33 AM, John Fawcett wrote:
>>> I've configured a tika server behind an apache proxy which enforces
>>> basic auth, but sending basic auth credentials for a tika server is not
>>> currently supported by Dovecot.
>> i was _just_ setting up a tika instance behind a nginx proxy with
>> basicauth in place.
>>
>> hadn't yet gotten to the "can't pass auth creds in dovecot" bit.
>> thx! for the patch; hopefully the premise/patch will get picked up.
>> (ya-request for a proper @dovecot public bug/issue queue!)
>>
>> have you found any other 'magic required' to get solr & tika indexing
>> text/attachments, respectively, in Dovecot context?
>> is it as straightforward as spec'ing the 'fts_solr' & 'fts_tika' urls,
>> and Dovecot does the passing-around correctly?
> I've just started using tika myself, but from my tests, it's as simple
> as adding fts_tika to a working solr integration.
>
> John
>
>
Just a couple of updates about Tika and Solr together.

1. On mass reindexing I'm seeing panics - see below. These occur
with Dovecot 2.3.10 and 2.3.11.3. They seem to go away with the fix that
was previously posted on this list by Josef 'Jeff' Sipek, which I repeat
below for ease of reference.

2. On mass reindexing my Tika server seems to get a bit overwhelmed. I
think I'll need to look into how resources are allocated and do some
tuning. This produces 502 Proxy Error responses back to Dovecot.

As for Dovecot's integration with Tika, I believe some resource
limits would be helpful. I think it would make sense to have a limit in
Dovecot on the maximum file size it will try to send to Tika.
It could also be useful to allow configuration of the types of file
sent to Tika. For example, I see lots of image files going across, but
I'd probably be happy not to have them indexed. It wouldn't be perfect,
since those file types could also exist inside zip files, but it might
cut out a bit of the load.
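
To make the idea concrete, here is a rough sketch in C of the kind of
check meant above. It is only an illustration: the struct
tika_send_policy, its fields and the function tika_should_send() are
hypothetical names that do not exist in Dovecot, and no such settings
exist today as far as I know.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/* Hypothetical policy; neither the struct nor its fields exist in Dovecot. */
struct tika_send_policy {
	uintmax_t max_size;			/* in bytes, 0 = no size limit */
	const char *const *skip_prefixes;	/* NULL-terminated, e.g. "image/" */
};

/* Returns true if an attachment with the given Content-Type and size
   should be sent to Tika, false if it should be skipped. */
bool tika_should_send(const struct tika_send_policy *policy,
		      const char *content_type, uintmax_t size)
{
	const char *const *p;

	/* Skip anything larger than the configured limit. */
	if (policy->max_size != 0 && size > policy->max_size)
		return false;
	/* With no known type or no skip list there is nothing to filter on. */
	if (content_type == NULL || policy->skip_prefixes == NULL)
		return true;
	/* MIME types are case-insensitive, so compare prefixes that way. */
	for (p = policy->skip_prefixes; *p != NULL; p++) {
		if (strncasecmp(content_type, *p, strlen(*p)) == 0)
			return false;
	}
	return true;
}
```

With a skip list of { "image/", NULL } and a 10 MB limit, a small PDF
would still be sent, while any image/* part or a larger PDF would be
skipped. As noted, images inside archives would still get through,
since the check only sees the Content-Type of the outer part.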

John


Nov 15 17:58:19 server02 dovecot:
indexer-worker(user at example.com)<11132><kMrwLCpesV98KwAAAJEHgA>: Panic:
file http-client-request.c: line 1235 (http_client_request_send_more):
assertion failed: (req->payload_input != NULL)
Nov 15 17:58:19 server02 dovecot:
indexer-worker(user at example.com)<11132><kMrwLCpesV98KwAAAJEHgA>: Error:
Raw backtrace:
/usr/local/lib/dovecot/libdovecot.so.0(backtrace_append+0x42)
[0x7f87c271adf2] ->
/usr/local/lib/dovecot/libdovecot.so.0(backtrace_get+0x1e)
[0x7f87c271aefe] -> /usr/local/lib/dovecot/libdovecot.so.0(+0xec44e)
[0x7f87c272544e] -> /usr/local/lib/dovecot/libdovecot.so.0(+0xec4f1)
[0x7f87c27254f1] -> /usr/local/lib/dovecot/libdovecot.so.0(i_fatal+0)
[0x7f87c267c4ea] ->
/usr/local/lib/dovecot/libdovecot.so.0(http_client_request_send_more+0x3dd)
[0x7f87c26c449d] ->
/usr/local/lib/dovecot/libdovecot.so.0(http_client_connection_output+0xf1)
[0x7f87c26c8bf1] ->
/usr/local/lib/dovecot/libssl_iostream_openssl.so(+0x918f)
[0x7f87bea4818f] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x115710)
[0x7f87c274e710] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x65)
[0x7f87c273db65] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x12b)
[0x7f87c273f4ab] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x59)
[0x7f87c273dc69] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)
[0x7f87c273dea8] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x8a9c6)
[0x7f87c26c39c6] ->
/usr/local/lib/dovecot/libdovecot.so.0(http_client_request_send_payload+0x2c)
[0x7f87c26c3c4c] -> /usr/local/lib/dovecot/lib20_fts_plugin.so(+0xdbdd)
[0x7f87c1a1abdd] ->
/usr/local/lib/dovecot/lib20_fts_plugin.so(fts_parser_more+0x27)
[0x7f87c1a19b67] -> /usr/local/lib/dovecot/lib20_fts_plugin.so(+0xa951)
[0x7f87c1a17951] ->
/usr/local/lib/dovecot/lib20_fts_plugin.so(fts_build_mail+0x54)
[0x7f87c1a182b4] -> /usr/local/lib/dovecot/lib20_fts_plugin.so(+0x11502)
[0x7f87c1a1e502] ->
/usr/local/lib/dovecot/libdovecot-storage.so.0(mail_precache+0x2e)
[0x7f87c2a2519e] -> dovecot/indexer-worker [user at example.com
Sent](+0x2834) [0x55bd6355f834] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x65)
[0x7f87c273db65] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x12b)
[0x7f87c273f4ab] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x59)
[0x7f87c273dc69] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)
[0x7f87c273dea8] ->
/usr/local/lib/dovecot/libdovecot.so.0(master_service_run+0x13)
[0x7f87c26ad383] -> dovecot/indexer-worker [user at example.com
Sent](main+0xd7) [0x55bd6355f227] ->
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f87c228d555] ->
dovecot/indexer-worker [user at example.com Sent](+0x22ee) [0x55bd6355f2ee]

diff --git a/src/plugins/fts-solr/solr-connection.c b/src/plugins/fts-solr/solr-connection.c
index ae720b5e2870a852c1b6c440939e3c7c0fa72b5c..9d364f93e2cd1b716b9ab61bd39656a6c5b1ea04 100644
--- a/src/plugins/fts-solr/solr-connection.c
+++ b/src/plugins/fts-solr/solr-connection.c
@@ -103,7 +103,7 @@ int solr_connection_init(const struct fts_solr_settings *solr_set,
 		http_set.ssl = ssl_client_set;
 		http_set.debug = solr_set->debug;
 		http_set.rawlog_dir = solr_set->rawlog_dir;
-		solr_http_client = http_client_init(&http_set);
+		solr_http_client = http_client_init_private(&http_set);
 	}
 	*conn_r = conn;
diff --git a/src/plugins/fts/fts-parser-tika.c b/src/plugins/fts/fts-parser-tika.c
index a4b8b5c3034f57e22e77caa759c090da6b62f8ba..b8b57a350b9a710d101ac7ccbcc14560d415d905 100644
--- a/src/plugins/fts/fts-parser-tika.c
+++ b/src/plugins/fts/fts-parser-tika.c
@@ -77,7 +77,7 @@ tika_get_http_client_url(struct mail_user *user, struct http_url **http_url_r)
 		http_set.request_timeout_msecs = 60*1000;
 		http_set.ssl = &ssl_set;
 		http_set.debug = user->mail_debug;
-		tika_http_client = http_client_init(&http_set);
+		tika_http_client = http_client_init_private(&http_set);
 	}
 	*http_url_r = tuser->http_url;
 	return 0;



