large search indexer tasks, submitted to flatcurve+tika+tesseract backend for attachment scanning, timeout even with "fts_index_timeout = 0"; how to increase/remove timeouts?
PGNet Dev
pgnet.dev at gmail.com
Thu Jul 28 13:53:47 UTC 2022
On 7/27/22 3:15 PM, Michael Slusarz wrote:
>> where do I set that timeout to not fail, as above, on large index tasks?
>
> You need to change the source, as Tika has a hardcoded 60 second HTTP request limit.
>
> https://github.com/dovecot/core/blob/release-2.3.19/src/plugins/fts/fts-parser-tika.c#L76
Thanks. For now, that can be done.
Can you clarify *why* it's a hardcoded limit, rather than a settable param?
The Tika backend can be configured to process attachments handed off by dovecot, with or without OCR, within user-def'd min/max size limits,
and it passes the parsed results back for indexing by (fts-)flatcurve, (fts-)solr, etc.
There are clearly occasions where those limits can be / are exceeded. IIUC, when the request fails on the dovecot timeout, the submit is not retried in any form -- e.g., with a conditionally scaled timeout, or a user-def'd timeout.
It seems, in such cases, that making the timeout -- and perhaps other tika params in dovecot? -- configurable would be useful.
Can this be considered for dovecot? Or, is there some reason that it can't, or shouldn't?
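To illustrate the "conditionally scaled timeout" retry idea mentioned above, here is a rough sketch in Python. It is not dovecot code and does not reflect how fts-parser-tika.c works; the function names, the per-MiB allowance, the cap, and the backoff factor are all hypothetical values chosen for illustration. The `submit` callable stands in for whatever actually sends the attachment to Tika.

```python
from typing import Callable, Optional


def scaled_timeout(size_bytes: int, base_secs: float = 60.0,
                   secs_per_mib: float = 2.0, max_secs: float = 1800.0) -> float:
    """Hypothetical policy: base timeout plus a per-MiB allowance, capped at max_secs."""
    mib = size_bytes / (1024 * 1024)
    return min(base_secs + secs_per_mib * mib, max_secs)


def submit_with_retry(submit: Callable[[bytes, float], str], payload: bytes,
                      max_attempts: int = 3, backoff: float = 2.0) -> Optional[str]:
    """Call submit(payload, timeout); on TimeoutError, retry with a longer timeout.

    Returns the parsed text, or None if every attempt timed out.
    The retry timeout is multiplied by `backoff` each time and is not capped here.
    """
    timeout = scaled_timeout(len(payload))
    for _ in range(max_attempts):
        try:
            return submit(payload, timeout)
        except TimeoutError:
            timeout *= backoff  # give the next attempt more time
    return None
```

With something like this, a large OCR job that needs a few minutes would get a second and third chance instead of being dropped after the first 60 seconds, while small attachments keep the short default.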