[patch] enhancement for tika server protected by user/password basic auth

PGNet Dev pgnet.dev at gmail.com
Sun Nov 15 22:54:28 EET 2020


On 11/15/20 12:21 PM, John Fawcett wrote:
> I'm using tika-server.jar installed as a service

yup. same here.

atm, listening on localhost, with Dovecot -> Tika direct, no proxy.

similarly fragile under load.  throwing ~10 messages with 0.5-5 MB attachments at it at once causes all sorts of complaints.

one at a time seems OK ...
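
A rough sketch of how the one-at-a-time vs. all-at-once behaviour could be reproduced from a script instead of through Dovecot. Assumptions, none of them from the setup above: tika-server on its default port 9998 with the /tika endpoint, optional basic auth in front of it, and a caller-supplied send function. build_tika_request and extract_all are made-up names.

```python
import base64
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: tika-server listening on its default port, no proxy.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data, user=None, password=None, url=TIKA_URL):
    """Build (but do not send) a PUT request asking Tika for plain text."""
    req = urllib.request.Request(url, data=data, method="PUT")
    req.add_header("Accept", "text/plain")
    if user is not None:
        # Only relevant when something in front of Tika enforces basic auth.
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        req.add_header("Authorization", "Basic " + token)
    return req


def extract_all(attachments, send, max_workers=1):
    """Call send() on every attachment with at most max_workers in flight.

    Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, attachments))
```

With max_workers=1 this mimics the one-at-a-time case; raising it to ~10 approximates the burst. A real send function could be something like `lambda data: urllib.request.urlopen(build_tika_request(data)).read()`.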

> Dovecot currently implements separate integrations, first the
> attachments are sent to tika, then the results are sent to solr.

ah, so tika first ...

> The two could even be running on separate servers.

Not sure when that's a useful use case.  I can certainly see a separate, integrated solr+tika server.

Extremely heavy loads, I guess.

> Yes that could be an alternative way, so instead of sending the
> attachments to tika, send the attachments to solr and let it send them
> to tika. It would be more than configuration in Dovecot though.

yup.  taking a look at solr cell + tika integration to see where the config makes most sense.

this is a useful 1st read

   https://lucene.apache.org/solr/guide/8_7/uploading-data-with-solr-cell-using-apache-tika.html
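
For orientation, a minimal sketch of what a request URL for the Solr Cell handler described on that page looks like. Assumptions: the extracting handler is enabled at /update/extract in solrconfig.xml, Solr is on its default port 8983, and the core is called "dovecot" (a placeholder). solr_cell_url is a made-up helper, not anything Dovecot does today.

```python
from urllib.parse import urlencode


def solr_cell_url(base, core, doc_id, extract_only=True):
    """Build the URL for posting one file to Solr's extracting handler.

    extract_only=True asks Solr to return the Tika output without indexing;
    otherwise the document is indexed under doc_id and committed.
    """
    params = {"literal.id": doc_id}
    if extract_only:
        params["extractOnly"] = "true"
    else:
        params["commit"] = "true"
    return f"{base}/{core}/update/extract?{urlencode(params)}"
```

The attachment itself would be sent as the body of a POST to that URL, so the Tika step happens inside Solr rather than in a separate call.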

> Yes, I think limits on Dovecot are useful in any case, otherwise you end
> up sending arbitrary sized files across the network to have them thrown
> away on the server.

point taken.

afaict, fts_solr has only a batch_size limit -- neither a total message size limit nor an attachment size limit.
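
To make the idea concrete, a sketch of the kind of sender-side check being discussed. The two limits are hypothetical parameters, not existing fts_solr or fts_tika settings, and select_attachments is a made-up name.

```python
def select_attachments(sizes, max_attachment, max_total):
    """Return indexes of attachments that fit both limits.

    sizes is a list of attachment sizes in bytes. An attachment is skipped
    if it alone exceeds max_attachment, or if adding it would push the
    running total for the message past max_total.
    """
    selected = []
    total = 0
    for index, size in enumerate(sizes):
        if size > max_attachment:
            continue
        if total + size > max_total:
            continue
        selected.append(index)
        total += size
    return selected
```

Anything not selected would simply never leave the mail server, which is the point made above about not shipping files across the network only to have them thrown away.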

