Stuart Henderson stu at spacehopper.org
Mon Feb 8 23:11:01 EET 2021

On 2021-02-08, Joan Moreau <jom at grosjo.net> wrote:
> Well, in the function xxx_build_more of the FTS plugin, the data received is
> the original PDF, not the output of pdftotext.
> Can you clarify where you put your log in the solr plugin, so I can
> check the situation in the xapian plugin?

The log is specific to fts_solr; you set it with e.g.

"fts_solr = url= rawlog_dir=/tmp/solr"

Confirmed it works for me, i.e. it passes the text from inside the PDF, and
not the whole PDF itself.

Did you check that decode2text.sh works ok on your system (when running
as the relevant uid)?

cat foo.pdf | sudo -u dovecot /usr/libexec/dovecot/decode2text.sh application/pdf
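
To tell at a glance whether the script's output is extracted text or the
untouched PDF, something like the following might help. It is only a sketch:
the function name looks_like_raw_pdf is made up for illustration, and it
assumes a raw PDF starts with the usual "%PDF-" header bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of dovecot): exits 0 if stdin begins with
# the PDF header "%PDF-", i.e. the data is probably still a raw PDF rather
# than decoded text. Assumes the header is at the very start of the input.
looks_like_raw_pdf() {
    # Read only the first 5 bytes of stdin; dd's status output is discarded.
    magic=$(dd bs=1 count=5 2>/dev/null)
    [ "$magic" = "%PDF-" ]
}
```

With that defined, the check above could be run as

  cat foo.pdf | sudo -u dovecot /usr/libexec/dovecot/decode2text.sh application/pdf | looks_like_raw_pdf && echo "still raw PDF"

If nothing is printed, the output did not start with a PDF header, which
suggests the decoder produced something other than the original file.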

