Well, thank you for the answer, but the actual issue is that the data sent by the decoder (specified in the conf file) is properly collected by dovecot core, but /not/ sent to the plugin: the plugin receives the original data instead.
This is not linked to a particular plugin (xapian, solr, squat, etc.) but seems to be a general issue in dovecot core.
On 2021-02-08 01:03, John Fawcett wrote:
On 07/02/2021 18:51, Joan Moreau wrote:
more info: the function fts_parser_script_more in plugins/fts/fts-parser.c properly reads the output of the script
still, the data is not sent to the FTS plugins (xapian or any other)
On 2021-02-07 17:37, Joan Moreau wrote:
more info: I am running the git version of dovecot
On 2021-02-07 17:15, Joan Moreau wrote:
a bit more on this: after adding logging to decode2text.sh, I can see that pdftotext outputs the right data, but that data is /not/ transmitted to the fts plugin for indexing (only the original PDF code is)
On 2021-02-07 17:00, Joan Moreau wrote:
Hello,
I am trying to deal properly with email attachments in the fts-xapian plugin.
I tried the default script with a PDF file.
The data I receive in the fts plugin part ("xxx_build_more") is the original document, not the output of pdftotext.
Is there anything I am missing?
Here is my config:
plugin {
  plugin = fts_xapian managesieve sieve
  fts = xapian
  fts_xapian = partial=2 full=20 verbose=1 attachments=1
  fts_autoindex = yes
  fts_enforced = yes
  fts_autoindex_exclude = \Trash
  fts_autoindex_exclude2 = \Drafts
  fts_decoder = decode2text
  sieve = /data/mail/%d/%n/local.sieve
  sieve_after = /data/mail/after.sieve
  sieve_before = /data/mail/before.sieve
  sieve_dir = /data/mail/%d/%n/sieve
  sieve_global_dir = /data/mail
  sieve_global_path = /data/mail/global.sieve
}
...
service decode2text {
  executable = script /usr/libexec/dovecot/decode2text.sh
  user = dovecot
  unix_listener decode2text {
    mode = 0666
  }
}
Thank you
Joan
I'm not sure I can be much use for xapian, but looking at your configuration I did notice some differences with the documentation. I don't know if they are relevant to the issue you're seeing.
First of all I don't see
mail_plugins = fts
plugin = fts
settings which are both mentioned in the xapian documentation.
Also, the documentation states that attachments=1 can only index text attachments. Maybe you should be using attachments=0 and let fts_decoder handle the attachments.
Failing that, I can only advise to turn on some debugging and see what that brings.
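As a rough sketch of what turning on debugging could look like (assuming a Dovecot 2.3-style configuration, untested against your setup), the mail_debug setting enables debug logging for the mail processes, which may include what the fts code does with the decoder output:

```
# Assumption: added to the main dovecot configuration, outside the plugin { } block.
# Enables debug-level logging for mail processes; remove it again once done,
# since it makes the logs considerably more verbose.
mail_debug = yes
```

After reloading dovecot and re-indexing a message with a PDF attachment, the log should show more detail about the indexing run.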
best regards
John