More info: the function fts_parser_script_more in plugins/fts/fts-parser.c properly reads the output of the script.
Still, the data is not sent to the FTS plugins (xapian or any other).
More info: I am running the dovecot git version.
On 2021-02-07 17:15, Joan Moreau wrote:
A bit more on this: after adding logging to decode2text.sh, I can see that pdftotext outputs the right data, but that data is /not/ transmitted to the fts plugin for indexing (only the original PDF code is).
On 2021-02-07 17:00, Joan Moreau wrote:
Hello,
I am trying to deal properly with email attachments in the fts-xapian plugin.
I tried the default script with a PDF file.
The data I receive in the fts plugin part ("xxx_build_more") is the original document, not the output of pdftotext.
Is there anything I am missing?
Here is my config:
plugin {
plugin = fts_xapian managesieve sieve
fts = xapian
fts_xapian = partial=2 full=20 verbose=1 attachments=1
fts_autoindex = yes
fts_enforced = yes
fts_autoindex_exclude = \Trash
fts_autoindex_exclude2 = \Drafts
fts_decoder = decode2text
sieve = /data/mail/%d/%n/local.sieve
sieve_after = /data/mail/after.sieve
sieve_before = /data/mail/before.sieve
sieve_dir = /data/mail/%d/%n/sieve
sieve_global_dir = /data/mail
sieve_global_path = /data/mail/global.sieve
}
...
service decode2text {
executable = script /usr/libexec/dovecot/decode2text.sh
user = dovecot
unix_listener decode2text {
mode = 0666
}
}
Thank you
Joan
I'm not sure I can be much use for xapian, but looking at your configuration I did notice some differences with the documentation. I don't know if they are relevant to the issue you're seeing.
First of all I don't see
mail_plugins = fts
plugin = fts
settings which are both mentioned in the xapian documentation.
Also, the documentation states that attachments=1 can only index text attachments. Maybe you should be using attachments=0 and let fts_decoder handle the attachments.
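A minimal sketch of what that change might look like, reusing the values from the configuration quoted above. The mail_plugins line is an assumption based on the fts-xapian documentation and would need to be merged with whatever plugins are already loaded. That attachments=0 hands attachment handling over to the decoder is also an assumption, and the sketch is untested against the reported issue.

```
# Assumption: load the generic fts plugin together with the xapian backend
mail_plugins = $mail_plugins fts fts_xapian

plugin {
  fts = xapian
  # attachments=0: assumed to leave attachment text extraction to fts_decoder
  fts_xapian = partial=2 full=20 verbose=1 attachments=0
  fts_decoder = decode2text
}
```

Messages indexed before such a change would presumably need to be reindexed before any difference shows up in search results.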
Failing that, I can only advise turning on some debugging and seeing what that brings.
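As a starting point for the debugging, a minimal sketch assuming a Dovecot 2.3-style configuration, where mail_debug is the standard setting for debug-level logging in mail processes. A git build may organise its logging settings differently, so treat the setting name as an assumption for that version.

```
# Debug-level logging for mail processes, including plugin loading
mail_debug = yes
```

With this enabled, the log may show whether the fts plugins are actually loaded and whether the decode2text service is being contacted during indexing.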
best regards
John