fts_encoder

Joan Moreau jom at grosjo.net
Mon Feb 8 16:22:41 EET 2021


Well, thank you for the answer, but the actual issue is that the data sent 
by the decoder (specified in the conf file) is properly collected by 
dovecot core, but /not/ sent to the plugin: the plugin receives the 
original data.

This is not linked to a particular plugin (xapian, solr, squat, etc.) 
but seems to be a general issue in dovecot core.

On 2021-02-08 01:03, John Fawcett wrote:

> On 07/02/2021 18:51, Joan Moreau wrote:
> 
> more info: the function fts_parser_script_more in 
> plugins/fts/fts-parser.c properly reads the output of the script
> 
> still, the data is not sent to the FTS plugins (xapian or any other)
> 
> On 2021-02-07 17:37, Joan Moreau wrote:
> 
> more info: I am running the dovecot git version
> 
> On 2021-02-07 17:15, Joan Moreau wrote:
> 
> a bit more on this: after adding logging to decode2text.sh, I can see that 
> pdftotext outputs the right data, but that data is /not/ transmitted to 
> the fts plugin for indexing (only the original pdf code is)
> 
> On 2021-02-07 17:00, Joan Moreau wrote:
> 
> Hello,
> 
> I am trying to deal properly with email attachments in the fts-xapian 
> plugin.
> 
> I tried the default script with a PDF file.
> 
> The data I receive in the fts plugin part ("xxx_build_more") is the 
> original document, not the output of pdftotext
> 
> Is there anything I am missing?
> 
> Here is my config:
> 
> plugin {
>   plugin = fts_xapian managesieve sieve
> 
>   fts = xapian
>   fts_xapian = partial=2 full=20 verbose=1 attachments=1
> 
>   fts_autoindex = yes
>   fts_enforced = yes
>   fts_autoindex_exclude = \Trash
>   fts_autoindex_exclude2 = \Drafts
> 
>   fts_decoder = decode2text
> 
>   sieve = /data/mail/%d/%n/local.sieve
>   sieve_after = /data/mail/after.sieve
>   sieve_before = /data/mail/before.sieve
>   sieve_dir = /data/mail/%d/%n/sieve
>   sieve_global_dir = /data/mail
>   sieve_global_path = /data/mail/global.sieve
> }
> 
> ...
> 
> service decode2text {
>   executable = script /usr/libexec/dovecot/decode2text.sh
>   user = dovecot
>   unix_listener decode2text {
>     mode = 0666
>   }
> }
> 
> Thank you

Joan

I'm not sure I can be of much use with xapian, but looking at your 
configuration I did notice some differences from the documentation. I 
don't know if they are relevant to the issue you're seeing.

First of all I don't see

mail_plugins = fts

plugin = fts

settings which are both mentioned in the xapian documentation.

Also the documentation states that attachments=1 can only index text 
attachments. Maybe you should be using attachments=0 and let fts_decoder 
handle the attachments.
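
For illustration, the two suggestions above might look roughly like this 
when combined. This is only a sketch, not a tested configuration: the exact 
mail_plugins value, and the assumption that attachments=0 leaves attachment 
handling to fts_decoder, should be checked against the fts-xapian 
documentation for the installed version.

```
# Sketch only, not a verified configuration.
# Global setting, placed outside any plugin { } block
# (assumption: dovecot 2.3 style syntax).
mail_plugins = $mail_plugins fts fts_xapian

plugin {
  fts = xapian
  # attachments=0: assumed to leave attachment decoding to fts_decoder
  fts_xapian = partial=2 full=20 verbose=1 attachments=0
  fts_decoder = decode2text
}
```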

Failing that, I can only advise turning on some debugging and seeing what 
that brings.
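
As one possible starting point (an assumption, since no specific debug 
setting is named here), dovecot's general mail debugging can be enabled 
and the log watched while a message with a PDF attachment is reindexed:

```
# Sketch: enables verbose debug logging for mail processes
mail_debug = yes
```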

best regards

John