<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 07/02/2021 18:51, Joan Moreau wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:fac791d5a8e987afe0ea6afe7dc9dec7@grosjo.net">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <p>More info: the function fts_parser_script_more in
        plugins/fts/fts-parser.c properly reads the output of the script.</p>
      <p>Still, the data is not sent to the FTS plugins (xapian or any
        other).</p>
      <p><br>
      </p>
      <p><br>
      </p>
      <p id="reply-intro">On 2021-02-07 17:37, Joan Moreau wrote:</p>
      <blockquote type="cite" style="padding: 0 0.4em; border-left:
        #1010ff 2px solid; margin: 0">
        <div id="replybody1">
          <div style="font-size: 9pt; font-family:
            Verdana,Geneva,sans-serif;">
            <p>More info: I am running the git version of dovecot.</p>
            <p><br>
            </p>
            <p id="v1reply-intro">On 2021-02-07 17:15, Joan Moreau
              wrote:</p>
            <blockquote style="padding: 0 0.4em; border-left: #1010ff
              2px solid; margin: 0;">
              <div id="v1replybody1">
                <div style="font-size: 9pt; font-family:
                  Verdana,Geneva,sans-serif;">
                  <p>A bit more on this: after adding logging to
                    decode2text.sh, I can see that pdftotext outputs the
                    right data, but that data is /not/ transmitted to
                    the fts plugin for indexing (only the original PDF
                    code is).</p>
                  <p><br>
                  </p>
                  <p><br>
                  </p>
                  <p id="v1v1reply-intro">On 2021-02-07 17:00, Joan
                    Moreau wrote:</p>
                  <blockquote style="padding: 0 0.4em; border-left:
                    #1010ff 2px solid; margin: 0;">
                    <div id="v1v1replybody1">
                      <div style="font-size: 9pt; font-family:
                        Verdana,Geneva,sans-serif;">
                        <p>Hello,</p>
                        <p>I am trying to deal properly with email
                          attachments in the fts-xapian plugin.</p>
                        <p>I tried the default script with a PDF file.</p>
                        <p>The data I receive in the fts plugin part
                          ("xxx_build_more") is the original document,
                          not the output of pdftotext.</p>
                        <p>Is there anything I am missing?</p>
                        <p>Here is my config:</p>
                        <p><br>
                        </p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">plugin {</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        plugin =
                            fts_xapian managesieve sieve</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">        fts = xapian</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        fts_xapian =
                            partial=2 full=20 verbose=1 attachments=1</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">        fts_autoindex =
                            yes</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        fts_enforced =
                            yes</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">       
                            fts_autoindex_exclude = \Trash</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">       
                            fts_autoindex_exclude2 = \Drafts</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">        fts_decoder =
                            decode2text</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">        sieve =
                            /data/mail/%d/%n/local.sieve</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        sieve_after =
                            /data/mail/after.sieve</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        sieve_before =
                            /data/mail/before.sieve</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">        sieve_dir =
                            /data/mail/%d/%n/sieve</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">       
                            sieve_global_dir = /data/mail</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">       
                            sieve_global_path = /data/mail/global.sieve</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">}</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">...</span></p>
                        <p><span style="font-family: 'courier new',
                            courier, monospace;">service decode2text {</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">   executable = script
                            /usr/libexec/dovecot/decode2text.sh</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">   user = dovecot</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">   unix_listener
                            decode2text {</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">     mode = 0666</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">   }</span><br>
                          <span style="font-family: 'courier new',
                            courier, monospace;">}</span></p>
                        <p><br>
                        </p>
                        <p>Thank you</p>
                        <p><br>
                        </p>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </blockquote>
          </div>
        </div>
      </blockquote>
    </blockquote>
    <p>Joan<br>
    </p>
    <p>I'm not sure I can be much use for xapian, but looking at your
      configuration I did notice some differences with the
      documentation. I don't know if they are relevant to the issue
      you're seeing.</p>
    <p>First of all, I don't see the</p>
    <pre><code>mail_plugins = fts
plugin = fts</code></pre>
    <p>settings, which are both mentioned in the xapian documentation.<br>
    </p>
    <p>Also, the documentation states that attachments=1 can only index
      text attachments. Maybe you should use attachments=0 and let
      fts_decoder handle the attachments.</p>
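    <p>As a rough sketch of that suggestion, assuming the rest of your
      plugin block stays exactly as you posted it, only the attachments
      flag would change. I have not tested this with xapian, so whether
      it changes what reaches the plugin is a guess on my part:</p>
    <pre><code>plugin {
        fts = xapian
        # assumption: attachments=0 stops fts_xapian from handling attachments itself
        fts_xapian = partial=2 full=20 verbose=1 attachments=0
        # unchanged from your config: text conversion is left to the decode2text service
        fts_decoder = decode2text
}</code></pre>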
    <p>Failing that, I can only suggest turning on some debugging and
      seeing what that brings.</p>
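    <p>For the debugging, a minimal sketch would be the standard dovecot
      setting below. It goes at the top level of the configuration, not
      inside the plugin block. I'm assuming the decoder and fts messages
      you need will show up in the debug log, which I have not verified
      for xapian:</p>
    <pre><code># enable debug logging for mail processes
mail_debug = yes</code></pre>
    <p>After reloading dovecot and reindexing a test mailbox that
      contains a PDF, "doveadm log find" should show which files the
      debug messages end up in.</p>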
    <p>best regards</p>
    <p>John<br>
    </p>
    <p><br>
    </p>
  </body>
</html>