v2.3.11.3 solr plugin search via MUA fails to match accented ascii characters; cmd line exec of `doveadm fts lookup` PANICs (assertion failed) [proposed patch]

John Fawcett john at voipsupport.it
Sun Nov 1 10:56:58 EET 2020


On 31/10/2020 22:01, PGNet Dev wrote:
> On 10/31/20 9:55 AM, John Fawcett wrote:
>> I can contribute a patch that solves the segfault. Unfortunately though
>> fts search may be more broken than this. It does not give me search
>> results, even though I see it querying solr and getting hits.
>
> Thx -- hopefully it moves this in the right direction.
>
> Also on the 'good news' page, it appears there's been some progress on
> Thunderbird's use of backend/server search,
>
>     TBird "search on server" doesn't -- NO comm with backend IMAP/SOLR;
>     appears to be local-only search
>      https://bugzilla.mozilla.org/show_bug.cgi?id=1673928
>
>     "A fix for this is upcoming."
>
> Remains to be seen if the doveadm search issues, and implications on
> backend problems, have any effect on the Thunderbird searches.

At the moment I don't see any other corrections needed in dovecot apart
from the command line doveadm fts lookup, which is not a show stopper.
On my simple config I can confirm that doveadm search finds both
accented and non accented characters correctly, and so does a search
over an imap connection. For the imap test you can take Thunderbird out
of the equation by using another imap client. For example, this short
php script (it relies on the php imap extension being installed) can be
run from the command line with

php -f filename.php

and for me produces the same results as doveadm search.

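<?php
// Open the mailbox read-only over IMAPS (needs the php imap extension).
$conn = imap_open('{server.example.com:993/imap/ssl}INBOX',
                  'username', 'password', OP_READONLY);
// SE_UID makes imap_search() return UIDs; it returns false on no match.
$uids = imap_search($conn, 'BODY "también"', SE_UID);
// Print the matching UIDs (nothing is printed when $uids is false).
print_r($uids);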

The only thing I cannot vouch for is what happens when the dovecot fts
library and its config come into the equation, because my setup
delegates almost everything to solr.

Can you get evidence of things not working? For example, tests run with
soft_commit configured (that's important, since without it updates
don't show up immediately in searches) which show, via the solr log,
that the update is happening in solr but that search then fails on
accented characters, despite working on other text in the same message.
The solr logs also show whether the text was found, via the "hits="
value in the logged searches, for example:

2020-11-01 08:32:42.231 INFO  (qtp24119573-21) [   x:dovecot]
o.a.s.c.S.Request [dovecot]  webapp=/solr path=/select
params={q={!lucene+q.op%3DAND}body:también&fl=uid,score&sort=uid+asc&fq=%2Bbox:b1626f0fe8d9145e54100000c54a863a+%2Buser:john at voipsupport.it&rows=3202&wt=xml}
hits=3 status=0 QTime=3

If solr reports no hits, then dovecot cannot be expected to display
results. In that case it may still be an indexing problem though.
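
To make the "hits=" check quicker, something like the following could
pull the timestamp and hit count out of every logged search. It is only
a sketch: the function name solr_hits and the log path in the usage
comment are my own placeholders, and it assumes each request sits on a
single line in the real log (the sample above was wrapped by the mail
client).

```shell
# Print "DATE TIME HITS" for every search request in a Solr log.
# Reads the files given as arguments, or stdin when none are given.
# Assumes one request per log line, with a separate "hits=N" field.
solr_hits() {
    awk '/path=\/select/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^hits=[0-9]+$/) {
                sub(/^hits=/, "", $i)   # keep only the number
                print $1, $2, $i        # date, time, hit count
            }
    }' "$@"
}

# Usage (hypothetical log location, adjust to your install):
#   solr_hits /var/solr/logs/solr.log
```

A search that reached solr but found nothing shows up with a hit count
of 0, which is the case worth comparing against what doveadm search
returns for the same text.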

John



