I'm running,
dovecot --version
2.3.11.3 (502c39af9)
solr -version
8.6.3
uname -rm
5.8.13-200.fc32.x86_64 x86_64
grep _NAME /etc/os-release
PRETTY_NAME="Fedora 32 (Server Edition)"
CPE_NAME="cpe:/o:fedoraproject:fedora:32"
Solr FTS plugin is enabled/configured,
mail_plugins = virtual acl fts fts_solr
plugin {
fts = solr
fts_autoindex = yes
fts_solr = url=https://solr.example.com:8984/solr/dovecot/
fts_enforced = body
fts_filters = normalizer-icu stopwords snowball
fts_language_config = /usr/share/libexttextcat/fpdb.conf
fts_languages = en es de fr it pt
soft_commit = yes
}
IMAP capability returns,
a OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS BINARY MOVE SNIPPET=FUZZY PREVIEW=FUZZY STATUS=SIZE SAVEDATE SPECIAL-USE LITERAL+ NOTIFY SPECIAL-USE QUOTA ACL RIGHTS=texk] Logged in
I've got two messages in my IMAP store,
cd /data/vmail/example.com/myuser/Maildir/cur/
ls -altr | grep S= | /bin/tail -n2
-rw------- 1 vmail vmail 1.3K Oct 11 14:05 1602450306.M393628P65260.mx.example.com,S=1278,W=1304:2,S
-rw------- 1 vmail vmail 1.3K Oct 11 14:05 1602450353.M756184P65260.mx.example.com,S=1277,W=1303:2,S
that differ in BODY CONTENT -- one message has ASCII text with NO accented characters -- the other has the same text, but with ONE accented character,
cat "1602450306.M393628P65260.mx.example.com,S=1278,W=1304:2,S"
...
From: M User <myuser@example.com>
Subject: test
Reply-To: myuser@example.com
To: "User, My" <myuser@example.com>
Message-ID: <6fc7ac30-b460-7dd4-f85d-ca4403ad7188@example.com>
Date: Sun, 11 Oct 2020 14:05:06 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.3.2
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
!!!! también
cat 1602450353.M756184P65260.mx.example.com,S=1277,W=1303:2,S
...
From: M User <myuser@example.com>
Subject: test
Reply-To: myuser@example.com
To: "User, My" <myuser@example.com>
Message-ID: <015b3fb4-46f9-87cc-d541-060db0a13086@example.com>
Date: Sun, 11 Oct 2020 14:05:53 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.3.2
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
!!!! tambien
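To make the difference between the two bodies concrete, here is a minimal sketch in Python (standard library only). It shows that the two words differ at the byte level and that accent folding maps one onto the other. The `fold_accents` helper is hypothetical and only illustrates the idea; it is not what Dovecot or Solr run internally.

```python
import unicodedata


def fold_accents(text):
    """Illustration only: decompose to NFKD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


accented = "tambi\u00e9n"  # body of the first message (precomposed e-acute assumed)
plain = "tambien"          # body of the second message

# The accented letter is two bytes in UTF-8, so the raw terms never compare equal.
print(accented.encode("utf-8"))
print(plain.encode("utf-8"))
print(accented == plain)

# After folding, both spellings reduce to the same term.
print(fold_accents(accented) == fold_accents(plain))
```

If the index stores folded terms but the query is not folded (or the other way around), a mismatch like the one described here would be the expected result, so the folding presumably has to be applied on both the index side and the query side.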
I manually re-scan & index,
doveadm fts rescan -u myuser@example.com
doveadm index -u myuser@example.com -q '*'
...
==> /var/log/dovecot/dovecot-info.log <==
2020-10-11 15:06:34 indexer-worker(myuser@example.com)<OyUmLeqBg18fDAEA+IOfAw>: Info: Indexed 21 messages in accts (UIDs 14399..130699)
2020-10-11 15:06:34 indexer-worker(myuser@example.com)<6NnOMuqBg18fDAEA+IOfAw>: Info: Indexed 16 messages in accts/v007132 (UIDs 13414..14778)
...
with no errors.
Then I search in the mail client, here Thunderbird 78, with
[X] Run Search on Server
for _un_accented "tambien", the match is correctly -- and quickly -- returned.
In the logs,
==> /var/log/dovecot/dovecot-info.log <==
2020-10-11 14:57:05 imap-login: Info: Login: user=<myuser@example.com>, method=PLAIN, rip=10.0.1.7, lip=10.0.1.50, mpid=67743, TLS
2020-10-11 14:57:16 indexer-worker(myuser@example.com)<3ZUzQ2yx2JKsHgsH:9gu0MbF/g1+hCAEA+IOfAw>: Info: Indexed 4788 messages in INBOX (UIDs 135476..140263)
BUT, repeating the search for ACCENTED "también" returns *no* match/result.
No errors in log, simply no match.
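One way to narrow this down would be to ask Solr directly whether each spelling is in the index, bypassing Dovecot. The sketch below only builds the query URLs; it does not contact the server. The core URL is the one from the `fts_solr` setting above. The `select` handler and the `q`, `fl`, `rows` and `wt` parameters are standard Solr. The field names `body`, `uid` and `box` are an assumption (that the core uses a Dovecot-style schema), and `solr_select_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Taken from the fts_solr setting in the plugin block above.
SOLR_CORE_URL = "https://solr.example.com:8984/solr/dovecot/"


def solr_select_url(term, field="body", rows=5):
    """Build a /select URL that searches one field for one term.

    Assumption: the schema has `body`, `uid` and `box` fields.
    """
    params = {
        "q": "%s:%s" % (field, term),
        "fl": "uid,box",   # only return message UID and mailbox
        "rows": rows,
        "wt": "json",
    }
    # urlencode percent-encodes the UTF-8 bytes of the accented character.
    return SOLR_CORE_URL + "select?" + urlencode(params)


for term in ("tambien", "tambi\u00e9n"):
    print(solr_select_url(term))
```

Fetching both URLs (with whatever TLS/auth the Solr instance requires) and comparing `numFound` would show whether the accented term is missing from the index or only failing to match at query time.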
Attempting to test/debug from the cmd line,
doveadm fts lookup -u myuser@example.com body "tambien"
causes a PANIC
doveadm(myuser@example.com): Panic: file mail-storage.c: line 2112 (mailbox_get_open_status): assertion failed: (box->opened)
doveadm(myuser@example.com): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(backtrace_append+0x46) [0x7f3ee94accc6] -> /usr/lib64/dovecot/libdovecot.so.0(backtrace_get+0x22) [0x7f3ee94acde2] -> /usr/lib64/dovecot/libdovecot.so.0(+0x10025b) [0x7f3ee94b625b] -> /usr/lib64/dovecot/libdovecot.so.0(+0x100297) [0x7f3ee94b6297] -> /usr/lib64/dovecot/libdovecot.so.0(+0x59bc6) [0x7f3ee940fbc6] -> /usr/lib64/dovecot/libdovecot-storage.so.0(+0x4779e) [0x7f3ee95c379e] -> /usr/lib64/dovecot/lib21_fts_solr_plugin.so(+0x5849) [0x7f3ee9015849] -> /usr/lib64/dovecot/lib20_fts_plugin.so(fts_backend_lookup+0x51) [0x7f3ee8c37491] -> /usr/lib64/dovecot/doveadm/lib20_doveadm_fts_plugin.so(+0x3280) [0x7f3ee8ba9280] -> doveadm(+0x343cd) [0x5637e99443cd] -> doveadm(+0x34fe0) [0x5637e9944fe0] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x22d) [0x5637e9945e2d] -> doveadm(doveadm_cmd_run_ver2+0x4e8) [0x5637e99568d8] -> doveadm(doveadm_cmd_try_run_ver2+0x3e) [0x5637e995692e] -> doveadm(main+0x1d4) [0x5637e9934cf4] -> /lib64/libc.so.6(__libc_start_main+0xf2) [0x7f3ee9071042] -> doveadm(_start+0x2e) [0x5637e99351ce]
Aborted
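Since `doveadm fts lookup` aborts before returning anything, another way to exercise the server-side search without a mail client would be to send the IMAP SEARCH by hand. This is a minimal sketch using Python's standard `imaplib`. The host, credentials and the choice of INBOX are placeholders, both function names are hypothetical, and it is an assumption that this takes the same FTS path as Thunderbird's "Run Search on Server".

```python
import imaplib


def imap_body_search(conn, term):
    """Run SEARCH CHARSET UTF-8 BODY <term> on an already-selected mailbox."""
    # imaplib appends `literal` to the next command as an IMAP literal,
    # which carries the non-ASCII search key through as raw UTF-8 bytes.
    conn.literal = term.encode("utf-8")
    status, data = conn.search("UTF-8", "BODY")
    if status != "OK":
        raise RuntimeError("SEARCH failed: %r" % (data,))
    # data[0] is a space-separated list of matching message sequence numbers.
    return data[0].split() if data and data[0] else []


def run_against_server(host, user, password, term):
    """Placeholder wrapper: log in, open INBOX read-only, run the search."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)
        return imap_body_search(conn, term)
```

Running `run_against_server(...)` once with "tambien" and once with "también" should show whether the accented search also fails when the client is taken out of the picture.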
(1) What config -- dovecot and/or solr -- is needed to match on accented characters?
(2) What additional detail, if any, is needed for troubleshooting the panic?