On 19/10/2020 02:49 PGNet Dev pgnet.dev@gmail.com wrote:
I've since rebuilt/reconfig'd all parts of my setup from scratch; some good cleanup along the way.
Atm, my entire system for send/recv, store/retrieve, + rules & search is working as I intend. Ok, mostly ...
Except for this accented-character search mystery. I've got a _lot_ of mail with various languages in bodies, so _do_ need to get this sorted.
On 10/18/20 2:58 PM, John Fawcett wrote: ... silly question ...
hardly!
creating 2 messages
(1) Subject: tambien Body: tambien
(2) Subject: también Body: también
and two more, to avoid known stop words
(3) Subject: aausdfrhyetdwgyatrdf Body: aausdfrhyetdwgyatrdf
(4) Subject: aausdfrhyétdwgyatrdf Body: aausdfrhyétdwgyatrdf
1st,
doveadm fts rescan -u myuser@example.com
doveadm index -u myuser@example.com -q '*'
TBird/solr searches,
Subject: tambien ==> FOUND
Subject: también ==> FOUND
Subject: aausdfrhyetdwgyatrdf ==> FOUND
Subject: aausdfrhyétdwgyatrdf ==> FOUND
Body: tambien ==> FOUND
Body: también ==> (empty)
Body: aausdfrhyetdwgyatrdf ==> FOUND
Body: aausdfrhyétdwgyatrdf ==> (empty)
suggests it's _not_ (just) an existing-stopword problem
notable/odd that subject searches are OK, but not body.
On 10/18/20 2:58 PM, Shawn Heisey wrote: ...
If you are using something like the following schema: https://raw.githubusercontent.com/dovecot/core/master/doc/solr-schema-7.7.0....
I am
Solr does have a set of ICU filters, which I would recommend using rather than the lowercase filter
I'll give that a try ; haven't used solr outside of the dovecot context -- so need to find a doc/example on how, exactly, that's done correctly.
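For illustration only, here is a rough sketch of what accent folding is meant to achieve for the four test words above. It uses Python's standard unicodedata module as an approximation; it is not ICU and not Solr's own filter code, and the helper name fold_accents is made up for this example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Approximate accent folding: decompose, drop combining marks, casefold.

    Hypothetical helper for illustration; real ICU folding covers far more cases.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# The four test words from the messages created above.
for word in ("tambien", "también", "aausdfrhyetdwgyatrdf", "aausdfrhyétdwgyatrdf"):
    print(word, "->", fold_accents(word))
```

If the same kind of folding is applied both when indexing and when querying, the accented and unaccented forms end up as the same term, which is the behaviour the body searches are currently missing.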
I cannot say much about the panic you're getting when using the doveadm command. The stacktrace says it is happening in dovecot code, not Solr code. And it looks like the panic had nothing to do with FTS or Solr ... what I see points to mailbox storage code.
again/still
doveadm fts lookup -u myuser@example.com <any key> "<any str>"
_all_ panic, as above,
doveadm(myuser@example.com): Panic: file mail-storage.c: line 2112 (mailbox_get_open_status): assertion failed: (box->opened)
doveadm(myuser@example.com): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(backtrace_append+0x46) [0x7f61bba4ecc6] -> /usr/lib64/dovecot/libdovecot.so.0(backtrace_get+0x22) [0x7f61bba4ede2] -> /usr/lib64/dovecot/libdovecot.so.0(+0x10025b) [0x7f61bba5825b] -> /usr/lib64/dovecot/libdovecot.so.0(+0x100297) [0x7f61bba58297] -> /usr/lib64/dovecot/libdovecot.so.0(+0x59bc6) [0x7f61bb9b1bc6] -> /usr/lib64/dovecot/libdovecot-storage.so.0(+0x4779e) [0x7f61bbb6579e] -> /usr/lib64/dovecot/lib21_fts_solr_plugin.so(+0x5849) [0x7f61bb5b7849] -> /usr/lib64/dovecot/lib20_fts_plugin.so(fts_backend_lookup+0x51) [0x7f61bb1d9491] -> /usr/lib64/dovecot/doveadm/lib20_doveadm_fts_plugin.so(+0x3280) [0x7f61bb14b280] -> doveadm(+0x343cd) [0x55f5def873cd] -> doveadm(+0x34fe0) [0x55f5def87fe0] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x22d) [0x55f5def88e2d] -> doveadm(doveadm_cmd_run_ver2+0x4e8) [0x55f5def998d8] -> doveadm(doveadm_cmd_try_run_ver2+0x3e) [0x55f5def9992e] -> doveadm(main+0x1d4) [0x55f5def77cf4] -> /lib64/libc.so.6(__libc_start_main+0xf2) [0x7f61bb613042] -> doveadm(_start+0x2e) [0x55f5def781ce]
Aborted
Hopefully dovecot devs might comment further.
I'll see what I find with using the ICU filters -- if perhaps anything changes
Hi!
I can reproduce your problem with the fts lookup command. Luckily it's equivalent to running doveadm search. I'll open a bug about this.
Dovecot FTS tokenization is not done unless you have use_libfts in the fts_solr setting; in your case:
fts_solr = url=https://solr.example.com:8984/solr/dovecot/ use_libfts
Without this, everything is sent to Solr as-is, which is then expected to do all the work.
Aki
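
Since Solr does all the work when use_libfts is absent, one way to narrow this down is to query Solr directly and compare accented and unaccented body terms. Below is a minimal sketch that only builds the query URLs; it does not contact any server. It assumes the core URL from the fts_solr setting above, Solr's standard select handler, and a field named body as in the Dovecot-supplied schema. The helper name solr_select_url is hypothetical.

```python
from urllib.parse import urlencode, urljoin

# Core URL taken from the fts_solr setting quoted above.
SOLR_CORE_URL = "https://solr.example.com:8984/solr/dovecot/"


def solr_select_url(field: str, term: str, rows: int = 5) -> str:
    """Build a URL for Solr's select handler (hypothetical helper).

    The field name is an assumption; check it against the schema in use.
    """
    params = {"q": f'{field}:"{term}"', "rows": rows, "wt": "json"}
    return urljoin(SOLR_CORE_URL, "select") + "?" + urlencode(params)


# Compare what Solr returns for the accented and unaccented body terms.
for term in ("tambien", "también"):
    print(solr_select_url("body", term))
```

Fetching both URLs with any HTTP client and comparing the hit counts would show whether the mismatch is already present in the Solr index, independent of Dovecot and the mail client.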