Re: v2.3.11.3 solr plugin search via MUA fails to match accented ascii characters; cmd line exec of `doveadm fts lookup` PANICs (assertion failed) [proposed patch]
On 02/11/2020 15:11 PGNet Dev <pgnet.dev@gmail.com> wrote:
On 11/2/20 12:44 AM, Aki Tuomi wrote:
you should try removing use_libfts from your config line and let solr do that part.
sry, i'm a bit confused.
you'd suggested I _add_ it,
https://dovecot.org/pipermail/dovecot/2020-October/120258.html
I can reproduce your problem with the fts lookup command. Luckily it's equivalent to running doveadm search. I'll open a bug about this.
Dovecot FTS tokenization is not done, unless you have use_libfts in fts_solr setting, in your case
fts_solr = url=https://solr.example.com:8984/solr/dovecot/ use_libfts
Without this, everything is sent to solr as-is, which is then expected to do all the work.
So what's the recommendation? use use_libfts, or not?
It's a choice. You can let solr perform the tokenization etc. or you can let dovecot do it. There is no recommendation when using solr.
It seems that use_libfts is broken with solr due to reasons, so I guess the only option for now is not to use it.
Aki
On 11/2/20 5:13 AM, Aki Tuomi wrote:
So what's the recommendation? use use_libfts, or not?
It's a choice. You can let solr perform the tokenization etc. or you can let dovecot do it. There is no recommendation when using solr.
atm, my fts plugin conf is
plugin {
  fts = solr
  fts_solr = url=https://solr.example.com:8984/solr/dovecot/ use_libfts soft_commit=yes batch_size=250
  fts_autoindex = yes
  fts_autoindex_max_recent_msgs = 999
  fts_autoindex_exclude = \Junk
  fts_autoindex_exclude2 = \Trash
  fts_enforced = yes
  fts_filters = normalizer-icu snowball stopwords
  fts_filters_en = lowercase snowball english-possessive stopwords
  fts_languages = en es de fr it pt
  fts_language_config = /usr/share/libexttextcat/fpdb.conf
  fts_tokenizers = generic email-address
  fts_tokenizer_generic = algorithm=simple
}
It seems that use_libfts is broken with solr due to reasons, so I guess the only option for now is not to use it.
if I change
fts_solr = url=https://solr.example.com:8984/solr/dovecot/ use_libfts soft_commit=yes batch_size=250
to
fts_solr = url=https://solr.example.com:8984/solr/dovecot/ soft_commit=yes batch_size=250
how much of that^ config needs to be *removed* &/or simply stops functioning?
from its introduction
dovecot-2.2: fts-solr: fts_solr=use_libfts send data to Solr via...
https://dovecot.org/list/dovecot-cvs/2015-April/025715.html
fts-solr: fts_solr=use_libfts send data to Solr via space-separated tokens. In this case Solr should be configured to not do any kind of filtering and use only WhitespaceTokenizerFactory
it's unclear to me what the effect of NOT using it is.
Reading
https://doc.dovecot.org/configuration_manual/fts/?highlight=fts%20solr%20plugin#dovecot-fts-architecture
https://doc.dovecot.org/configuration_manual/fts/tokenization/#fts-tokenization
refers to all of
fts_languages
fts_tokenizers
fts_tokenizer_generic
fts_filters
fts_filters_en
WithOUT 'use_libfts' which of those^ need modification/removal from dovecot config?
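For illustration only, here is a rough Python sketch of the division of labour described in the use_libfts commit message quoted above: the client tokenizes and normalizes, sends space-separated tokens, and the server side only splits on whitespace. This is not Dovecot's lib-fts code and not Solr's analyzer; the function names are hypothetical, and the NFKD accent folding is only an assumed stand-in for what a normalizer filter such as normalizer-icu might do.

```python
import unicodedata


def libfts_style_tokens(text):
    """Hypothetical client-side step: fold accents, lowercase, and
    return the tokens joined by single spaces."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Turn everything that is not a letter or digit into a space.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in folded.lower())
    return " ".join(cleaned.split())


def whitespace_tokenize(payload):
    """Hypothetical server-side step: split on whitespace, no filtering."""
    return payload.split()


# The same normalization has to be applied at index time and at query time.
indexed = whitespace_tokenize(libfts_style_tokens("Café Zürich, résumé attached"))
query = whitespace_tokenize(libfts_style_tokens("cafe"))

print(indexed)
print(all(term in indexed for term in query))
```

In this sketch an accented word only matches an unaccented query because both the indexed text and the query pass through the same folding step. Whichever side does the tokenization, that consistency is what matters; running the normalization on only one of the two paths would be expected to produce mismatches like the one in the subject line.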
participants (2): Aki Tuomi, PGNet Dev