FTS config - cannot search substrings
Hi,
I have an issue with FTS, maybe someone here can help me out.
Basically my setup is working with Solr + Tika enabled; however, it seems my search is not configured the way I expect. E.g., when I receive a PDF document containing the word "myfoobar", I cannot search for "myfoo" or "foobar", but searching for "myfoobar" shows the right result.
Any ideas? Thanks!
Sometimes ignoring the problem for a few moments and then starting over again helps ...
The problem was already solved in a mail to this list over 10 years ago: https://dovecot.org/list/dovecot/2011-May/059338.html
Thanks Daniel Miller!
On 11/18/2021 4:43 PM, infoomatic wrote:
> sometimes ignoring the problem for a few moments and then starting over again helps ...
> the solution was already solved here in a mail over 10 years ago - https://dovecot.org/list/dovecot/2011-May/059338.html
> Thanks Daniel Miller!
There's something to keep in mind here. This is the critical piece that makes the substring search work:
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
Because the minimum size is 3, that config means that it is impossible to search for one or two letter words -- terms shorter than minGramSize are dropped. This could affect the quality of search results.
If you are running at least version 7.4.0 of Solr, you can add this to the config for that filter so that isn't a problem:
preserveOriginal="true"
https://javadoc.io/static/org.apache.lucene/lucene-analyzers-common/7.4.0/or...
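To show where that attribute goes, here is a rough sketch of a field type with the n-gram filter on the index side only. The fieldType name "text_ngram" and the choice of tokenizer and lowercase filter are assumptions for illustration and are not taken from the linked mail; only the NGramFilterFactory line and its attributes come from this thread. Adapt it to whatever your existing schema already uses.

```xml
<!-- Sketch only: the name "text_ngram" and the tokenizer/lowercase steps are
     assumptions, not the configuration from the linked mail. -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index every 3 to 15 character substring of each token.
         preserveOriginal (Solr 7.4.0 or later) also keeps tokens that are
         shorter than minGramSize or longer than maxGramSize. -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15" preserveOriginal="true"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gram filter here: the query term is matched as typed
         against the grams stored in the index. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With something like this, a query for "myfoo" should match the indexed gram "myfoo" from "myfoobar", and a two-letter word is still indexed as-is because of preserveOriginal. Changing the index analyzer only affects newly indexed mail, so existing mailboxes need to be reindexed.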
Thanks, Shawn
participants (2): infoomatic, Shawn Heisey