On 8/24/2021 7:19 PM, Steve Dondley wrote:
THE PROBLEM: When I do a full text search through all my inbox and all subfolders on a single word, search results are returned in about 10 to 15 seconds. This is better than the 40 seconds or so I'm getting when I turn off the fts and fts_solr plugins but still a little disappointing.
I did some experimenting. I noticed that if the word I'm searching on is fairly rare, results will pop up quickly, like in around 3 to 5 seconds. Words that don't exist at all in any email return nothing almost instantly.
But words that appear in several hundred emails are the ones that take a much longer time.
This is off-topic for this list, but I will try to help you. If I am unsuccessful, you should raise the issue on the solr-users mailing list.
How much of the total server memory of 4GB did you give to Solr for its heap? Is there other software running on that server besides Solr?
What's the total size of all the Solr indexes on the Solr server?
Can you get the screenshot mentioned at the following URL, put it on a file-sharing site, and give me the URL?
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#Sol...
(disclaimer: I wrote that Solr wiki page)
(You should read the entire page; the link above is to the section describing useful screenshots.)
General note: A Solr search that takes 3 seconds (let alone 15) would have me concerned. If the system is sized appropriately, I would expect a search even on a massive index to complete in less than a second.
I happen to be using Solr for dovecot myself. If I search my index for "the", which is very common in English text, the query takes 19 milliseconds, and that is while searching five fields and doing a facet on the user field. My Solr index has 150048 messages (122K of those are in my personal mailbox) and takes up 628 megabytes of disk space. The total size of the email that is indexed is 7 gigabytes.
<str name="parsedquery_toString">+(cc:the | from:the | to:the | body:the | subject:the)</str>
My index is using the stopword filter but the list of stopwords is empty.
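For anyone who wants to try a comparable query against their own index, here is a minimal sketch in Python that builds the request URL. The host, port, and core name (localhost:8983, "dovecot") are assumptions, not my actual setup, and the function name is made up for illustration. The field names come from the parsed query above. Note that this sketch joins explicit field:term clauses with OR, which is not necessarily the same query parser that produced the parsed query shown above.

```python
from urllib.parse import urlencode

# Field names taken from the parsed query shown above.
SEARCH_FIELDS = ["cc", "from", "to", "body", "subject"]

# Hypothetical endpoint: host, port, and core name are assumptions.
DEFAULT_SELECT_URL = "http://localhost:8983/solr/dovecot/select"


def build_solr_query_url(term, base_url=DEFAULT_SELECT_URL):
    """Build a select URL that searches `term` in five fields and facets on user."""
    # One field:term clause per field, joined with OR.
    q = " OR ".join(f"{field}:{term}" for field in SEARCH_FIELDS)
    params = [
        ("q", q),
        ("facet", "true"),        # turn faceting on
        ("facet.field", "user"),  # count matches per user
        ("debugQuery", "true"),   # ask Solr to include the parsed query
        ("rows", "10"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_solr_query_url("the"))
```

Fetching that URL a few times in a row and comparing the reported query times is a quick way to see the difference between a cold query and a cached one.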
The following response may interest you:
This is a search for "a" which I had run several times, so Solr was serving it from its cache, and this time it only took 6 milliseconds. It also shows what a facet can do. The longest time I got for the "a" search was 15 milliseconds, before the query was in the cache.
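If you want to pull the query time and facet counts out of a response programmatically, something like the following sketch might help. It assumes the response was requested in Solr's JSON format with the default flat facet layout (term, count, term, count, ...). The sample document in it is synthetic and made up purely for illustration; it is not the response from my server, and the function name is hypothetical.

```python
import json


def summarize_response(raw_json):
    """Return (QTime in milliseconds, {facet term: count}) from a Solr JSON response."""
    data = json.loads(raw_json)
    qtime_ms = data["responseHeader"]["QTime"]
    # Default JSON facet layout is a flat list: [term, count, term, count, ...]
    flat = data.get("facet_counts", {}).get("facet_fields", {}).get("user", [])
    facets = dict(zip(flat[0::2], flat[1::2]))
    return qtime_ms, facets


# Synthetic sample, NOT real data from any server.
SAMPLE = """
{
  "responseHeader": {"status": 0, "QTime": 12},
  "response": {"numFound": 3, "start": 0, "docs": []},
  "facet_counts": {"facet_fields": {"user": ["user_a", 2, "user_b", 1]}}
}
"""

if __name__ == "__main__":
    qtime, facets = summarize_response(SAMPLE)
    print(qtime, facets)
```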
Thanks, Shawn