On Dec 11, 2010, at 4:42 AM, Stan Hoeppner wrote:
Tony Pyro put forth on 12/10/2010 4:29 PM:
Hello,
New subscriber here. I noticed that the FTS index is not used in compound searches. Is this expected? Tested in 2.0.0 and 2.0.8:
. search BODY "waldo"
* SEARCH
. OK Search completed (0.000 secs).
. SEARCH CHARSET UTF-8 OR SUBJECT "waldo" FROM "waldo"
* SEARCH
. OK Search completed (1.768 secs).
. SEARCH CHARSET UTF-8 OR SUBJECT "waldo" BODY "waldo"
* OK Searched 0% of the mailbox, ETA 9605:25
* OK Searched 4% of the mailbox, ETA 6:39
* OK Searched 6% of the mailbox, ETA 6:58
* OK Searched 8% of the mailbox, ETA 6:54
It's a problem for us because the Afterlogic webmail client does not offer a body-only search. The two search options are From + To + Subject, or "entire messages", which puts together a large OR query:
SRCH1069 SEARCH CHARSET UTF-8 OR (OR (OR FROM "waldo" TO "waldo") SUBJECT "waldo") BODY "waldo"
I also checked whether the header fields are included in the FTS index, but they do not appear to be: I got more results from the search "TO gmail.com" than from "BODY gmail.com".
plugin {
  fts = squat
}
protocol imap {
  mail_plugins = " fts fts_squat"
}
Was the above performed with a cold or hot Squat index? If cold, performance will always suck. That's the big downside of Squat. The indexes must be hot. Unless Timo fixed this in 2.0.x and I missed seeing the announcement.
My Dovecot hardware is absolutely ancient, a dual 500 MHz machine. Cold Squat searches on a 15K-message mbox mailbox take upwards of 1.5 minutes. With the index hot, any search on that same folder takes a fraction of a second.
I'm guessing your solution, which has been mentioned on list before, is to write a basic search script and run it nightly, twice a day, or more often, depending on your needs, to keep the index hot. Whenever Squat has to rebuild the index, the initial search takes forever, often making it slower than not using Squat at all.
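A minimal sketch of such a warm-up script, using Python's standard imaplib. The function name warm_fts_index, the search term, and the host and credentials in the usage comment are all hypothetical; the only assumption about the server is that a plain BODY search is enough to make Squat build or refresh its index for that mailbox.

```python
import imaplib


def warm_fts_index(conn, mailboxes, term="warmup"):
    """Run one BODY search per mailbox so the FTS index gets built or refreshed.

    `conn` is an already authenticated imaplib connection (or any object
    with compatible select() and search() methods). Returns a dict mapping
    each mailbox name to True if the search completed with OK.
    """
    results = {}
    for name in mailboxes:
        # Open read-only so the warm-up never changes message flags.
        typ, _ = conn.select(name, readonly=True)
        if typ != "OK":
            results[name] = False
            continue
        # The hits are irrelevant; the search itself is what touches the index.
        typ, _ = conn.search(None, "BODY", '"%s"' % term)
        results[name] = (typ == "OK")
    return results


# Example use (hypothetical host and credentials):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user", "password")
#   print(warm_fts_index(conn, ["INBOX"]))
#   conn.logout()
```

Something like this could be run from cron at whatever interval suits the site. Mailbox names containing spaces would need to be quoted before being passed to select().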
-- Stan
I had just built the index minutes before. It's not that FTS is slow overall. Look carefully at the three example searches -- the compound search using FTS plus a header field takes inordinately longer than either component search alone. It's as if the FTS is only used in the simplest of searches.
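For anyone who wants to reproduce the comparison, here is a rough timing sketch with Python's imaplib. The names QUERIES and time_searches are hypothetical, and it assumes an already logged-in connection with a mailbox selected; the criteria mirror the searches quoted above.

```python
import time

# The searches from this thread, as imaplib criteria tuples.
QUERIES = {
    "body only": ("BODY", '"waldo"'),
    "headers only": ("OR", "SUBJECT", '"waldo"', "FROM", '"waldo"'),
    "header or body": ("OR", "SUBJECT", '"waldo"', "BODY", '"waldo"'),
    "two bodies": ("OR", "BODY", '"waldo"', "BODY", '"carmen"'),
}


def time_searches(conn, queries=QUERIES, charset="UTF-8"):
    """Run each search once; return {label: (seconds, number_of_hits)}."""
    timings = {}
    for label, criteria in queries.items():
        start = time.perf_counter()
        # imaplib sends this as: SEARCH CHARSET <charset> <criteria...>
        typ, data = conn.search(charset, *criteria)
        elapsed = time.perf_counter() - start
        # data[0] is a space-separated list of message numbers (bytes).
        hits = len(data[0].split()) if typ == "OK" and data and data[0] else 0
        timings[label] = (elapsed, hits)
    return timings
```

If the index were being used for the compound queries, the "header or body" time should be close to the sum of the two component times rather than minutes longer.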
Incidentally, the same thing happens when combining two BODY searches (search OR BODY "waldo" BODY "carmen"). Repeating the search doesn't improve its performance.
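One possible client-side workaround, sketched here purely as an assumption and not something Dovecot or Afterlogic provides: since each component search is fast on its own, an OR can be emulated by running the parts separately and merging the hits. The helper name search_union is hypothetical.

```python
def search_union(conn, subqueries, charset="UTF-8"):
    """Emulate OR by running each subquery separately and merging the hits.

    `subqueries` is a list of imaplib criteria tuples, for example
    [("BODY", '"waldo"'), ("SUBJECT", '"waldo"')].
    Returns a sorted list of message sequence numbers.
    """
    hits = set()
    for criteria in subqueries:
        typ, data = conn.search(charset, *criteria)
        if typ != "OK":
            raise RuntimeError("SEARCH failed for %r" % (criteria,))
        if data and data[0]:
            hits.update(int(n) for n in data[0].split())
    return sorted(hits)
```

Sequence numbers are only comparable within one session, so all subqueries have to go over the same connection with the same mailbox selected.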