Solr FTS - when does indexing happen?
I have Solr FTS on my dovecot install. I followed the instructions on the dovecot wiki.
How long a delay should I expect between new mail being delivered by the dovecot LDA and an indexing request being sent to Solr? I ask because I get a LOT of email from various mailing lists, yet I do not see any indexing activity in Solr's log. When I ran doveadm index -A -q '*', there was a lot of indexing activity in Solr's log, as expected.
One time I looked at the Solr index and it had been 23 hours since its last update ... I can guarantee that I received a lot of new messages in that time.
What do I need to look at for further troubleshooting?
I can confirm that when I issue a search in the TypeApp app on my phone (an IMAP app for Android), I see the query in Solr's logfile.
Thanks, Shawn
On 2021-09-03 12:43 PM, Shawn Heisey wrote:
> How long a delay should I expect to see between new mail being delivered with the dovecot LDA and an indexing request sent to Solr?
DISCLAIMER: I've only set up Solr once with dovecot, so take these words with a grain of salt.
As I recall, indexing is triggered immediately when an email is received, provided your dovecot settings are set to trigger it. The dovecot documentation for FTS spells it out.
See https://doc.dovecot.org/configuration_manual/fts/solr/?highlight=fts%20user%...
There is an autoindex setting that needs to be set to "yes".
On 9/4/2021 4:06 PM, Steve Dondley wrote:
> There is an autoindex setting that needs to be set to "yes".
I see something talking about autoindex, but it does not have an example so that I can see where it needs to go. I cannot work it out from what is there.
With a little googling, I was able to figure out where it needs to go. And now it acts like I was expecting.
Since most people will want fts_autoindex, the wiki page should include it in its example configuration that goes into 90-plugin.conf. Possibly better ... maybe it should default to "yes".
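For anyone who lands on this thread with the same problem, here is a minimal sketch of where the setting goes and a quick way to confirm it is active. It assumes a Dovecot 2.3-style plugin block; the Solr URL is a placeholder and check_autoindex is a hypothetical helper, neither comes from the thread. The fts and fts_solr plugins also need to be listed in mail_plugins, as the FTS documentation describes.

```shell
#!/bin/sh
# Illustrative plugin block (e.g. for 90-plugin.conf).
# The Solr URL is a placeholder, not taken from the thread.
sample_conf='plugin {
  fts = solr
  fts_autoindex = yes
  fts_solr = url=http://localhost:8983/solr/dovecot/
}'

# check_autoindex (hypothetical helper): reads Dovecot config text on stdin
# and reports whether fts_autoindex is set to yes.
check_autoindex() {
  if grep -Eq '^[[:space:]]*fts_autoindex[[:space:]]*=[[:space:]]*yes[[:space:]]*$'; then
    echo "fts_autoindex is enabled"
  else
    echo "fts_autoindex is not enabled"
  fi
}

# Check the sample block above.
printf '%s\n' "$sample_conf" | check_autoindex
```

On a live server the same helper can read the effective configuration instead of the sample: doveconf -n | check_autoindex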
Thanks, Shawn
On 9/4/2021 4:52 PM, Shawn Heisey wrote:
> With a little googling, I was able to figure out where it needs to go. And now it acts like I was expecting.
Deletes are an interesting thing with autoindex. If I use the "Del" key in Thunderbird (which moves the message to the Trash), I see an immediate delete (from the original folder) and add (to the Trash folder) in Solr's log. And if I choose the "Empty Trash" option, I see those deletes in Solr's log immediately.
But if I press Shift-Del in Thunderbird (which immediately deletes the message, bypassing Trash), then it takes about 15 seconds before the Solr log shows the delete request. Is that expected? It's not causing me any problems, as it's highly unlikely that I'm going to do a query matching a message that I deleted ten seconds ago. I can stand to wait 15 seconds for the index to be updated.
Dovecot version is 2:2.3.16-2+ubuntu20.04, pulled from the Dovecot repository.
I have been doing some fiddling with the solrconfig and schema. I have more fields stored now -- added from, to, and subject. Without those stored fields, I couldn't tell what the matching messages were when accessing Solr directly.
I also implemented TrimFieldUpdateProcessorFactory which trims leading and trailing whitespace from fields before they are indexed. I happened to notice that some of the new stored fields I added had EOL characters in them (not sure if it was \n or \r\n).
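A rough sketch of the kind of direct query that stored fields make useful. The core URL and search term are placeholders, solr_query_cmd is a hypothetical helper, and the field list assumes from, to, and subject are stored as described above (uid and box are assumed to be stored already by the stock Dovecot schema). The helper only prints the curl command, so nothing is sent to Solr until you run that command yourself.

```shell
#!/bin/sh
# solr_query_cmd (hypothetical helper): print a curl command that queries
# Solr directly and returns only the fields needed to identify a message.
solr_query_cmd() {
  base_url=$1   # assumed core URL, e.g. http://localhost:8983/solr/dovecot
  query=$2      # Solr query string, e.g. subject:invoice
  # -G sends the url-encoded parameters as a GET query string.
  printf "curl -sG '%s/select' --data-urlencode 'q=%s' --data-urlencode 'fl=uid,box,from,to,subject'\n" \
    "$base_url" "$query"
}

solr_query_cmd 'http://localhost:8983/solr/dovecot' 'subject:invoice'
```

Letting curl do the URL encoding via --data-urlencode avoids hand-escaping colons and spaces in the query.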
IMHO, a rather glaring omission from the fields in Solr is a timestamp/date field. Does dovecot's FTS have the ability to send that data? I know that Dovecot might not use it, but it would be a very useful thing to have for querying the dovecot index from something other than dovecot. Not something I *NEED*, just nice to have. I haven't looked at the fts or fts_solr code.
Thanks, Shawn
> Since most people will want fts_autoindex, the wiki page should include it in its example configuration that goes into 90-plugin.conf. Possibly better ... maybe it should default to "yes".
It's a safe bet that the developers, who are experts on these systems, have good reason not to make autoindexing the default.
participants (2)
- Shawn Heisey
- Steve Dondley