On 14/04/2019 17:16, Peter Mogensen via dovecot wrote:
Sorry... I got distracted halfway and forgot to put a meaningful subject so the archive could figure out the thread. Resending.
On 4/14/19 4:04 PM, dovecot-request@dovecot.org wrote:
Solr ships with autoCommit set to 15 seconds and openSearcher set to false on the autoCommit. The autoSoftCommit setting is not enabled by default, but depending on how the index was created, Solr might try to set autoSoftCommit to 3 seconds ... which is WAY too short.

I just run with the default: 15s autoCommit and no autoSoftCommit.
This thread says that dovecot is sending explicit commits.

I see explicit /update requests with softCommit and waitSearcher=true in a tcpdump.
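For reference, the kind of request described above can be sketched as follows. This is only an illustration of the URL shape: the host, port, and the core name "dovecot" are assumptions, and the exact request dovecot sends may differ. The softCommit and waitSearcher parameters are the ones seen in the tcpdump.

```python
from urllib.parse import urlencode


def soft_commit_url(base_url: str, core: str) -> str:
    """Build an /update URL asking Solr for a soft commit that blocks
    until the new searcher is open (waitSearcher=true)."""
    params = urlencode({"softCommit": "true", "waitSearcher": "true"})
    return f"{base_url}/{core}/update?{params}"


# Assumed values: a local Solr on its default port and a core named "dovecot".
print(soft_commit_url("http://localhost:8983/solr", "dovecot"))
```

Because waitSearcher=true makes the client wait for the commit to become visible, a slow commit on the Solr side counts directly against the http request timeout.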
One thing that might be happening to exceed 60 seconds is an extremely long commit, which is usually caused by excessive cache autowarming, but might be related to insufficient memory. The max heap setting on an out-of-the-box Solr install (5.0 and later) is 512MB. That's VERY small, and it doesn't take much index data before a much larger heap is required.

I run with
SOLR_JAVA_MEM="-Xmx8g -Xms2g"
I looked into the code (version 2.3.5.1):

This is 2.2.35. I haven't checked the source difference to 2.3.x, I must admit.
I imagine that one of the reasons dovecot sends softCommits is that, without autoindex active and even if mailboxes are periodically indexed from cron, the last emails received will be indexed at the moment of the search.

I expect that dovecot has to, because its default behavior is to bring the index up to date only just before a search. So it has to wait for the index result to be available if any new mails have been indexed.
- a configurable batch size would make it possible to tune the number of emails per request and help stay under the 60 seconds hard-coded http request timeout. A configurable http timeout would be less useful, since this will potentially run into other timeouts on solr side.

Being able to configure it is great. But I don't think it solves much. I recompiled with 100 as batch size and it still ended in timeouts. Then I recompiled with a 10 min timeout and now I see all the batches completing, and their processing time is mostly between 1 and 2 minutes (so all would have failed).
To me it looks like Solr just takes too long to index. This is no small machine. It's a 20 core Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz and for this test it's not doing anything else, so I'm a bit surprised that even with only a few users this takes so long.
/Peter
Peter
I suppose you could go with a batch size of 50. If it's linear, you could still keep under the default 60 seconds http request time :-)
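A rough back-of-the-envelope sketch of that estimate, assuming processing time really does scale linearly with batch size (which is unverified). The function name is made up for illustration; the numbers are the ones reported above: batches of 100 taking up to about 2 minutes, against a 60 second timeout.

```python
def max_batch_size(observed_batch: int, observed_seconds: int, timeout_seconds: int) -> int:
    """Largest batch expected to finish within the timeout, assuming
    processing time grows linearly with the number of messages."""
    return (timeout_seconds * observed_batch) // observed_seconds


# Worst case reported: 100 messages in about 120 seconds, 60 second timeout.
print(max_batch_size(100, 120, 60))  # 50
```

Note that 50 lands exactly on the 60 second limit in the worst case, so it leaves no margin.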
I'm now testing with solr settings autoCommit 15 seconds, autoSoftCommit 60 seconds and sending no softCommits from dovecot and 500 batch size.
I've set up
/usr/local/bin/doveadm index -A "*"
in crontab every 5 minutes, so indexes will stay mostly up to date and the amount of mail not already visible in the index when searches are done is kept to a minimum.
The solr server is a small test virtual machine with 0.2 (shared) vCPU, 0.6MB of memory and non-SSD storage. It can index around 2000 emails per minute when there is no other activity. Average email size is about 45Kb. I'm not indexing attachments.
John