<!doctype html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
<div>
<br>
</div>
<blockquote type="cite">
<div>
On 14 April 2019 16:59 John Fawcett via dovecot <
<a href="mailto:dovecot@dovecot.org">dovecot@dovecot.org</a>> wrote:
</div>
<div>
<br>
</div>
<div>
<br>
</div>
<div>
On 13/04/2019 17:16, Shawn Heisey via dovecot wrote:
</div>
<blockquote type="cite">
<div>
On 4/13/2019 4:29 AM, John Fawcett via dovecot wrote:
</div>
<blockquote type="cite">
<div>
If this value was made configurable people could set it to what they
</div>
<div>
want. However the underlying problem is likely on solr configuration.
</div>
</blockquote>
</blockquote>
<blockquote type="cite">
<div>
The Jetty that is included in Solr has its idle timeout set to 50
</div>
<div>
seconds. But in practice, I have not seen this timeout trigger ...
</div>
<div>
and if the OP is seeing a 60 second timeout, then the 50 second idle
</div>
<div>
timeout in Jetty must not be occurring.
</div>
</blockquote>
<blockquote type="cite">
<div>
There may be a socket timeout configured on inter-server requests --
</div>
<div>
distributed queries or the load balancing that SolrCloud does. I can
</div>
<div>
never remember whether this is the case by default. I think it is.
</div>
</blockquote>
<div>
>> If there is an issue on initial indexing, where you are not really
</div>
<div>
>> concerned about quick visibility but just getting things into the index
</div>
<div>
>> efficiently, a better approach would be for dovecot plugin not to send
</div>
<div>
>> any commit or softCommit (or waitSearcher either) and that should speed
</div>
<div>
>> things up. You'd need to configure solr with a long autoSoftCommit
</div>
<div>
>> maxTime and a reasonable autoCommit maxTime, which you could then
</div>
<div>
>> reconfigure when the load was done.
</div>
<div>
>
</div>
<blockquote type="cite">
<div>
Solr ships with autoCommit set to 15 seconds and openSearcher set to
</div>
<div>
false on the autoCommit. The autoSoftCommit setting is not enabled by
</div>
<div>
default, but depending on how the index was created, Solr might try to
</div>
<div>
set autoSoftCommit to 3 seconds ... which is WAY too short.
</div>
</blockquote>
<blockquote type="cite">
<div>
I will usually increase the autoCommit time to 60 seconds, just to
</div>
<div>
reduce the amount of work that Solr is doing. The autoSoftCommit
</div>
<div>
time, if it is used, should be set to a reasonably long value ...
</div>
<div>
values between two and five minutes would be good. Attempting to use
</div>
<div>
a very short autoSoftCommit time will usually lead to problems.
</div>
</blockquote>
<blockquote type="cite">
<div>
This thread says that dovecot is sending explicit commits. One thing
</div>
<div>
that might be happening to exceed 60 seconds is an extremely long
</div>
<div>
commit, which is usually caused by excessive cache autowarming, but
</div>
<div>
might be related to insufficient memory. The max heap setting on an
</div>
<div>
out-of-the-box Solr install (5.0 and later) is 512MB. That's VERY
</div>
<div>
small, and it doesn't take much index data before a much larger heap
</div>
<div>
is required.
</div>
</blockquote>
<blockquote type="cite">
<div>
Thanks,
</div>
<div>
Shawn
</div>
</blockquote>
<div>
I looked into the code (version 2.3.5.1): the fts-solr plugin is not
</div>
<div>
sending softCommit every 1000 emails. Emails from a single folder are
</div>
<div>
batched in up to a maximum of 1000 emails per request, but the
</div>
<div>
softCommit gets sent once per mailbox folder at the end of all the
</div>
<div>
requests for that folder.
</div>
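<div>
As a rough illustration of the batching described above (not the plugin's actual C code), the request sequence for one folder could be sketched as follows. The function name, the tuple layout and the choice to skip the softCommit when nothing was sent are assumptions made for the example.
</div>

```python
def plan_folder_requests(uids, batch_size=1000, soft_commit=True):
    """Return the ordered requests for one mailbox folder.

    Hypothetical helper: each "add" request carries at most batch_size
    emails, and a single "softCommit" follows the last batch of the folder.
    """
    step = max(1, batch_size)  # guard against a zero or negative batch size
    requests = []
    for start in range(0, len(uids), step):
        requests.append(("add", uids[start:start + step]))
    # Assumption: no softCommit is sent when the folder produced no batches.
    if soft_commit and requests:
        requests.append(("softCommit", None))
    return requests


if __name__ == "__main__":
    for kind, payload in plan_folder_requests(list(range(2500))):
        print(kind, len(payload) if payload else "")
```

<div>
With 2500 emails and the default size this gives three add requests followed by one softCommit, so the commit cost is paid once per folder rather than once per batch.
</div>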
<div>
<br>
</div>
<div>
I imagine that one of the reasons dovecot sends softCommits is because
</div>
<div>
without autoindex active and even if mailboxes are periodically indexed
</div>
<div>
from cron, the last emails received will be indexed at the moment of the
</div>
<div>
search. So while sending softCommit has the advantage of including
</div>
<div>
recent mails in searches, it means that softCommits are being done upon
</div>
<div>
user request. Frequency depends on user activity.
</div>
<div>
<br>
</div>
<div>
Going back to the original problem: it seems the first advice to Peter is
</div>
<div>
to look into solr configuration as others have said.
</div>
<div>
<br>
</div>
<div>
From dovecot's point of view I can see the following as potentially useful
</div>
<div>
features:
</div>
<div>
<br>
</div>
<div>
1) a configurable batch size would make it possible to tune the number of emails
</div>
<div>
per request and help stay under the 60-second hard-coded HTTP request
</div>
<div>
timeout. A configurable HTTP timeout would be less useful, since this
</div>
<div>
will potentially run into other timeouts on solr side.
</div>
<div>
<br>
</div>
<div>
2) ability to turn off softCommits so as to have a more predictable
</div>
<div>
softCommit workload. In that case autoSoftCommit should be configured in
</div>
<div>
solr. In order to minimize risk of recent emails not appearing in search
</div>
<div>
results, periodic indexing could be set up by cron.
</div>
<div>
<br>
</div>
<div>
I've attached a patch; any comments are welcome (especially about
</div>
<div>
getting settings from the backend context).
</div>
<div>
<br>
</div>
<div>
Example config
</div>
<div>
<br>
</div>
<div>
plugin {
</div>
<div>
fts = solr
</div>
<div>
fts_solr =
</div>
<div>
url=
<a href="https://user:password@solr.example.com:443/solr/dovecot/" rel="noopener" target="_blank">https://user:password@solr.example.com:443/solr/dovecot/</a>
</div>
<div>
batch_size=500 no_soft_commit
</div>
<div>
}
</div>
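<div>
To make the shape of that setting concrete, here is a minimal sketch of how a space-separated fts_solr value like the one above could be interpreted. It is not the code from the patch: the function name is hypothetical, and the default of 1000 is assumed from the current maximum batch size mentioned earlier.
</div>

```python
def parse_fts_solr(value):
    """Parse a space-separated fts_solr setting string into a dict.

    Hypothetical helper: batch_size defaults to 1000 and soft commits
    stay enabled unless the bare no_soft_commit flag is present.
    """
    settings = {"url": None, "batch_size": 1000, "no_soft_commit": False}
    for token in value.split():
        # Split on the first "=" only, so a URL containing "=" stays intact.
        key, _, val = token.partition("=")
        if key == "url":
            settings["url"] = val
        elif key == "batch_size":
            settings["batch_size"] = int(val)
        elif key == "no_soft_commit":
            settings["no_soft_commit"] = True
        else:
            raise ValueError("unknown fts_solr setting: " + key)
    return settings


if __name__ == "__main__":
    example = ("url=https://user:password@solr.example.com:443/solr/dovecot/ "
               "batch_size=500 no_soft_commit")
    print(parse_fts_solr(example))
```

<div>
Unknown keys raise an error in this sketch so that a mistyped option does not silently fall back to the defaults.
</div>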
<div>
<br>
</div>
<div>
John
</div>
</blockquote>
<div>
<br>
</div>
<div>
Can you please open a pull request to https://github.com/dovecot/core ?
</div>
<div class="io-ox-signature">
<pre>---
Aki Tuomi</pre>
</div>
</body>
</html>