Solr connection timeout hardwired to 60s

Shawn Heisey elyograg at elyograg.org
Fri Apr 5 04:11:29 EEST 2019


On 4/4/2019 6:42 PM, M. Balridge via dovecot wrote:
> What is a general rule of thumb for RAM and SSD disk requirements as a
> fraction of indexed document hive size to keep query performance at 200ms or
> less? How do people deal with the JAVA GC world-stoppages, other than simply
> doubling or tripling every instance?

There's no hard and fast rule for exactly how much memory you need for a 
search engine.  Some installs work well with half the index cached, 
while others need more, and some get by with less.

For ideal performance, you should have enough memory over and above your 
program requirements to cache the entire index.  That can be problematic 
with indexes that are hundreds of gigabytes, or even terabytes. 
Achieving the ideal is rarely necessary, though.

With a large enough heap, it is simply impossible to avoid long 
stop-the-world GC pauses entirely.  With proper tuning, though, those 
full garbage collections can happen far less frequently.  I've got 
another page about that.

https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr
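
As a rough sketch of what that tuning can look like, here are G1-style 
settings in Solr's bin/solr.in.sh include file.  SOLR_HEAP and GC_TUNE 
are the standard variables from that file, but the specific flag values 
below are only illustrative -- see the page above for the reasoning:

    # SOLR_HEAP sets -Xms and -Xmx to the same value.
    SOLR_HEAP="8g"
    # Replace the default collector settings with tuned G1 options.
    GC_TUNE="-XX:+UseG1GC \
      -XX:+ParallelRefProcEnabled \
      -XX:G1HeapRegionSize=8m \
      -XX:MaxGCPauseMillis=250"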

To handle extremely large indexes with good performance, I would 
recommend many servers running SolrCloud, with the index sharded across 
them.  That way no individual server has to handle terabytes of data on 
its own.  This can get very expensive very quickly.  You will also need 
a load balancer, to eliminate single points of failure.
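
To make the sharding idea concrete, here is a minimal SolrJ sketch that 
queries a sharded collection through CloudSolrClient.  The ZooKeeper 
hosts and the collection name are made up for illustration:

    import java.util.List;
    import java.util.Optional;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ShardedQuery {
        public static void main(String[] args) throws Exception {
            // ZooKeeper ensemble that coordinates the cluster (hypothetical hosts).
            List<String> zkHosts = List.of("zk1:2181", "zk2:2181", "zk3:2181");
            try (CloudSolrClient client =
                     new CloudSolrClient.Builder(zkHosts, Optional.empty()).build()) {
                // The client reads the shard layout from ZooKeeper and spreads
                // requests across replicas, so the query below fans out to all
                // shards and merges the results.
                SolrQuery query = new SolrQuery("subject:invoice");
                QueryResponse rsp = client.query("mail_index", query);
                System.out.println("Hits: " + rsp.getResults().getNumFound());
            }
        }
    }

SolrJ clients built this way do their own load balancing across the 
cluster; the separate load balancer is for everything that talks to 
Solr over plain HTTP.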

> I am wondering how well alternatives to Solr work in these situations
> (ElasticSearch, Xapian, and any others I may have missed).

Assuming they are configured as similarly as possible, ElasticSearch and 
Solr will have nearly identical requirements and will perform similarly 
to each other.  They are both Lucene-based, and it is Lucene that 
primarily drives the hardware requirements.  I know nothing about any of 
the other solutions.

With the extremely large index you have described, memory will be your 
Achilles' heel no matter what solution you choose.

It is not Java that needs the extreme amounts of memory for very large 
indexes.  It is the operating system -- the disk cache.  You might also 
need a fairly large heap, but the on-disk size of the index will have 
less of an impact on heap requirements than the number of documents in 
the index.
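
As a back-of-the-envelope illustration (all numbers invented): if one 
server holds a 200GB shard and Solr runs with a 16GB heap, a machine 
with 128GB of RAM leaves roughly 112GB for the OS disk cache -- enough 
to cache a bit over half the shard.  Whether that is sufficient can 
only be determined by testing with your own data and queries.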

Thanks,
Shawn

