Dovecot - FTS Solr: disk usage & position information?

Vincent Brillault vincent.brillault at cern.ch
Wed Sep 1 12:27:11 EEST 2021


Dear all,

Just a status update, in case this can help others.

We went ahead and disabled position information indexing, then
re-indexed our mail data (over a couple of days to avoid overloading
the systems). Before the re-indexing we had 1.33 TiB in our Solr
indexes. After re-indexing, we had only 542 GiB: that's a 60% reduction
in the storage required for our FTS indexes :)
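
As a quick sanity check of the figures above, here is a minimal Python
sketch of the arithmetic. The function name is only illustrative; the
two sizes are the ones quoted in this message, with 1 TiB taken as
1024 GiB.

```python
def reduction_percent(before_gib: float, after_gib: float) -> float:
    """Return the relative size reduction, in percent."""
    return (1.0 - after_gib / before_gib) * 100.0


# Sizes quoted in the message: 1.33 TiB before, 542 GiB after.
BEFORE_GIB = 1.33 * 1024  # convert TiB to GiB
AFTER_GIB = 542.0

if __name__ == "__main__":
    pct = reduction_percent(BEFORE_GIB, AFTER_GIB)
    print(f"index size reduced by about {pct:.1f}%")
```

The result rounds to 60%, i.e. the new indexes take roughly 40% of the
original space.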

So far, our users have not reported any issues or measurable differences
in the quality of the FTS. From further debugging, as
discussed on the solr-user mailing list 
(https://lists.apache.org/thread.html/rcdf8bb97be0839e57928ad5fa34501ec8a73392c11248db91206bc33%40%3Cusers.solr.apache.org%3E), 
I've come to the conclusion that, with the current integration between 
Dovecot and Solr (especially the fact that `"` is escaped), it's impossible to
trigger phrase queries from user queries as long as 
autoGeneratePhraseQueries is false.
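
To illustrate why escaping `"` rules out phrase queries, here is a
small Python sketch. It is not Dovecot's code: the function names and
the exact set of escaped characters are assumptions made for this
example. It only shows that once every double quote in the user's
input is backslash-escaped, no unescaped quote is left for the Solr
query parser to treat as a phrase delimiter.

```python
# Assumption: characters that get a backslash prefix before the text is
# sent to Solr. The real set used by Dovecot may differ.
ASSUMED_SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/ ')


def escape_for_solr(text: str) -> str:
    """Hypothetical escaper: backslash-prefix every special character."""
    return "".join("\\" + ch if ch in ASSUMED_SPECIAL_CHARS else ch
                   for ch in text)


def has_phrase_delimiter(query: str) -> bool:
    """Return True if the query contains a double quote that is not
    preceded by a backslash escape."""
    i = 0
    while i < len(query):
        if query[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if query[i] == '"':
            return True
        i += 1
    return False


if __name__ == "__main__":
    user_input = '"quarterly report"'
    escaped = escape_for_solr(user_input)
    print(escaped)
    print(has_phrase_delimiter(user_input))  # raw input has real quotes
    print(has_phrase_delimiter(escaped))     # escaped input has none
```

Under these assumptions, the only remaining way to get a phrase query
would be for Solr to build one itself, which is what
autoGeneratePhraseQueries controls.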

I've attached the schema.xml and solrconfig.xml we are now using with
Solr 8.6.0, in case they are of interest to others. Let me know if
you would prefer an MR to update the XML files in
https://github.com/dovecot/core/tree/master/doc.

Cheers,
Vincent
-------------- next part --------------
A non-text attachment was scrubbed...
Name: solrconfig.xml
Type: text/xml
Size: 2856 bytes
Desc: not available
URL: <https://dovecot.org/pipermail/dovecot/attachments/20210901/cd1b4a30/attachment.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: schema.xml
Type: text/xml
Size: 3478 bytes
Desc: not available
URL: <https://dovecot.org/pipermail/dovecot/attachments/20210901/cd1b4a30/attachment-0001.xml>