<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 9pt; font-family: Verdana,Geneva,sans-serif'>
<p>Hi</p>
<p>This is a summary of my work with Solr and Dovecot, in my <strong>quest to reproduce the previously excellent results of fts_squat</strong>.</p>
<p><br /></p>
<p>@Aki: Given the time I have spent on this, I would love to see the Wiki updated with these improvements, with my name added somewhere.</p>
<p>@All : Hope it helps</p>
<p><br /></p>
<p><strong>- Installation:</strong></p>
<p>-> Create a clean install using the defaults (at least those of the Archlinux package), then run "sudo -u solr solr create -c dovecot". The config files are then in /opt/solr/server/solr/dovecot/conf and the data files in /opt/solr/server/solr/dovecot/data.</p>
<p>-> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:</p>
<p> * around line 313, change &lt;openSearcher&gt;false&lt;/openSearcher&gt; to &lt;openSearcher&gt;true&lt;/openSearcher&gt;</p>
<p> * around line 147, set &lt;writeLockTimeout&gt;2000&lt;/writeLockTimeout&gt; (or above)</p>
<p> * around line 1127, before &lt;updateProcessor class="solr.UUIDUpdateProcessorFactory" name="uuid"/&gt;, add &lt;schemaFactory class="ClassicIndexSchemaFactory"&gt;&lt;/schemaFactory&gt;</p>
<p> * around line 1161, delete the whole &lt;updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields"&gt; element</p>
<p> * around line 1192, remove the whole &lt;updateRequestProcessorChain name="add-unknown-fields-to-the-schema" ... /&gt; element</p>
<p>-> Remove /opt/solr/server/solr/dovecot/conf/managed-schema</p>
<p>-> Replace "schema.xml" with the one below to reproduce the fts_squat behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note: a huge amount of trouble to replace a single-line setup, anyway...)</p>
<p>-> Move /opt/solr/server/solr (or the data subfolder) to a partition with plenty of *space*, ideally on ext4 or a faster file system (it looks like Solr is not considering a simple MySQL database as a backend, which would make sense to avoid all the fuss and let it move to a non-Java state, but that is another story)</p>
<p>-> Config of dovecot.conf is as below</p>
<p>-> The systemd unit should set high limits for open files and processes (see below)</p>
<p>-> Increase the memory available to the Java VM (I use 12 GB as my server has plenty of RAM, but adapt it to your specs): in /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"</p>
<p>-> As Solr complains a lot, you may want to filter it in syslog-ng or journald, since it heavily pollutes your logs</p>
<p>-> (Re)start solr (first) and then dovecot with systemctl</p>
<p>-> Launch the reindex ( doveadm fts rescan -u &lt;username&gt; )</p>
<p>-> Wait a good while to let the system re-index all your mailboxes</p>
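<p>To double-check the solrconfig.xml edits listed above before restarting Solr, a small script along these lines may help. It is only a sketch: the function name check_solrconfig is made up for this example, and it assumes the file parses as plain XML with Python's standard library.</p>
<pre>
```python
import xml.etree.ElementTree as ET

# Class and chain names are taken from the installation steps above.
CLASSIC_FACTORY = "ClassicIndexSchemaFactory"
ADD_FIELDS = "solr.AddSchemaFieldsUpdateProcessorFactory"
UNKNOWN_FIELDS_CHAIN = "add-unknown-fields-to-the-schema"


def check_solrconfig(root):
    """Return a list of problems found in a parsed solrconfig.xml (empty if none)."""
    problems = []

    # Every openSearcher element should read "true" after the edit.
    for node in root.iter("openSearcher"):
        if (node.text or "").strip() != "true":
            problems.append("openSearcher is not true")

    # The classic schema factory is what makes Solr read schema.xml.
    factories = [node.get("class") for node in root.iter("schemaFactory")]
    if CLASSIC_FACTORY not in factories:
        problems.append("schemaFactory ClassicIndexSchemaFactory is missing")

    # The schemaless update processor should have been deleted.
    for node in root.iter("updateProcessor"):
        if node.get("class") == ADD_FIELDS:
            problems.append("AddSchemaFieldsUpdateProcessorFactory still present")

    # The chain that used it should have been removed as well.
    for node in root.iter("updateRequestProcessorChain"):
        if node.get("name") == UNKNOWN_FIELDS_CHAIN:
            problems.append("add-unknown-fields-to-the-schema chain still present")

    return problems
```
</pre>
<p>Possible usage: check_solrconfig(ET.parse("/opt/solr/server/solr/dovecot/conf/solrconfig.xml").getroot()) should return an empty list once all four edits are in place.</p>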
<p><br /></p>
<p><strong>- Bugs so far</strong></p>
<p>-> Line 620 of the fts_solr dovecot plugin: the size of the header is improperly calculated ("huge header" warning for a simple email, which kills the indexing of that email, so basically MOST emails, as the calculation is wrong)</p>
<p>-> The UID returned by Solr has to be treated as a STRING (and that may be the source of the "out of bound" errors in fts_solr dovecot, as "long" is not enough)</p>
<p>-> Java errors: a lot of them make no sense to me, as I am no Java expert. But with increased memory it does not seem to crash, even if it still complains quite a lot in the logs</p>
<p><br /></p>
<p><br /></p>
<p><br /></p>
<p><strong>-------SCHEMA.XML in /opt/solr/server/solr/dovecot/conf</strong></p>
<p>&lt;?xml version="1.0" encoding="UTF-8"?&gt;<br />&lt;schema name="dovecot" version="2.0"&gt;<br />&lt;uniqueKey&gt;id&lt;/uniqueKey&gt;<br />&lt;fieldType name="dovecottext" class="solr.TextField" autoGeneratePhraseQueries="true" positionIncrementGap="100"&gt;<br />&lt;analyzer type="index"&gt;<br />&lt;tokenizer class="solr.ClassicTokenizerFactory"/&gt;<br />&lt;filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/&gt;<br />&lt;filter class="solr.FlattenGraphFilterFactory"/&gt;<br />&lt;filter class="solr.LowerCaseFilterFactory"/&gt;<br />&lt;filter class="solr.TrimFilterFactory"/&gt;<br />&lt;filter class="solr.RemoveDuplicatesTokenFilterFactory"/&gt;<br />&lt;/analyzer&gt;<br />&lt;analyzer type="query"&gt;<br />&lt;tokenizer class="solr.KeywordTokenizerFactory"/&gt;<br />&lt;filter class="solr.LowerCaseFilterFactory"/&gt;<br />&lt;filter class="solr.TrimFilterFactory"/&gt;<br />&lt;filter class="solr.RemoveDuplicatesTokenFilterFactory"/&gt;<br />&lt;/analyzer&gt;<br />&lt;/fieldType&gt;<br />&lt;fieldType name="dovecotfield" class="solr.TextField" autoGeneratePhraseQueries="true"&gt;<br />&lt;analyzer type="index"&gt;<br />&lt;tokenizer class="solr.ClassicTokenizerFactory"/&gt;<br />&lt;filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/&gt;<br />&lt;filter class="solr.TrimFilterFactory"/&gt;<br />&lt;filter class="solr.LowerCaseFilterFactory"/&gt;<br />&lt;filter class="solr.RemoveDuplicatesTokenFilterFactory"/&gt;<br />&lt;/analyzer&gt;<br />&lt;analyzer type="query"&gt;<br />&lt;tokenizer class="solr.KeywordTokenizerFactory"/&gt;<br />&lt;filter class="solr.LowerCaseFilterFactory"/&gt;<br />&lt;filter class="solr.TrimFilterFactory"/&gt;<br />&lt;filter class="solr.RemoveDuplicatesTokenFilterFactory"/&gt;<br />&lt;/analyzer&gt;<br />&lt;/fieldType&gt;</p>
<p>&lt;fieldType name="string" class="solr.StrField"/&gt;<br />&lt;field name="_version_" type="string" indexed="true" stored="true"/&gt;<br />&lt;field name="bcc" type="string" indexed="false" stored="false"/&gt;<br />&lt;field name="body" type="dovecottext" indexed="true" stored="false"/&gt;<br />&lt;field name="box" type="string" indexed="true" required="true" stored="true"/&gt;<br />&lt;field name="cc" type="dovecotfield" indexed="true" stored="false"/&gt;<br />&lt;field name="from" type="dovecotfield" indexed="true" stored="false"/&gt;<br />&lt;field name="hdr" type="string" indexed="false" stored="false"/&gt;<br />&lt;field name="id" type="string" indexed="true" required="true" stored="true"/&gt;<br />&lt;field name="subject" type="dovecottext" indexed="true" stored="false"/&gt;<br />&lt;field name="to" type="dovecotfield" indexed="true" stored="false"/&gt;<br />&lt;field name="uid" type="string" indexed="true" required="true" stored="true"/&gt;<br />&lt;field name="user" type="string" indexed="true" required="true" stored="true"/&gt;<br />&lt;/schema&gt;</p>
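<p>To get a feel for what the "dovecotfield" type indexes with minGramSize="3" and maxGramSize="25", here is a rough Python imitation of the n-gram step. It is a simplified illustration, not Solr's code: the function name ngrams is made up, tokenizing and duplicate removal are left out, and the order of the grams is not meant to match Lucene's.</p>
<pre>
```python
def ngrams(token, min_size=3, max_size=25):
    """Return every substring of token with a length between min_size and max_size."""
    token = token.lower()  # the schema lowercases as well (after the n-gram filter)
    grams = []
    for start in range(len(token)):
        # Longest gram that still fits between this start position and the end.
        longest = min(max_size, len(token) - start)
        for size in range(min_size, longest + 1):
            grams.append(token[start:start + size])
    return grams


print(ngrams("Solr"))          # ['sol', 'solr', 'olr']
print(len(ngrams("Dovecot")))  # 15
```
</pre>
<p>A 7-letter word already produces 15 index terms, which probably goes a long way toward explaining why the index needs so much disk space. On the query side the schema keeps the search string whole (KeywordTokenizerFactory), so a query such as "vec" matches because that exact gram was stored at index time.</p>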
<p><br /></p>
<p><strong>-- DOVECOT.CONF</strong></p>
<p>mail_plugins = fts fts_solr</p>
<p>plugin {<br />plugin = fts fts_solr managesieve sieve</p>
<p>fts = solr<br />fts_autoindex = yes<br />fts_enforced = yes<br />fts_solr = url=http://127.0.0.1:8983/solr/dovecot/</p>
<p>(replace 127.0.0.1 with the address of your Solr server if you want to use an external server)<br />(...)</p>
<p>}</p>
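<p>Once indexing has run, the core can be queried by hand to see whether anything landed in it. The sketch below only builds a URL for Solr's standard /select handler from the fts_solr url setting; the helper name build_select_url is made up, the query is a manual test rather than what Dovecot itself sends, and the value stored in the "user" field is an assumption to check against your own index.</p>
<pre>
```python
from urllib.parse import urlencode


def build_select_url(base_url, field, term, user, rows=5):
    """Build a Solr /select URL for a quick manual look at the index."""
    params = {
        "q": f"{field}:{term}",  # e.g. from:example
        "fq": f"user:{user}",    # restrict results to one user (assumed field value)
        "fl": "uid,box",         # both are stored fields in the schema
        "rows": str(rows),
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


print(build_select_url("http://127.0.0.1:8983/solr/dovecot/", "from", "example", "alice"))
```
</pre>
<p>The printed URL can be opened with curl or a browser on the Solr host.</p>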
<p><br /></p>
<p><br /></p>
<p><strong>-- /etc/systemd/system/multi-user.target.wants/solr.service</strong></p>
<p>[Unit]<br />Description=Solr full text search engine<br />After=network.target</p>
<p>[Service]<br />Type=simple<br />User=solr<br />Group=solr<br />PrivateTmp=yes<br />WorkingDirectory=/opt/solr<br /><strong>LimitNOFILE=65000</strong><br /><strong>LimitNPROC=65000</strong><br />ExecStart=/opt/solr/bin/solr start -f</p>
<p>[Install]<br />WantedBy=multi-user.target</p>
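<p>To confirm that the limits from the unit file reached the running Solr process, its /proc/PID/limits file can be read on Linux. The parser below is a sketch that assumes the usual column layout of that file (columns separated by two or more spaces); the function name parse_limits is made up, and the PID has to be looked up separately.</p>
<pre>
```python
import re


def parse_limits(text):
    """Parse the contents of /proc/PID/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are padded with runs of spaces; names contain single spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) not in (3, 4):    # name, soft, hard and an optional unit
            continue
        limits[parts[0]] = (parts[1], parts[2])
    return limits
```
</pre>
<p>With the unit above, parse_limits(open("/proc/PID/limits").read()) would be expected to report ("65000", "65000") for both "Max open files" and "Max processes", where PID stands for the process ID of Solr.</p>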
<p><br /></p>
</body></html>