Solr - complete setup
Aki Tuomi
aki.tuomi at open-xchange.com
Fri Jan 4 11:19:26 EET 2019
We'll go through all your updates and try to update the wiki accordingly. Your effort is really appreciated.
Aki
> On 04 January 2019 at 02:38 Joan Moreau via dovecot <dovecot at dovecot.org> wrote:
>
>
> Hi
>
> This is the summary of my work with Solr and Dovecot, in my quest to
> reproduce the previously excellent work of fts_squat
>
> @Aki : Based on the time I have spent on this, I would love to see you
> update the wiki with these improvements and add my name somewhere
>
> @All : Hope it helps
>
> - INSTALLATION:
>
> -> Create a clean install using the defaults (at least with the Arch
> Linux package), then run "sudo -u solr solr create -c dovecot". The
> config files are then in /opt/solr/server/solr/dovecot/conf and the data
> files in /opt/solr/server/solr/dovecot/data
>
> -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
>
> * around line 313, change <openSearcher>false</openSearcher> to
> <openSearcher>true</openSearcher>
>
> * around line 147, set <writeLockTimeout>2000</writeLockTimeout>
> (or above)
>
> * around line 1127, before <updateProcessor
> class="solr.UUIDUpdateProcessorFactory" name="uuid"/>, add
> <schemaFactory class="ClassicIndexSchemaFactory"></schemaFactory>
>
> * around line 1161, delete the whole <updateProcessor
> class="solr.AddSchemaFieldsUpdateProcessorFactory"
> name="add-schema-fields">
>
> * around line 1192, remove the whole <updateRequestProcessorChain
> name="add-unknown-fields-to-the-schema" ... />
>
> -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
>
> -> Replace "schema.xml" with the one below to reproduce the fts_squat
> behavior (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf)
> (note: a lot of trouble to replace a single-line setup, anyway...)
>
> -> Move /opt/solr/server/solr (or just the data subfolder) to a
> partition with plenty of *space*, ideally on ext4 or a faster file
> system (it looks like Solr is not considering using a simple MySQL
> database, which would make sense to avoid all the fuss and let it move
> to a non-Java state, but that is another story)
>
> -> Configure dovecot.conf as shown below
>
> -> The systemd unit should set high limits for open files and processes
> (see below)
>
> -> Increase the memory available to the Java VM (I set 12 GB as I have
> plenty of memory on my server, but adapt it to your specs): in
> /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
>
> -> As Solr complains a lot, you may want to filter its messages in
> syslog-ng or journald, as it greatly pollutes your audit files
>
> -> (Re)start Solr (first) and then Dovecot with systemctl
>
> -> Launch the reindex ( doveadm fts rescan -u <username> )
>
> -> Wait a good while to let the system re-index all your mailboxes
>
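The last two installation steps can be sketched as a small Python helper. This is only a sketch under assumptions: `doveadm` is on the PATH, the helper names are made up for illustration, and the second command reflects the understanding that "doveadm fts rescan" mainly marks the index as needing a rebuild while "doveadm index" performs the actual indexing. If that does not match your Dovecot version, drop the second command.

```python
import subprocess


def reindex_commands(user: str) -> list[list[str]]:
    """Build the doveadm command lines used to rebuild one user's FTS index.

    Hypothetical helper: the two-step sequence is an assumption about how
    rescan and index interact, not something stated in the original post.
    """
    return [
        # Compare the FTS index with the mailboxes and mark what is missing.
        ["doveadm", "fts", "rescan", "-u", user],
        # Index all mailboxes of the user ("*" is the mailbox mask).
        ["doveadm", "index", "-u", user, "*"],
    ]


def reindex(user: str) -> None:
    """Run the commands in order and stop at the first failure."""
    for cmd in reindex_commands(user):
        subprocess.run(cmd, check=True)
```

Calling reindex("someuser") blocks until indexing has finished, which can take a long time on large mailboxes.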
> - BUGS SO FAR
>
> -> Line 620 of the fts_solr Dovecot plugin: the size of the header is
> calculated incorrectly (a "huge header" warning is raised for a simple
> email, which kills the indexing of that email; as the calculation is
> wrong, this basically affects MOST emails)
>
> -> The UID returned by Solr should be treated as a STRING (and that is
> maybe the source of the "out of bound" errors in the fts_solr Dovecot
> plugin, as "long" is not enough)
>
> -> Java errors: a lot of them make no sense to me, as I am no Java
> expert. But with increased memory Solr does not seem to crash, even if
> it complains quite a lot in the logs
>
> ------- schema.xml in /opt/solr/server/solr/dovecot/conf
>
> <?xml version="1.0" encoding="UTF-8"?>
> <schema name="dovecot" version="2.0">
>   <uniqueKey>id</uniqueKey>
>   <fieldType name="dovecottext" class="solr.TextField"
>       autoGeneratePhraseQueries="true" positionIncrementGap="100">
>     <analyzer type="index">
>       <tokenizer class="solr.ClassicTokenizerFactory"/>
>       <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1"
>           generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1"
>           splitOnNumerics="1" catenateAll="1" catenateWords="1"
>           preserveOriginal="1"/>
>       <filter class="solr.FlattenGraphFilterFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>   </fieldType>
>   <fieldType name="dovecotfield" class="solr.TextField"
>       autoGeneratePhraseQueries="true">
>     <analyzer type="index">
>       <tokenizer class="solr.ClassicTokenizerFactory"/>
>       <filter class="solr.NGramFilterFactory" minGramSize="3"
>           maxGramSize="25"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
>   <fieldType name="string" class="solr.StrField"/>
>   <field name="_version_" type="string" indexed="true" stored="true"/>
>   <field name="bcc" type="string" indexed="false" stored="false"/>
>   <field name="body" type="dovecottext" indexed="true" stored="false"/>
>   <field name="box" type="string" indexed="true" required="true"
>       stored="true"/>
>   <field name="cc" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="from" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="hdr" type="string" indexed="false" stored="false"/>
>   <field name="id" type="string" indexed="true" required="true"
>       stored="true"/>
>   <field name="subject" type="dovecottext" indexed="true" stored="false"/>
>   <field name="to" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="uid" type="string" indexed="true" required="true"
>       stored="true"/>
>   <field name="user" type="string" indexed="true" required="true"
>       stored="true"/>
> </schema>
>
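To illustrate what the NGramFilterFactory setting (minGramSize="3", maxGramSize="25") in the "dovecotfield" type does at index time, here is a rough Python approximation. It is not Solr's implementation, the function name is invented for the example, and the order of the grams may differ from what Solr emits. Note that in the schema above only "cc", "from" and "to" use this type; "body" and "subject" use "dovecottext", which has no n-gram filter.

```python
def ngrams(token: str, min_size: int = 3, max_size: int = 25) -> list[str]:
    """Return every lowercased substring of `token` with a length between
    min_size and max_size (inclusive), roughly what an n-gram filter
    followed by a lowercase filter would put into the index."""
    token = token.lower()
    grams = []
    # Gram sizes cannot exceed the token length.
    for size in range(min_size, min(max_size, len(token)) + 1):
        # Slide a window of the current size over the token.
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams
```

Because the query analyzer keeps the search term as one lowercased token, a search for "vec" can match an address containing "Dovecot": "vec" is one of the indexed grams. Terms shorter than 3 characters produce no grams in this sketch.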
> -- dovecot.conf
>
> mail_plugins = fts fts_solr
>
> plugin {
> plugin = fts fts_solr managesieve sieve
>
> fts = solr
> fts_autoindex = yes
> fts_enforced = yes
> fts_solr = url=http://127.0.0.1:8983/solr/dovecot/
>
> (replace 127.0.0.1 with the address of your Solr server if you want to
> use an external server)
> (...)
>
> }
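A quick way to check that the URL given in fts_solr is reachable and that documents are arriving is to ask Solr how many documents it holds for one user. The sketch below assumes the standard Solr "select" handler with JSON output and assumes that the "user" field of the schema holds the Dovecot username; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def solr_count_url(base_url: str, user: str) -> str:
    """Build a select URL that matches all documents of `user` but
    returns no rows, so only the hit count comes back."""
    base = base_url.rstrip("/")
    params = {"q": f'user:"{user}"', "rows": 0, "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


def count_indexed(base_url: str, user: str) -> int:
    """Fetch the URL and read the hit count from the JSON response.
    Needs a running Solr instance; raises on connection errors."""
    with urlopen(solr_count_url(base_url, user)) as resp:
        return json.load(resp)["response"]["numFound"]
```

For example, count_indexed("http://127.0.0.1:8983/solr/dovecot/", "someuser") should grow while the reindex is running.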
>
> -- /etc/systemd/system/multi-user.target.wants/solr.service
>
> [Unit]
> Description=Solr full text search engine
> After=network.target
>
> [Service]
> Type=simple
> User=solr
> Group=solr
> PrivateTmp=yes
> WorkingDirectory=/opt/solr
> LimitNOFILE=65000
> LimitNPROC=65000
> ExecStart=/opt/solr/bin/solr start -f
>
> [Install]
> WantedBy=multi-user.target
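To confirm that the open-file and process limits from the unit actually apply to the running Solr process, one can read /proc/<pid>/limits on Linux. The parser below is a minimal sketch that assumes the usual column layout of that file (fields separated by two or more spaces); the function name is invented for the example, and finding the PID (for instance with "systemctl show --property MainPID solr") is left to the reader.

```python
import re


def parse_limits(text: str) -> dict[str, tuple[str, str]]:
    """Parse the contents of /proc/<pid>/limits into
    {limit name: (soft limit, hard limit)}; values stay strings
    because they may be "unlimited"."""
    limits = {}
    # Skip the header line, then split each row on runs of 2+ spaces.
    for line in text.splitlines()[1:]:
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits
```

With the unit above, parse_limits(open("/proc/<pid>/limits").read()) is expected to report 65000 for both "Max open files" and "Max processes" once the service has been restarted.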