Solr - complete setup (update)

Stephan Bosch stephan at rename-it.nl
Wed Jan 30 01:33:20 EET 2019


(forgot to CC mailing list)

On 26/01/2019 at 20:07, Joan Moreau via dovecot wrote:
>>
>>
>> *- Bugs so far*
>>
>> -> Line 620 of the fts_solr dovecot plugin: the size of the header is
>> improperly calculated ("huge header" warning for a simple email,
>> which kills the index for that email, so this affects basically MOST
>> emails, as the calculation is wrong)
> *You can check that regularly in the dovecot log file. My guess is that
> the mix of Unicode is not properly handled here.*

Does this happen with specific messages? Do you have a sample message 
for me? I don't see how Unicode could cause this.

>>
>> -> The UID returned by Solr is to be considered as a STRING (and that
>> is maybe the source of the "out of bound" errors in
>> fts_solr dovecot, as "long" is not enough)
> *This is just highly visible in the Solr schema.xml. Switching it to
> "long" in schema.xml returns plenty of errors.*

I cannot reproduce this so far (see modified schema below). In a simple 
test I just get the desired results and no errors logged.

>>
>> -> Java errors: A lot of nonsense to me, as I am not an expert in Java.
>> But with increased memory it seems not to crash, even if it
>> complains quite a lot in the logs
>>
>> Can you elaborate on the errors you have seen so far? When do these 
>> happen? How can I reproduce them?
>>
> *Honestly, I have no clue what the problems are. I just increased the 
> memory of the JVM and the systems stopped crashing. Log files are huge 
> anyway.*

What errors do you see? I see only INFO entries in my 
/var/solr/logs/solr.log. Looks like Solr is pretty verbose by default 
(lots of INFO output), but there must be a way to reduce that.
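
One possible way to do that, as a minimal sketch only: assuming a Solr 7.4 or later installation, where logging is configured through Log4j 2 (typically in server/resources/log4j2.xml, but the path may differ on your system), raising the root logger level from INFO to WARN should drop the INFO entries. The appender names below are assumptions; keep whatever names your existing file already uses.

```xml
<!-- Fragment of log4j2.xml: only the <Loggers> section is shown.   -->
<!-- Assumption: the appenders are named "RollingFile" and "STDOUT"; -->
<!-- reuse the names already defined in your own file.               -->
<Loggers>
  <!-- Root level raised from "info" to "warn" to suppress INFO output -->
  <Root level="warn">
    <AppenderRef ref="RollingFile"/>
    <AppenderRef ref="STDOUT"/>
  </Root>
</Loggers>
```

Solr most likely needs a restart to pick up the change, unless the file is set up to be re-read periodically.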

Regards,

Stephan.


<?xml version="1.0" encoding="UTF-8"?>
<schema name="dovecot" version="2.0">
  <uniqueKey>id</uniqueKey>

  <fieldType name="long" class="solr.LongPointField" positionIncrementGap="0"/>

  <fieldType name="dovecottext" class="solr.TextField"
             autoGeneratePhraseQueries="true" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1"
              generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1"
              splitOnNumerics="1" catenateAll="1" catenateWords="1" preserveOriginal="1"/>
      <filter class="solr.FlattenGraphFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

  <fieldType name="dovecotfield" class="solr.TextField"
             autoGeneratePhraseQueries="true">
    <analyzer type="index">
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

  <fieldType name="string" class="solr.StrField"/>

  <field name="_version_" type="string" indexed="true" stored="true"/>
  <field name="bcc" type="string" indexed="false" stored="false"/>
  <field name="body" type="dovecottext" indexed="true" stored="false"/>
  <field name="box" type="string" indexed="true" required="true" stored="true"/>
  <field name="cc" type="dovecotfield" indexed="true" stored="false"/>
  <field name="from" type="dovecotfield" indexed="true" stored="false"/>
  <field name="hdr" type="string" indexed="false" stored="false"/>
  <field name="id" type="string" indexed="true" required="true" stored="true"/>
  <field name="subject" type="dovecottext" indexed="true" stored="false"/>
  <field name="to" type="dovecotfield" indexed="true" stored="false"/>
  <field name="uid" type="long" indexed="true" required="true" stored="true"/>
  <field name="user" type="string" indexed="true" required="true" stored="true"/>
</schema>