[Bug] FTS double escaping

azurit at pobox.sk azurit at pobox.sk
Thu Apr 6 15:06:23 EEST 2017


Quoting Aki Tuomi <aki.tuomi at dovecot.fi>:

> On 06.04.2017 14:58, azurit at pobox.sk wrote:
>> Hi,
>>
>> I'm trying to resolve a few problems with indexing 'From' headers using
>> FTS/Solr. I was tcpdumping the communication between Dovecot and
>> Jetty/Solr and noticed that 'From' headers that also include the
>> sender's name are double escaped. This is what Dovecot was sending to
>> Solr:
>>
>> </field><field name="from">Name Surname
>> &amp;lt;test at example.com&amp;gt;</field></doc></add>
>>
>> As you can see, the characters < and > were escaped to &lt; and &gt;, which
>> were then escaped again to &amp;lt; and &amp;gt;. This causes problems
>> when trying to index the whole e-mail address, because Solr sees it as
>> '&lt;test at example.com&gt;'.
>>
>> I spent hours trying to figure out why I was able to search in all parts
>> of e-mail addresses, while searching for a full and exact e-mail address
>> was successful ONLY for messages that don't include the sender's name
>> in the 'From' header. Finally, after I found this bug, this fixed all
>> search problems:
>>
>> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;lt;"
>> replacement=""/>
>> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;gt;"
>> replacement=""/>
>>
>> I hope that at least this bug will be fixed.
>> Thank you.
>>
>> azur
>
> Hi!
>
> Which dovecot version was this?
>
> Aki



Sorry, I forgot to mention it: 2.2.27, Debian Jessie (backports), 64-bit.
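
For anyone who wants to see the effect in isolation, here is a minimal Python sketch of what double escaping does to a 'From' value. It is not Dovecot's code and makes no claim about where in Dovecot the second escape happens. The sample header value and the variable names are assumptions made for illustration, and the standard library's xml.sax.saxutils stands in for the XML decoding that Solr performs on the update request.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical 'From' header value, used only for illustration.
header = "Name Surname <test@example.com>"

# Escaping once is what a well-formed XML update document needs.
once = escape(header)

# Escaping the already-escaped text again is the reported problem.
twice = escape(once)

# The receiver decodes XML entities exactly once.
decoded_once = unescape(once)
decoded_twice = unescape(twice)

print(once)           # Name Surname &lt;test@example.com&gt;
print(twice)          # Name Surname &amp;lt;test@example.com&amp;gt;
print(decoded_once)   # Name Surname <test@example.com>
print(decoded_twice)  # Name Surname &lt;test@example.com&gt;
```

With a single escape the receiver recovers the original angle brackets. With a double escape it is left with the literal text &lt; and &gt; glued to the address, which would explain why an exact-address search fails only when the sender's name (and therefore the angle brackets) is present.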



