[Bug] FTS double escaping
azurit at pobox.sk
Sun Apr 9 21:06:46 EEST 2017
Quoting azurit at pobox.sk:
> Quoting Timo Sirainen <tss at iki.fi>:
>
>> On 6 Apr 2017, at 14.58, azurit at pobox.sk wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to resolve a few problems with indexing 'From' headers
>>> using FTS/Solr. I was tcpdumping the communication between Dovecot
>>> and Jetty/Solr and noticed that 'From' headers which also include
>>> the sender's name are double escaped. This is what Dovecot was
>>> sending to Solr:
>>>
>>> </field><field name="from">Name Surname
>>> &amp;lt;test at example.com&amp;gt;</field></doc></add>
>>>
>>> As you can see, the characters < and > were escaped to &lt; and &gt;,
>>> which were then escaped again to &amp;lt; and &amp;gt;. This causes
>>> problems when trying to index the whole e-mail address, as Solr sees
>>> it as '&lt;test at example.com&gt;'.
>>>
>>> I spent hours trying to figure out why I was able to search in all
>>> parts of e-mail addresses, but searching for a full and exact e-mail
>>> address was successful ONLY for messages which don't include the
>>> sender's name in the 'From' header. Finally, after I found this bug,
>>> this fixed all search problems:
>>>
>>> <filter class="solr.PatternReplaceFilterFactory"
>>> pattern="&amp;lt;" replacement=""/>
>>> <filter class="solr.PatternReplaceFilterFactory"
>>> pattern="&amp;gt;" replacement=""/>
>>>
>>> I hope that, at least, this bug, reported by me, will be fixed. Thank you.
>>
>> The attached patch should also help.
>
>
> Works fine, thank you!
Will this fix get into 2.2.29?
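
For anyone hitting the same symptom, here is a rough sketch in Python of what I believe was going on. It is not Dovecot's or Solr's code. The helper names (escape_twice, as_seen_by_solr, strip_leftover_entities) and the address test@example.com are made up for illustration, and the last helper only imitates the effect of the two PatternReplaceFilterFactory filters from my workaround.

```python
import re
from xml.sax.saxutils import escape, unescape


def escape_twice(value: str) -> str:
    """Apply XML escaping two times, as seen in the tcpdump capture."""
    return escape(escape(value))


def as_seen_by_solr(wire_value: str) -> str:
    """An XML parser undoes exactly one level of escaping."""
    return unescape(wire_value)


def strip_leftover_entities(indexed_value: str) -> str:
    """Imitate the workaround: drop the literal '&lt;' and '&gt;' leftovers."""
    return re.sub(r"&lt;|&gt;", "", indexed_value)


header = "Name Surname <test@example.com>"   # hypothetical 'From' value
wire = escape_twice(header)                  # what gets sent in the <field>
seen = as_seen_by_solr(wire)                 # what ends up being indexed
cleaned = strip_leftover_entities(seen)      # after the workaround filters

print(wire)
print(seen)
print(cleaned)
```

With a single escape() the parser would get the original header back. With two, the address is indexed with literal entity text around it, which would explain why an exact address search only matched messages whose 'From' had no name part.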