[Bug] FTS double escaping
azurit at pobox.sk
Sun Apr 9 20:52:10 EEST 2017
Quoting Timo Sirainen <tss at iki.fi>:
> On 6 Apr 2017, at 14.58, azurit at pobox.sk wrote:
>>
>> Hi,
>>
>> I'm trying to resolve a few problems with indexing 'From' headers
>> using FTS/Solr. I was tcpdumping the communication between Dovecot
>> and Jetty/Solr and noticed that 'From' headers which also include the
>> sender's name are double escaped. This is what Dovecot was sending
>> to Solr:
>>
>> </field><field name="from">Name Surname
>> &amp;lt;test at example.com&amp;gt;</field></doc></add>
>>
>> As you can see, the characters < and > were escaped to &lt; and &gt;,
>> which were then escaped again to &amp;lt; and &amp;gt;. This causes
>> problems when trying to index the whole e-mail address, as Solr sees
>> it as '&lt;test at example.com&gt;'.
>>
>> I spent hours trying to figure out why I was able to search in all
>> parts of e-mail addresses, while searching for the full, exact e-mail
>> address was successful ONLY for messages which don't include the
>> sender's name in the 'From' header. Finally, after I found this bug,
>> this fixed all the search problems:
>>
>> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;lt;"
>> replacement=""/>
>> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;gt;"
>> replacement=""/>
>>
>> I hope that at least this bug, which I reported, will be fixed. Thank you.
>
> The attached patch should also help.
Works fine, thank you!
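
For anyone else hitting this, the effect described in the quoted report can be reproduced with a small Python sketch. It is only an illustration, not Dovecot's or Solr's actual code: the sample header value and variable names are made up, and Python's standard xml.sax.saxutils helpers stand in for whatever escaping Dovecot does and whatever unescaping Solr's XML parser does.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical 'From' header value; the address is a placeholder.
raw_from = "Name Surname <test@example.com>"

# Escaping once is what a well-formed XML update document needs.
once = escape(raw_from)

# Escaping the already-escaped text is the behaviour seen in the tcpdump.
twice = escape(once)

# An XML parser undoes exactly one level of escaping, so this stands in
# for the field value the indexer ends up with in each case.
seen_once = unescape(once)
seen_twice = unescape(twice)

print("sent (double escaped):", twice)
print("indexed value:        ", seen_twice)
```

With single escaping the indexed value is the original header again; with double escaping the literal text &lt; and &gt; stays attached to the address, which matches why the exact-address search failed and why stripping those sequences in the Solr analyzer worked around it.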