[Bug] FTS double escaping

Timo Sirainen tss at iki.fi
Sun Apr 9 16:05:41 EEST 2017


On 6 Apr 2017, at 14.58, azurit at pobox.sk wrote:
> 
> Hi,
> 
> I'm trying to resolve a few problems with indexing 'From' headers using FTS/Solr. I was tcpdumping the communication between Dovecot and Jetty/Solr and noticed that 'From' headers which also include the sender's name are double escaped. This is what Dovecot was sending to Solr:
> 
> </field><field name="from">Name Surname &amp;lt;test at example.com&amp;gt;</field></doc></add>
> 
> As you can see, the characters < and > were escaped to &lt; and &gt;, which were then escaped again to &amp;lt; and &amp;gt;. This causes problems when trying to index the whole e-mail address, because Solr sees it as '&lt;test at example.com&gt;'.
> 
> I spent hours trying to figure out why I was able to search in all parts of e-mail addresses, while searching for the full, exact e-mail address was successful ONLY for messages which don't include the sender's name in the 'From' header. Finally, after I found this bug, the following fixed all search problems:
> 
> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;lt;" replacement=""/>
> <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;gt;" replacement=""/>
> 
> I hope that, at least, this bug, reported by me, will be fixed. Thank you.

The attached patch should also help.
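
To make the reported symptom concrete, here is a minimal Python sketch of how escaping a value twice changes what the receiving XML parser ends up with. It is not the attached patch and not Dovecot's code; the helper names (build_field, seen_by_solr) and the sample address are made up for illustration only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_field(name, value, times=1):
    """Escape `value` `times` times and wrap it in a <field> element.

    Hypothetical helper: times=2 reproduces the double escaping
    described in the report.
    """
    for _ in range(times):
        value = escape(value)
    return '<field name="%s">%s</field>' % (name, value)


def seen_by_solr(xml_fragment):
    """Return the text an XML parser extracts, i.e. what gets indexed."""
    return ET.fromstring(xml_fragment).text


# Illustrative header value, not taken from a real message.
header = "Name Surname <test@example.com>"

once = build_field("from", header, times=1)
twice = build_field("from", header, times=2)

print(once)
print(seen_by_solr(once))
print(twice)
print(seen_by_solr(twice))
```

With a single escape the parser recovers the original header. With two escapes it recovers the text with literal &lt; and &gt; around the address, which would explain why an exact-address search only matched messages without a sender name.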

-------------- next part --------------
A non-text attachment was scrubbed...
Name: solr.diff
Type: application/octet-stream
Size: 843 bytes
Desc: not available
URL: <http://dovecot.org/pipermail/dovecot/attachments/20170409/8aa32e7d/attachment.obj>

