[Bug] FTS double escaping
azurit at pobox.sk
Thu Apr 6 14:58:12 EEST 2017
Hi,
I'm trying to resolve a few problems with indexing 'From' headers using
FTS/Solr. I ran tcpdump on the communication between Dovecot and
Jetty/Solr and noticed that 'From' headers which also include the
sender's name are double escaped. This is what Dovecot was sending to
Solr:
</field><field name="from">Name Surname
&amp;lt;test at example.com&amp;gt;</field></doc></add>
As you can see, the characters < and > were escaped to &lt; and &gt;, which
were then escaped again to &amp;lt; and &amp;gt;. This causes problems
when indexing the whole e-mail address, because Solr sees it as
'&lt;test at example.com&gt;'.
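
To illustrate the effect, here is a minimal Python sketch of double escaping. It is not Dovecot's code; it uses the standard xml.sax.saxutils helpers to stand in for whatever escaping Dovecot performs, and the header value is a made-up example (with a real @ instead of the archive's " at ").

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical 'From' header value, used only for illustration.
header = "Name Surname <test@example.com>"

# First pass: the correct amount of escaping for an XML payload.
once = escape(header)

# Second pass: the '&' produced by the first pass is escaped again.
twice = escape(once)

# An XML parser on the receiving side undoes exactly one level,
# so the indexed text still contains literal entities.
seen_by_indexer = unescape(twice)

print(once)             # Name Surname &lt;test@example.com&gt;
print(twice)            # Name Surname &amp;lt;test@example.com&amp;gt;
print(seen_by_indexer)  # Name Surname &lt;test@example.com&gt;
```

With a single escape the receiver would get the original '<' and '>' back; with two, the address token is wrapped in literal "&lt;" and "&gt;" text.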
I spent hours trying to figure out why I was able to search on any part
of an e-mail address, while searching for the full, exact e-mail address
succeeded ONLY for messages that don't include the sender's name
in the 'From' header. Finally, after I found this bug, the following
workaround fixed all the search problems:
<filter class="solr.PatternReplaceFilterFactory" pattern="&amp;lt;"
replacement=""/>
<filter class="solr.PatternReplaceFilterFactory" pattern="&amp;gt;"
replacement=""/>
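
As a rough sketch of what these two filters do to a token, the Python below removes the literal "&lt;" and "&gt;" text. The function name is hypothetical, and it assumes the token reaches the filters as '&lt;test@example.com&gt;'; how Solr actually tokenizes the field depends on the rest of the analyzer chain.

```python
import re

def strip_escaped_brackets(token: str) -> str:
    """Remove every literal '&lt;' and '&gt;' from a token (illustrative helper)."""
    token = re.sub(r"&lt;", "", token)
    token = re.sub(r"&gt;", "", token)
    return token

# Assumed token as seen after the double-escaped payload is parsed once.
cleaned = strip_escaped_brackets("&lt;test@example.com&gt;")
print(cleaned)  # test@example.com
```

This only hides the symptom on the Solr side; the payload is still escaped twice.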
I hope that at least this bug, which I'm reporting here, will be fixed. Thank you.
azur