Different handling of upper and lower case while indexing/searching with Solr
Patrik Peng
patrik.peng at hostpoint.ch
Wed Feb 9 11:31:23 UTC 2022
Whoops, this time with better formatting.
On 09.02.22 12:21, Patrik Peng wrote:
>
> Hello there
>
> We stumbled upon a user account with Solr FTS that returned no
> search results for any search query.
> Further investigation revealed an issue between indexing mails and
> querying the index.
> The user name contains upper- and lower-case characters (e.g.
> Some.User at domain.net).
>
> When new mail is indexed for this user, the user name used for Solr's
> `user` and `id` fields is transformed to lowercase, as shown in the
> Solr log:
>
> webapp=/solr path=/update
> params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net
> (1724281617442144256), ... (162 adds)]} 0 44298
>
> This can be confirmed by manually querying Solr. The Solr schema in use
> performs no transformation on the affected fields.
> When a search request is performed via IMAP, Dovecot queries Solr with
> the original user name:
>
> GET
> /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net
> HTTP/1.1"
>
> This (correctly) returns zero results.
>
> To summarize, I suspect Dovecot transforms the user name to lowercase
> when indexing mail, but not when querying for results.
>
> Is this a bug, or is it caused by a misconfiguration?
>
> Regards
> Patrik
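
To make the suspected mismatch concrete, here is a minimal Python sketch of what I think is happening. It is only a model, not Dovecot or Solr code: `ExactMatchIndex` is a hypothetical stand-in for a Solr core whose `user` field is matched case-sensitively (as with our schema, which applies no transformation), and the mailbox GUID and UID are taken from the update log line above purely for illustration.

```python
# Hypothetical model of a Solr core with case-sensitive `user` and `box`
# fields. This is an illustration only, not Dovecot's or Solr's real code.
class ExactMatchIndex:
    def __init__(self):
        self._docs = []

    def add(self, user, box, uid):
        # Store the values exactly as given, with no normalization.
        self._docs.append({"user": user, "box": box, "uid": uid})

    def search(self, user, box):
        # Exact, case-sensitive comparison on both fields.
        return [d["uid"] for d in self._docs
                if d["user"] == user and d["box"] == box]


index = ExactMatchIndex()
box = "426f3b0348d03451a3fb00008ba2b673"
login = "Some.User@domain.net"

# Indexing side: the user name arrives lowercased (as in the /update log).
index.add(login.lower(), box, 8543)

# Query side: the original mixed-case user name is used (as in the GET log).
mixed_case_hits = index.search(login, box)
lowercase_hits = index.search(login.lower(), box)

print(mixed_case_hits)  # []
print(lowercase_hits)   # [8543]
```

If this model is right, the index itself is fine and only the user name passed at query time differs from the one used at index time.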