Different handling of upper and lower case while indexing/searching with Solr
    Patrik Peng 
    patrik.peng at hostpoint.ch
       
    Wed Feb  9 11:21:19 UTC 2022
    
    
  
Hello there
We stumbled upon a user account with Solr FTS that returned no search 
results for any given search query.
Further investigation revealed a mismatch between how mails are indexed 
and how the index is queried.
The user name contains upper and lower case characters (e.g. 
Some.User at domain.net).
When new mail is indexed for this user, the user name used for Solr's 
`user` and `id` fields is transformed into lowercase, as shown in the 
Solr log:
webapp=/solr path=/update 
params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user at domain.net 
(1724281617442144256), ... (162 adds)]} 0 44298
This can be confirmed by manually querying Solr. The Solr schema in use 
performs no transformation for the affected fields.
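A manual check of this kind could look roughly like the Python sketch below. The base URL (host and port) and the helper name `build_user_query_url` are assumptions for illustration; the core name is taken from the request shown in this message, and the "at" in the archived address is assumed to stand for "@". The sketch only builds the two select URLs and does not contact Solr.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Solr core; adjust host and port to the local setup.
SOLR_BASE = "http://localhost:8983/solr/dovecot_fts_popimap"


def build_user_query_url(user: str, rows: int = 0) -> str:
    """Return a select URL that filters on the exact value of the `user` field.

    Hypothetical helper, not part of Dovecot or Solr.
    """
    params = {
        "q": "*:*",
        # Quote the value so characters such as '@' and '.' are taken literally.
        "fq": f'user:"{user}"',
        # rows=0 is enough here: only the hit count (numFound) is of interest.
        "rows": rows,
        "wt": "json",
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


# The two variants to compare by hand (e.g. with curl or a browser).
lowercase_url = build_user_query_url("some.user@domain.net")
original_url = build_user_query_url("Some.User@domain.net")
```

If the fields are untransformed, exact-match fields as described, one would expect only the lowercase variant to report a non-zero `numFound`.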
When a search request is performed via IMAP, Dovecot queries Solr with 
the original user name:
GET 
/solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User at domain.net 
HTTP/1.1"
This query (correctly) returns zero results.
To summarize, I suspect Dovecot transforms any user name to lower case 
while indexing mails, but not when querying for results.
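The suspected behaviour can be illustrated with a small stand-alone Python model. This is not Dovecot or Solr code: the set stands in for an exact-match `user` field, and the function names and the `normalize` flag are hypothetical.

```python
# Toy model of an exact-match index, standing in for an untransformed Solr field.
indexed_users = set()


def index_mail(user: str) -> None:
    """Model the indexing path, which appears to lowercase the user name."""
    indexed_users.add(user.lower())


def search(user: str, normalize: bool = False) -> bool:
    """Model the query path; `normalize` is a hypothetical switch, not a real setting."""
    key = user.lower() if normalize else user
    # Exact, case-sensitive comparison, as with a field that applies no analysis.
    return key in indexed_users


index_mail("Some.User@domain.net")

found_as_sent = search("Some.User@domain.net")
found_normalized = search("Some.User@domain.net", normalize=True)
```

In this model the lookup only succeeds when both paths apply the same case handling, which matches what the logs above suggest.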
Is this a bug, or caused by misconfiguration?
Regards
Patrik