Different handling of upper and lower case while indexing/searching with Solr
Hello there
We stumbled upon a user account with Solr FTS that returned no search results for any search query. Further investigation revealed a mismatch between indexing mails and querying the index. The user name contains upper and lower case characters (e.g. Some.User@domain.net).
When new mail is indexed for this user, the user name used for Solr's user and id fields is transformed into lowercase, as shown in the Solr log:
webapp=/solr path=/update params={...}{add=[8543/426f3b0348d03451a3fb00008ba2b673/some.user@domain.net (1724281617442144256), ... (162 adds)]} 0 44298
This can be confirmed by manually querying Solr. The Solr schema in use performs no transformation for the affected fields. When a search request is performed via IMAP, Dovecot queries Solr with the original user name:
GET /solr/dovecot_fts_popimap/select?wt=json&f...&fq=%2Bbox:1a30ec359dce3451b8e600008ba2b673+%2Buser:Some.User@domain.net HTTP/1.1"
Which (correctly) returns zero results.
To summarize, I suspect Dovecot transforms any user name to lowercase while indexing mails, but not when querying for results.
Is this a bug, or caused by misconfiguration?
Regards Patrik
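The mismatch described above can be reproduced outside Solr with a small model. The sketch below is only an illustration, assuming an exact-match `user` field with no case transformation, as described for the schema. The `TinyIndex` class and its methods are hypothetical and are not part of Solr or Dovecot.

```python
from collections import defaultdict


class TinyIndex:
    """Hypothetical stand-in for an index with an exact-match `user` field."""

    def __init__(self):
        self._docs_by_user = defaultdict(list)

    def add(self, user, doc_id):
        # Stored verbatim: no case folding, like the schema described above.
        self._docs_by_user[user].append(doc_id)

    def search(self, user):
        # Exact string comparison, so a case difference means no match.
        return list(self._docs_by_user.get(user, []))


index = TinyIndex()

# Indexing side: the user name arrives lowercased (as seen in the Solr log).
index.add("Some.User@domain.net".lower(), "8543")

# Query side: once with the original mixed-case name, once with lowercase.
mixed_case_hits = index.search("Some.User@domain.net")
lowercase_hits = index.search("some.user@domain.net")

print(mixed_case_hits)
print(lowercase_hits)
```

In this model only the lowercase query finds the document, which matches the zero-result behaviour seen for the mixed-case account.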
On February 9, 2022 12:31:23 PM GMT+01:00, Patrik Peng <patrik.peng@hostpoint.ch> wrote:
Woops, this time with better formatting.
Is this a bug, or caused by my configuration?
How are your users added to your auth backend? Please post your doveconf -n output
-- Christian Kivalo
On 09.02.22 17:47, Christian Kivalo wrote:
How are your users added to your auth backend?
We use a SQL DB as auth backend. Users are added by an external application. New accounts are all added as lowercase, but accounts may have been added without conversion at some point in the past. At least, the DB contains a few accounts with uppercase letters in the local part.
Please post your doveconf -n output
Here you go (I stripped a few irrelevant sections):
# 2.3.15 (0503334ab1): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.15 (e6a84e31)
# OS: FreeBSD 12.2-RELEASE-p11 amd64
# Hostname: XXX
auth_cache_negative_ttl = 5 mins
auth_cache_size = 20 M
auth_cache_ttl = 5 mins
auth_username_chars = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890.+-_@
auth_verbose = yes
auth_worker_max_count = 90
config_cache_size = 50 M
disable_plaintext_auth = no
passdb {
  args = /usr/local/etc/dovecot/sql.conf
  driver = sql
  name = sql
}
userdb {
  args = /usr/local/etc/dovecot/sql.conf
  driver = sql
  name = sql
}
plugin {
  fts_autoindex = no
  fts_autoindex_exclude = \Junk
  fts_autoindex_exclude2 = \spam
  fts_autoindex_exclude3 = INBOX.spam
  fts_enforced = no
  fts_index_timeout = 120s
  fts_solr = url=https://XXX soft_commit=no batch_size=1000
  mail_log_events = copy save delete undelete expunge mailbox_create mailbox_delete mailbox_rename
  mail_log_fields = uid box msgid from size flags
  quota = maildir:User quota
  quota_grace = 10%%
  quota_rule = *:storage=1G
  quota_warning = storage=95%% quota-warning 95 %u
  quota_warning2 = storage=80%% quota-warning 80 %u
  sieve = /var/empty/sieve.current
  sieve_before = /usr/local/etc/dovecot/sieve.before/
  sieve_dir = /var/empty/sieve
  sieve_global_dir = /usr/local/etc/dovecot/sieve.global/
  sieve_global_extensions = +editheader
}
service auth-worker {
  process_limit = 150
  user = dovenull
}
service auth {
  client_limit = 65000
}
ssl_cert = </etc/ssl/certs/xxx
ssl_cipher_list = ECDHE+AESGCM:DHE+AESGCM:ECDHE+AES256:DHE+AES256:ECDHE+AES:DHE+AES:!LOW:!MEDIUM:!aNULL:!eNULL:!3DES:!DES:!DSS:!EXP:!MD5:!PSK:!RC4:!SRP
ssl_client_ca_file = /etc/ssl/certs/xxx
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
ssl_min_protocol = TLSv1
ssl_prefer_server_ciphers = yes
verbose_proctitle = yes
protocol imap {
  imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
  mail_max_userip_connections = 45
  mail_plugins = mail_log notify quota fts fts_solr imap_quota
}
protocol pop3 {
  mail_max_userip_connections = 30
  pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
  pop3_uidl_format = UID%u-%v
}
protocol lmtp {
  mail_plugins = mail_log notify quota fts fts_solr sieve
}
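To gauge how many accounts carry uppercase letters, the user table can be checked for names that differ from their lowercase form. The following is only a sketch using an in-memory SQLite database. The `users` table, the `username` column, and the sample rows are assumptions for illustration, since the real schema behind `sql.conf` is not shown in this thread.

```python
import sqlite3

# Hypothetical table and sample data; the real auth schema is not known here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [
        ("Some.User@domain.net",),
        ("other.user@domain.net",),
        ("Third.User@domain.net",),
    ],
)


def find_mixed_case_users(connection):
    """Return user names that are not already all lowercase."""
    rows = connection.execute(
        "SELECT username FROM users "
        "WHERE username <> lower(username) "
        "ORDER BY username"
    ).fetchall()
    return [row[0] for row in rows]


affected = find_mixed_case_users(conn)
print(affected)
```

SQLite compares text case-sensitively by default, which is what makes the `<>` check work here. On a database with a case-insensitive collation, the comparison would need a case-sensitive variant.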
On 10/02/2022 11:36 Patrik Peng <patrik.peng@hostpoint.ch> wrote:
On 09.02.22 17:47, Christian Kivalo wrote:
How are your users added to your auth backend?
We use a SQL DB as auth backend. Users are added by an external application. New accounts are all added as lowercase, but it could be possible that there was a time in the past where accounts were added without conversion. At least the DB contains a few accounts with uppercase letters in the localpart.
Please post your doveconf -n output
Here you go (I stripped a few irrelevant sections):
Probably the easiest fix is to convert the users in the database to all lowercase, as you are likely returning the user attribute in your SQL queries.
Aki
On 10.02.22 10:43, Aki Tuomi wrote:
Probably easiest fix is to fix the users in database to all lowercase, as you are likely returning user attribute in your SQL queries.
We thought about this as well, but there are 500+ affected accounts and they are used by our customers, which would mean each of them would have to reconfigure all their clients.
On 10/02/2022 11:58 Patrik Peng <patrik.peng@hostpoint.ch> wrote:
On 10.02.22 10:43, Aki Tuomi wrote:
Probably easiest fix is to fix the users in database to all lowercase, as you are likely returning user attribute in your SQL queries.
We thought about this as well, but there are 500+ affected accounts and they are used by our customers, which would mean each of them would have to reconfigure all their clients.
You can configure dovecot with
auth_username_format=%Lu
which downcases the username provided by the customer, as well.
Aki
On 10.02.22 11:25, Aki Tuomi wrote:
You can configure dovecot with
auth_username_format=%Lu
which downcases the username provided by the customer, as well.
According to the docs [1], '%Lu' is already the default, and this value is not changed in our config. I guess 'auth_username_format=%Lu' is not applied when a user performs a search via FTS.
On 10/02/2022 13:46 Patrik Peng <patrik.peng@hostpoint.ch> wrote:
On 10.02.22 11:25, Aki Tuomi wrote:
You can configure dovecot with
auth_username_format=%Lu
which downcases the username provided by the customer, as well.
According to the docs [1], '%Lu' is already the default, and this value is not changed in our config. I guess 'auth_username_format=%Lu' is not applied when a user performs a search via FTS.
It is applied, but your SQL database overrides it by having non-lowercased usernames.
Aki
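Aki's explanation can be pictured with a simplified model. The sketch below is not Dovecot code. It assumes, as suggested in the thread, that the login name is lowercased first and that a `user` value returned by the userdb lookup then replaces it. The function names and the dictionary-based userdb are hypothetical.

```python
def apply_username_format(login_name):
    # Models the effect of auth_username_format=%Lu: lowercase the login name.
    return login_name.lower()


def resolve_username(login_name, userdb):
    """Return the user name the session ends up with in this simplified model."""
    normalized = apply_username_format(login_name)
    # Case-insensitive lookup, standing in for an SQL query that matches
    # regardless of case (an assumption about the database).
    for stored_name, fields in userdb.items():
        if stored_name.lower() == normalized:
            # A returned `user` field overrides the normalized name.
            return fields.get("user", normalized)
    return None


# Hypothetical userdb contents: mixed case as stored today, and after a fix.
userdb_mixed = {"Some.User@domain.net": {"user": "Some.User@domain.net"}}
userdb_fixed = {"some.user@domain.net": {"user": "some.user@domain.net"}}

overridden = resolve_username("Some.User@domain.net", userdb_mixed)
consistent = resolve_username("Some.User@domain.net", userdb_fixed)

print(overridden)
print(consistent)
```

In this model the mixed-case database row wins over the lowercased login name, while a lowercase row keeps the name consistent regardless of how the customer types it.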