Replication to wrong mailbox

Aki Tuomi aki.tuomi at dovecot.fi
Thu Nov 2 10:57:16 EET 2017


Can you somehow reproduce this issue with auth_debug=yes and
mail_debug=yes and provide those logs?

Aki


On 02.11.2017 10:55, Ralf Becker wrote:
> Does no one have any idea?
>
> Replication into wrong mailboxes caused by an unavailable proxy dict
> backend is a serious privacy and/or security problem!
>
> Ralf
>
> On 30.10.17 at 10:05, Ralf Becker wrote:
>> It has now happened twice that replication created folders and mails in
>> the wrong mailbox :(
>>
>> Here's the architecture we use:
>> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
>> via a VPN connection
>> - Dovecot directors in both datacenters talk to both backends with a
>> vhost_count of 100 vs 1 for the local vs the remote backend
>> - backends use proxy dict via a unix domain socket and socat to talk via
>> tcp to a dict on a different server (kubernetes cluster)
>> - backends have a local SQLite userdb for iteration (also containing
>> home directories, as an iteration-only userdb is not possible)
>> - serving around 7000 mailboxes in roughly 200 different domains
>>
>> Everything works as expected until the dict is unreachable, e.g. due to a
>> server failure or a planned reboot of a node of the Kubernetes cluster.
>> In that situation it can happen that some requests are not answered,
>> even with Kubernetes running multiple instances of the dict.
>> I can only speculate about what happens then: it seems the connection
>> failure to the remote dict is not handled correctly and leads to a
>> situation in which the last mailbox/home directory is used for the
>> replication :(
>>
>> When it happened the first time, we attributed it to the fact that the
>> SQLite database at that time contained no home directory information,
>> which we fixed afterwards. That first incident (a server failure) lasted a
>> couple of minutes and led to many mailboxes containing mostly folders, but
>> also some newly arrived mails, belonging to other mailboxes/users. We could
>> only resolve that situation by rolling back to a ZFS snapshot taken before
>> the downtime.
>>
>> The second time was last Friday night during a (much shorter) reboot of
>> a Kubernetes node, and it led to only a single mailbox containing folders
>> and mails of other mailboxes. That was verified by looking at timestamps
>> of directories below $home/mdbox/mailboxes and files in $home/mdbox/storage.
>> I cannot tell whether adding the home directory to the SQLite database or
>> the shorter duration of the failure limited the wrong replication to a
>> single mailbox.
>>
>> Can someone with more knowledge of the Dovecot code please check/verify
>> how replication deals with failures in the proxy dict? I'm of course happy
>> to provide more information about our configuration if needed.
>>
>> Here is an excerpt of our configuration (full doveconf -n is attached):
>>
>> passdb {
>>   args = /etc/dovecot/dovecot-dict-master-auth.conf
>>   driver = dict
>>   master = yes
>> }
>> passdb {
>>   args = /etc/dovecot/dovecot-dict-auth.conf
>>   driver = dict
>> }
>> userdb {
>>   driver = prefetch
>> }
>> userdb {
>>   args = /etc/dovecot/dovecot-dict-auth.conf
>>   driver = dict
>> }
>> userdb {
>>   args = /etc/dovecot/dovecot-sql.conf
>>   driver = sql
>> }
>>
>> dovecot-dict-auth.conf:
>> uri = proxy:/var/run/dovecot_auth_proxy/socket:backend
>> password_key = passdb/%u/%w
>> user_key = userdb/%u
>> iterate_disable = yes
>>
>> dovecot-dict-master-auth.conf:
>> uri = proxy:/var/run/dovecot_auth_proxy/socket:backend
>> password_key = master/%{login_user}/%u/%w
>> iterate_disable = yes
>>
>> dovecot-sql.conf:
>> driver = sqlite
>> connect = /etc/dovecot/users.sqlite
>> user_query = SELECT home, NULL AS uid, NULL AS gid FROM users WHERE userid = '%n' AND domain = '%d'
>> iterate_query = SELECT userid AS username, domain FROM users



