Replication to wrong mailbox
Timo Sirainen
tss at iki.fi
Thu Nov 2 11:34:31 EET 2017
On 30 Oct 2017, at 11.05, Ralf Becker <rb at egroupware.org> wrote:
>
> It happened now twice that replication created folders and mails in the
> wrong mailbox :(
>
> Here's the architecture we use:
> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
> via a VPN connection
> - Dovecot directors in both datacenters talk to both backends with
> vhost_count of 100 vs 1 for local vs remote backend
> - backends use proxy dict via a unix domain socket and socat to talk via
> tcp to a dict on a different server (kubernetes cluster)
> - backends have a local sqlite userdb for iteration (also containing
> home directories, as just iteration is not possible)
> - serving around 7000 mailboxes in roughly 200 different domains
>
> Everything works as expected, until dict is not reachable, e.g. due to a
> server failure or a planned reboot of a node of the kubernetes cluster.
> In that situation it can happen that some requests are not answered,
> even with Kubernetes running multiple instances of the dict.
> I can only speculate what happens then: it seems the connection failure
> to the remote dict is not correctly handled and leads to a situation in
> which the last mailbox/home directory is used for the replication :(
It sounds to me like a userdb lookup changes the username during a dict failure. Although I can't really think of how that could happen. The only thing that comes to my mind is auth_cache, but in that case I'd expect the same problem to happen even when there aren't dict errors.
For testing you could see if it's reproducible with:
- pick a random username
- run doveadm user <user>
- verify that the result contains the same input user
Then do that rapidly in a loop and restart your test kubernetes once in a while.
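
The loop above could be scripted roughly like this. This is only a sketch: the helper names (run_doveadm_user, output_mentions_user, stress_lookups) are made up for illustration, and it assumes that the doveadm user output contains the looked-up username as a whitespace-separated token, which you should check against what your setup actually prints.

```python
import random
import subprocess
import time


def run_doveadm_user(user):
    """Run `doveadm user <user>` and return whatever it printed to stdout."""
    result = subprocess.run(
        ["doveadm", "user", user],
        capture_output=True,
        text=True,
        check=False,  # a failed lookup should be recorded, not raise
    )
    return result.stdout


def output_mentions_user(user, output):
    """Return True if the username appears as a token in the output.

    Assumption: the userdb result echoes the username somewhere in the
    output. Adjust this check to the fields your userdb really returns.
    """
    return user in output.split()


def stress_lookups(users, iterations, lookup=run_doveadm_user, delay=0.0):
    """Look up random users repeatedly and collect suspicious results.

    Returns a list of (requested_user, raw_output) pairs for every lookup
    whose output did not mention the requested user. An empty output (the
    lookup failed outright) is recorded too, so inspect raw_output to tell
    a plain failure apart from a result for a different user.
    """
    suspicious = []
    for _ in range(iterations):
        user = random.choice(users)
        output = lookup(user)
        if not output_mentions_user(user, output):
            suspicious.append((user, output))
        if delay:
            time.sleep(delay)
    return suspicious
```

Something like stress_lookups(usernames, 100000) could then run on a backend while the dict nodes are being restarted. The interesting entries in the returned list are the ones where raw_output is non-empty and names a different user.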