Replication to wrong mailbox

Ralf Becker rb at egroupware.org
Thu Nov 2 11:47:05 EET 2017


Hi Timo,

On 02.11.17 at 10:34, Timo Sirainen wrote:
> On 30 Oct 2017, at 11.05, Ralf Becker <rb at egroupware.org> wrote:
>> It happened now twice that replication created folders and mails in the
>> wrong mailbox :(
>>
>> Here's the architecture we use:
>> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
>> via a VPN connection
>> - Dovecot directors in both datacenters talk to both backends with
>> vhost_count of 100 vs 1 for local vs remote backend
>> - backends use proxy dict via a unix domain socket and socat to talk via
>> tcp to a dict on a different server (kubernetes cluster)
>> - backends have a local sqlite userdb for iteration (also containing
>> home directories, as just iteration is not possible)
>> - serving around 7000 mailboxes in roughly 200 different domains
>>
>> Everything works as expected, until dict is not reachable, e.g. due to a
>> server failure or a planned reboot of a node of the kubernetes cluster.
>> In that situation it can happen that some requests are not answered,
>> even with Kubernetes running multiple instances of the dict.
>> I can only speculate what happens then: it seems the connection failure
>> to the remote dict is not correctly handled and leads to a situation in
>> which the last mailbox/home directory is used for the replication :(
> It sounds to me like a userdb lookup changes the username during a dict failure. Although I can't really think of how that could happen. 

Me neither.

Users are in multiple MariaDB databases on a Galera cluster. We have no
problems or unexpected changes there.

The dict is running as multiple instances, but that does not guarantee
that no single request fails.

> The only thing that comes to my mind is auth_cache, but in that case I'd expect the same problem to happen even when there aren't dict errors.
>
> For testing you could see if it's reproducible with:
>
>  - get random username
>  - do doveadm user <user>
>  - verify that the result contains the same input user
>
> Then do that in a loop rapidly and restart your test kubernetes once in a while.


Ok, I'll give that a try. That would be a lot easier than the whole
replication setup.
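
The test loop suggested above could look roughly like the following
Python sketch. It is only an illustration under assumptions: the helper
names (run_doveadm_user, classify, stress) are made up here, the list of
usernames has to be supplied by the caller, and the check is a plain
substring match because the exact output layout of "doveadm user" is not
assumed. A lookup that fails outright during a dict outage is counted
separately from one that succeeds but returns a different user, since
only the latter would explain the wrong-mailbox replication.

```python
import random
import subprocess
from typing import Callable, List, Tuple


def run_doveadm_user(user: str) -> str:
    """Run `doveadm user <user>`; return stdout, or '' if the lookup failed."""
    proc = subprocess.run(
        ["doveadm", "user", user],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.stdout if proc.returncode == 0 else ""


def classify(user: str, output: str) -> str:
    """Classify one lookup result as 'failed', 'ok' or 'mismatch'.

    Assumption: a correct result mentions the requested username somewhere
    in its output, so a loose substring check is used.
    """
    if not output.strip():
        return "failed"
    return "ok" if user in output else "mismatch"


def stress(
    users: List[str],
    rounds: int,
    lookup: Callable[[str], str] = run_doveadm_user,
) -> List[Tuple[str, str]]:
    """Look up random users `rounds` times; return (user, output) mismatches."""
    mismatches = []
    for _ in range(rounds):
        user = random.choice(users)
        output = lookup(user)
        if classify(user, output) == "mismatch":
            mismatches.append((user, output))
    return mismatches
```

Calling stress(usernames, 10000) on a backend while restarting the test
kubernetes cluster and then printing the returned list would show every
lookup that came back for a different user. The lookup parameter exists
only so the loop can be tried with a fake function first.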

Ralf

-- 
Ralf Becker
EGroupware GmbH [www.egroupware.org]
Commercial register: HRB Kaiserslautern 3587
Managing directors: Birgit and Ralf Becker
Leibnizstr. 17, 67663 Kaiserslautern, Germany
Phone +49 631 31657-0

