Dovecot Replication - Architecture Endianness?
Hi all,
An interesting use case has come up which - to cut a long story short - may be best solved by replicating a small number of mailboxes to a third remote server.
I've had replication running between my main dovecot machine and another remote VM for some time now, and it's been working well (so I'm not new to replication and I have a known-good working config), but I now need to add a third server to the mix for a select number of mailboxes. Both of those machines are Gentoo x86_64, running the latest 2.2.16 -hg build.
I have attempted this so far by rsync'ing the initial Maildirs and then, once the bulk of the data has been transferred, relying on dovecot's replication to keep things in sync. In theory this should mean that the subsequent updates in both directions are incremental, and the bulk of the data gets moved by rsync while the device is here on my desk.
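For concreteness, the initial seeding was roughly along these lines (the path and hostname here are placeholders rather than my exact command):

# one-off bulk copy of a user's Maildir over ssh while the Pi is on the local network;
# dovecot replication is then expected to handle only incremental changes afterwards
rsync -a /var/vmail/user1/Maildir/ root@pi.local:/var/vmail/user1/Maildir/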
I've attempted to do this using a Raspberry Pi as the remote device, but when I set it up the dovecot replication process seems to start the replication over from scratch even after the rsync is done. I know this is happening because the disk utilisation on the Pi skyrockets once the replication starts, and I end up with thousands of double-ups of all the mails ... which defeats the entire point of the exercise.
If I do an identical configuration but on a third Gentoo x86_64 VM locally it all works as expected. No double ups of mails and the "catchup" between the two devices is practically instant. Same filesystem even. The only difference appears to be the system architecture.
So my main question is this. Is there a known architecture/endianness limitation on replication? I guess cross-arch replication is not something many people try, but is it supposed to work anyway?
Has anyone else got replication working across different architectures?
Also is there a way to restrict replication users aside from a crude hack around system first and last UIDs?
Thanks, Reuben
On 4/05/2015 11:06 PM, Teemu Huovila wrote:
Ok. That explains why the rsync won't work. But if I kick off a dovecot-to-dovecot replication (without doing the rsync first), will this work any better once the system catches up? This assumes (possibly incorrectly - please correct me if I am wrong) that the index files themselves aren't dsync'd byte-for-byte, but rather that the metadata/content from them is sent and the indexes are then written to disk by the remote dovecot in the right format for that machine's architecture. Because if that's the case then I can probably make this work - just taking a hit on the initial sync, which could take longer.
Even if this doesn't end up working I figure I'll get to learn a little more about the indexes themselves in the process.
Thanks for any advice, Reuben
On 04 May 2015, at 17:11, Reuben Farrelly <reuben-dovecot@reub.net> wrote:
dsyncing between servers (or in general using dsync-server) transfers all data using a portable protocol. So dsync source and destination can then have different endianess and it doesn't matter.
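(As a concrete illustration of the same mechanism, a one-off manual sync would look something like the following - the user, hostname and port are placeholders, and it assumes the remote doveadm service is listening there with a matching doveadm_password:)

# sync a single user's mail with the remote dsync endpoint over the portable protocol
doveadm sync -u user1 tcp:replica.example.net:4813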
On 7/05/2015 7:47 AM, Timo Sirainen wrote:
I've tested this out today and can confirm it all works well - starting from nothing and doing the entire sync using dovecot. The takeaway is that for cross-arch replication, an initial rsync is -not- the right thing to do.
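For anyone setting this up from scratch, the wiring on my side is essentially the stock setup from the replication wiki - roughly along these lines (the hostname, port, password and vmail user below are placeholders):

# enable the notify and replication plugins for all mail protocols
mail_plugins = $mail_plugins notify replication

# default replication target (can also be set per-user from userdb)
plugin {
  mail_replica = tcp:replica.example.net:4813
}

# doveadm/dsync endpoint the other side connects to
service doveadm {
  inet_listener {
    port = 4813
  }
}
doveadm_password = secret

# notifies the replicator about mailbox changes
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}

service replicator {
  process_min_avail = 1
}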
Thanks!
Reuben
On 06 May 2015, at 13:52, Reuben Farrelly <reuben-dovecot@reub.net> wrote:
You can create a new userdb passwd-file that adds extra fields. So something like:
userdb {
  driver = passwd
  result_success = continue-ok
}

userdb {
  driver = passwd-file
  args = /etc/dovecot/passwd.extra
  skip = notfound
}
Could this be done via a per-user LDA setting or sieve?
Replication would happen also with IMAP access.
On 7/05/2015 7:49 AM, Timo Sirainen wrote:
This doesn't seem to work for me, and my config has exactly that in it. My passwd.extra file has just one line, for the one account I am testing with at the moment:
user1:::::::userdb_mail_replica=tcps:lightning.reub.net:4813,userdb_mail_replica=tcp:pi.x.y:4814
This breaks access for other system users such as my own account which do not have entries:
May 7 21:19:06 tornado.reub.net dovecot: imap-login: Internal login failure (pid=22573 id=1) (internal failure, 1 successful auths): user=<reuben>, auth-method=PLAIN, remote=2001:44b8:31d4:1311::50, local=2001:44b8:31d4:1310::20, TLS
which then soon starts spitting this out tens of times per second in the mail log:
May 7 21:19:32 tornado.reub.net dovecot: auth-worker(23738): Error: Auth worker sees different passdbs/userdbs than auth server. Maybe config just changed and this goes away automatically?
This is with -hg latest as of now.
This system uses PAM for local users. Do I need to list all of the system users, including those who do not need any extra settings, in the passwd.extra file too?
Is my syntax above for two mail_replica servers correct?
Thanks, Reuben
On 8/05/2015 6:10 PM, Teemu Huovila wrote:
With -hg as of now it's still not any better:
tornado log # dovecot --version
2.2.16 (f2a8e1793718+)
tornado log #
===================
# System users (NSS, /etc/passwd, or similar). In many systems nowadays
# this uses Name Service Switch, which is configured in /etc/nsswitch.conf.
userdb {
  # <doc/wiki/AuthDatabase.Passwd.txt>
  driver = passwd
  # [blocking=no]
  #args =

  # Override fields from passwd
  #override_fields = home=/home/virtual/%u
  result_success = continue-ok
}

# Add some extra fields such as replication..
userdb {
  driver = passwd-file
  args = /etc/dovecot/passwd.extra
  skip = notfound
}
==============
May 8 22:59:11 tornado.reub.net dovecot: imap: Error: Authenticated user not found from userdb, auth lookup id=586547201 (client-pid=29035 client-id=1)
May 8 22:59:11 tornado.reub.net dovecot: imap-login: Internal login failure (pid=29035 id=1) (internal failure, 1 successful auths): user=<reuben>, auth-method=PLAIN, remote=2001:44b8:31d4:1311::50, local=2001:44b8:31d4:1310::20, TLS
It also logs an awful lot of those lines in short succession, at least 15 per second...
Reuben
On 8/05/2015 11:04 PM, Reuben Farrelly wrote:
Following on from this, I've managed to get it to work - but there is one outstanding problem which I suspect may be a bug. I'm running the -hg build as of today.
In case anyone else tries this: I had to separate each userdb_mail_replica entry with a space rather than a comma. This is, however, documented in the wiki.
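So the entry from my earlier mail becomes something like this (reconstructed from the description above; I haven't double-checked the exact quoting rules for passwd-file extra fields):

user1:::::::userdb_mail_replica=tcps:lightning.reub.net:4813 userdb_mail_replica=tcp:pi.x.y:4814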
The outstanding issue is that even though I've had 'skip = notfound' in the second userdb as above, if I don't add all of the users to that file (even with no extra variables set) those users who are not added cannot log in. They fail with the error above about an 'internal failure'.
It seems that the second userdb is not actually being skipped at all if the user is not listed in it... Timo?
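In the meantime the workaround is to list every user in passwd.extra, even those with no extra fields at all, along these lines (usernames are just examples):

reuben:::::::
user1:::::::userdb_mail_replica=tcps:lightning.reub.net:4813 userdb_mail_replica=tcp:pi.x.y:4814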
Thanks, Reuben