On Tue, 2013-08-06 at 14:30 +0300, Timo Sirainen wrote:
> Here's another idea:

Thank you for still looking into this.

> Try disabling replicator plugin from only one side, so there's no possibility of two dsyncs running at the same time. That should be prevented already by locking though.
I disabled the replication on node b, restarted both, and connected to node a to deliver and read mail, and had the same symptoms. Tried it with replication enabled on node b but disabled on node a, and naturally the message didn't get replicated at all, and so didn't reappear.
> The servers have different hostnames, right?
They do. There was a record that pointed to both IP addresses, but I've removed it after reading your suggestion here, and still see the symptoms. I also have a test system which has never had that A record that can show the same symptoms.
> The more I think about it, the more this makes sense. You seem to have different hostnames, but... maybe they're not from Dovecot's point of view for some reason? I added a new dovecot --hostdomain parameter to check it: http://hg.dovecot.org/dovecot-2.2/rev/5a3821097f3c

root@intmail3a:~# /mail/sbin/dovecot --hostdomain
intmail3a.internal.sanger.ac.uk
root@intmail3b:~# /mail/sbin/dovecot --hostdomain
intmail3b.internal.sanger.ac.uk
Each hostname points to 1 IP address, and the only PTR for each IP address is the hostname. No entry in /etc/hosts for either server name.
Inspired by this, I have also tried disabling IPv6 on both servers, in case the lack of IPv6 DNS entries was causing an issue, but it didn't fix it.
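For anyone wanting to repeat the DNS check described above (each hostname resolves to one address, and the PTR for that address points back to the same hostname), here is a minimal sketch in Python. It is only an illustration, not something Dovecot itself does: the function name `check_forward_reverse` and its injectable `forward`/`reverse` parameters are my own assumptions, and it only looks at IPv4 via the standard `socket` module.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname_ex,
                          reverse=socket.gethostbyaddr):
    """Return (addresses, mismatches) for a hostname.

    addresses  -- IPv4 addresses the hostname resolves to
    mismatches -- (ip, ptr_name) pairs whose PTR name differs from hostname

    The forward/reverse parameters are hypothetical hooks so the check
    can be exercised without touching real DNS.
    """
    # gethostbyname_ex returns (canonical_name, aliases, ip_addresses)
    _, _, addresses = forward(hostname)
    wanted = hostname.rstrip(".").lower()
    mismatches = []
    for ip in addresses:
        # gethostbyaddr returns (primary_name, aliases, ip_addresses)
        ptr_name, _, _ = reverse(ip)
        if ptr_name.rstrip(".").lower() != wanted:
            mismatches.append((ip, ptr_name))
    return addresses, mismatches
```

Running `check_forward_reverse("intmail3a.internal.sanger.ac.uk")` on each node (and the same for intmail3b) should give a single address and an empty mismatch list if DNS is set up as I described.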
Simon.