Master-Master replication question
Dear list,
I have set up master-master replication. My primary MXes deliver mail in a DNS load-balanced way, so DNS does a kind of round-robin and mail is sent to both master servers.
I found out that on one of the two machines the email synchronisation is heavily delayed. Let's assume server A receives a mail from the MX; it synchronises almost instantly with the other server.
Whenever server B receives the email, it can take up to several hours to synchronise it; it seems the new mail is not detected before then.
It is also interesting to see that the mailboxes on server A (where users log in to retrieve their email via webmail/clients) are significantly smaller than the mailboxes on server B. When investigating, it seems that "older" mailboxes (or rather storage, since we use mdbox) are still there on server B, although they had already been removed on server A.
My personal mailbox was 170MB on server A, while it was still 2.5GB on server B (which was around its size before cleaning up the mailboxes).
I enabled debugging on the servers, and I see "Replication requests" rather quickly on server A, but when an email arrives on server B, I do not see the request at all.
My servers are both running the same version and the same configuration (managed with puppet), both on ZFS and FreeBSD. Server B is under more memory pressure because of some bhyve VMs, while server A does not run any VM.
Does someone have any pointers on where to look?
Thanks in advance ;-) Remko
Included below are the configurations from server A and B:
Server A:
# 2.2.25 (7be1766): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.4.14 (099a97c)
# OS: FreeBSD 10.3-RELEASE-p2 amd64
auth_mechanisms = plain login
disable_plaintext_auth = no
doveadm_password = # hidden, use -P to show it
haproxy_trusted_networks = YYYY
lda_mailbox_autocreate = yes
lda_mailbox_autosubscribe = yes
lmtp_save_to_detail_mailbox = yes
mail_debug = yes
mail_fsync = always
mail_location = mdbox:~/mdbox
mail_plugins = " quota notify replication"
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
namespace {
inbox = yes
location =
mailbox Drafts {
auto = subscribe
special_use = \Drafts
}
mailbox Junk {
special_use = \Junk
}
mailbox Sent {
auto = subscribe
special_use = \Sent
}
mailbox "Sent Messages" {
special_use = \Sent
}
mailbox Spam {
auto = subscribe
special_use = \Junk
}
mailbox Trash {
auto = subscribe
special_use = \Trash
}
prefix =
separator = .
}
passdb {
driver = pam
}
plugin {
antispam_backend = mailtrain
antispam_mail_notspam = --ham
antispam_mail_sendmail = /usr/local/bin/sa-learn.sh
antispam_mail_spam = --spam
antispam_spam_pattern_ignorecase = spam;junk
antispam_trash_pattern_ignorecase = trash;deleted items;deleted messages
antispam_verbose_debug = 1
mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
mail_log_fields = uid box msgid size
mail_replica = tcps:the other server:12346
sieve = ~/.dovecot.sieve
sieve_dir = ~/sieve
sieve_global_dir = /usr/local/etc/dovecot/sieve/global/
sieve_global_path = /usr/local/etc/dovecot/sieve/default.sieve
}
postmaster_address = postmaster@xxx
protocols = imap pop3 lmtp sieve
replication_dsync_parameters = -d -N -l 60 -U
replication_max_conns = 100
service aggregator {
fifo_listener replication-notify-fifo {
mode = 0666
}
unix_listener replication-notify {
mode = 0666
}
}
service auth {
unix_listener /var/spool/postfix/private/auth {
mode = 0666
}
}
service doveadm {
inet_listener {
port = 12346
ssl = yes
}
}
service imap-login {
inet_listener imap_haproxy {
haproxy = yes
port = 10143
}
inet_listener imaps_haproxy {
haproxy = yes
port = 10144
ssl = yes
}
service_count = 1
}
service imap {
process_limit = 1024
}
service lmtp {
unix_listener /var/spool/postfix/private/dovecot-lmtp {
group = postfix
mode = 0600
user = postfix
}
}
service pop3 {
process_limit = 1024
}
service replicator {
process_min_avail = 1
unix_listener replicator-doveadm {
mode = 0666
}
}
ssl_ca =
Server B:
# 2.2.25 (7be1766): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.4.14 (099a97c)
# OS: FreeBSD 10.3-RELEASE amd64
auth_mechanisms = plain login
disable_plaintext_auth = no
doveadm_password = # hidden, use -P to show it
haproxy_trusted_networks = YYYY
lda_mailbox_autocreate = yes
lda_mailbox_autosubscribe = yes
lmtp_save_to_detail_mailbox = yes
mail_debug = yes
mail_fsync = always
mail_location = mdbox:~/mdbox
mail_plugins = " quota notify replication"
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
namespace {
inbox = yes
location =
mailbox Drafts {
auto = subscribe
special_use = \Drafts
}
mailbox Junk {
special_use = \Junk
}
mailbox Sent {
auto = subscribe
special_use = \Sent
}
mailbox "Sent Messages" {
special_use = \Sent
}
mailbox Spam {
auto = subscribe
special_use = \Junk
}
mailbox Trash {
auto = subscribe
special_use = \Trash
}
prefix =
separator = .
}
passdb {
driver = pam
}
plugin {
antispam_backend = mailtrain
antispam_mail_notspam = --ham
antispam_mail_sendmail = /usr/local/bin/sa-learn.sh
antispam_mail_spam = --spam
antispam_spam_pattern_ignorecase = spam;junk
antispam_trash_pattern_ignorecase = trash;deleted items;deleted messages
antispam_verbose_debug = 1
mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
mail_log_fields = uid box msgid size
mail_replica = tcps:the other host:12346
sieve = ~/.dovecot.sieve
sieve_dir = ~/sieve
sieve_global_dir = /usr/local/etc/dovecot/sieve/global/
sieve_global_path = /usr/local/etc/dovecot/sieve/default.sieve
}
postmaster_address = postmaster@xxx
protocols = imap pop3 lmtp sieve
replication_dsync_parameters = -d -N -l 60 -U
replication_max_conns = 100
service aggregator {
fifo_listener replication-notify-fifo {
mode = 0666
}
unix_listener replication-notify {
mode = 0666
}
}
service auth {
unix_listener /var/spool/postfix/private/auth {
mode = 0666
}
}
service doveadm {
inet_listener {
port = 12346
ssl = yes
}
}
service imap-login {
inet_listener imap_haproxy {
haproxy = yes
port = 10143
}
inet_listener imaps_haproxy {
haproxy = yes
port = 10144
ssl = yes
}
service_count = 1
}
service imap {
process_limit = 1024
}
service lmtp {
unix_listener /var/spool/postfix/private/dovecot-lmtp {
group = postfix
mode = 0600
user = postfix
}
}
service pop3 {
process_limit = 1024
}
service replicator {
process_min_avail = 1
unix_listener replicator-doveadm {
mode = 0666
}
}
ssl_ca =
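Since both dumps are long, a mechanical comparison is a quick way to confirm that the two configurations really match. The sketch below is an illustration only and was not posted in the thread: it assumes the `doveconf -n` output of each server has been saved as text, and the function name `config_differences` and the shortened sample dumps are made up for the example. It relies only on Python's standard `difflib`.

```python
import difflib

# Shortened, hypothetical excerpts of the two dumps shown above.
SERVER_A = """\
# OS: FreeBSD 10.3-RELEASE-p2 amd64
mail_plugins = " quota notify replication"
mail_replica = tcps:the other server:12346
"""
SERVER_B = """\
# OS: FreeBSD 10.3-RELEASE amd64
mail_plugins = " quota notify replication"
mail_replica = tcps:the other host:12346
"""


def config_differences(conf_a, conf_b):
    """Return the lines that differ between two doveconf -n dumps."""
    lines_a = [line.strip() for line in conf_a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in conf_b.splitlines() if line.strip()]
    # n=0 drops context lines, so only changed lines and headers remain.
    diff = list(difflib.unified_diff(lines_a, lines_b,
                                     "server-a", "server-b",
                                     lineterm="", n=0))
    # Skip the two file headers and the @@ hunk markers.
    return [d for d in diff[2:] if not d.startswith("@@")]


for line in config_differences(SERVER_A, SERVER_B):
    print(line)
```

Read this way, the two dumps above appear to differ only in the OS patch level in the header comment (10.3-RELEASE-p2 versus 10.3-RELEASE) and in the redacted mail_replica target, which matches the statement that both hosts get the same configuration.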
You are not alone!
On Wednesday, July 06, 2016 01:15:34 PM Remko Lodder wrote:
Dear list,
I have setup a master-master replication setup. My primairy MX's send email over on a DNS loadbalanced way, so DNS is doing some kind of round-robin way of sending mail to both master servers.
I found out, that on one of the two machines, the email synchronisation is heavily delayed. Lets assume server A receives a mail from the MX; it synchronises almost instantly with the other server.
Whenever server B receives the email, it could take up to several hours to synchronise the email, it seems that it is not detected prior.
I have been dealing with this for months. http://www.dovecot.org/list/dovecot/2016-March/103680.html
As a band-aid I use this crontab entry on the 2nd mail server.
*/15 * * * * root /usr/bin/doveadm sync -u "*" remote:mail1
However, in doing this, and at other times during the sync, something happens and kmail pulls in the email twice and puts one copy in an odd state, grayed out in the GUI. I have to go to the directory and delete it. Once read, it has a T flag, which other emails do not have. That becomes more common when I use the above, but otherwise it happens only on occasion.
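One thing that can be ruled out cheaply with a cron-driven sync is two runs overlapping when a sync takes longer than the 15-minute interval. Whether overlap has anything to do with the duplicates is not established anywhere in this thread; the wrapper below is only a sketch of the idea. The doveadm command line is the one from the crontab entry, while the lock path and the function name `run_exclusive` are hypothetical.

```python
import fcntl
import subprocess
import sys

# Command taken from the crontab entry above.
SYNC_CMD = ["/usr/bin/doveadm", "sync", "-u", "*", "remote:mail1"]
# Hypothetical lock file location; pick any path root can write to.
LOCK_PATH = "/tmp/doveadm-sync.lock"


def run_exclusive(cmd, lock_path=LOCK_PATH):
    """Run cmd unless another run holds the lock.

    Returns the command's exit code, or None if the run was skipped
    because a previous run is still in progress.
    """
    with open(lock_path, "w") as lock:
        try:
            # Non-blocking exclusive lock; released when the file is closed.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        return subprocess.run(cmd).returncode


# From cron one would call, for example:
#     code = run_exclusive(SYNC_CMD)
#     sys.exit(0 if code is None else code)
```

A skipped run returns None rather than an error, so cron does not send mail just because the previous sync was still busy.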
It is also interesting to see, that the mailboxes on server A (Where users login to retrieve their email via webmail/clients) are significantly smaller then the mailboxes on server B. When investigating, it seems that "older" mailboxes (or storage rather since we use mdbox) are still there on server B, which already had been removed on server A.
I experience every bit of what you are describing. It also seems to be affected when email arrives on one server but users are checking/pulling email from the other. They never see the mail that is on the other server, and can have emails arrive, be deleted, etc.
My personal mailbox was 170MB on server A, while it was still 2.5GB on server B. (which was around that size before cleaning up the mailsboxes).
I enabled debugging on the servers, and I see rather quick : "Replication requests" on server A, but when getting an email on server B, I do not see the request at all.
My servers are both running the same version, same configuration (utilizing puppet), both running on ZFS and FreeBSD. Where server B is more loaded in it's memory because of some bhyve VM's and the server A does not run any VM.
Does someone have any pointers on where to look?
I have been hoping it's some issue that gets fixed in a new release. It seems there might have been some regressions there, as at times it seemed to have gotten better and at other times worse.
I think it has something to do with full vs fast/quick syncing. I think the above command forces a full sync, while most of the time it's doing a fast sync. There are not many settings to play with or adjust, so it seems to be something that has to be addressed in the code itself unless some new settings are introduced.
-- William L. Thomson Jr. Obsidian-Studios, Inc. http://www.obsidian-studios.com
Quoting "William L. Thomson Jr." wlt-ml@o-sinc.com:
You are not alone!
On Wednesday, July 06, 2016 01:15:34 PM Remko Lodder wrote:
Dear list,
I have setup a master-master replication setup. My primairy MX's send email over on a DNS loadbalanced way, so DNS is doing some kind of round-robin way of sending mail to both master servers.
I found out, that on one of the two machines, the email synchronisation is heavily delayed. Lets assume server A receives a mail from the MX; it synchronises almost instantly with the other server.
Whenever server B receives the email, it could take up to several hours to synchronise the email, it seems that it is not detected prior.
I have been dealing with this for months. http://www.dovecot.org/list/dovecot/2016-March/103680.html
For a band aid I use this crontab entry. On the 2nd mail server.
*/15 * * * * root /usr/bin/doveadm sync -u "*" remote:mail1 <snip>
Are you guys using LMTP to deliver from your MX server to the mailbox server?
I have a similar setup, but not yet synced, because as I understand it, using 'deliver' to drop mail into an NFS mount won't initiate a sync. I have to migrate my procmail scripts to sieve (and use the execute plugin) and change my final delivery to be a redirect to LMTP. Not sure how replication will work when running old procmail scripts from sieve...
In any case... If you're piping to dovecot's deliver/dovecot-lda, here is a rudimentary LMTP script I hacked together that I planned to use as a replacement for deliver. I'd grab the 'master' mailbox server IP for each user for the command line.
#!/usr/bin/perl
# lmtpsend: read a message on STDIN and deliver it to an LMTP server.
use strict; use warnings;
use Net::LMTP; use Getopt::Std; use Sys::Hostname;
# Defaults; -s, -p and -f override them. -u is accepted but not used yet.
my %opts = (s => "localhost", p => "24", f => 'root@' . hostname());
getopts("hs:p:f:u:", \%opts);

if ($opts{'h'}) {
print "
lmtpsend [-s lmtpserver] [-f fromaddress] [-u subject] toaddress [...]

lmtpsend will send an email from the commandline.

Options:
  -s lmtpserver   Sets the lmtpserver for where to send the mail through.
  -f fromaddress  Sets the email address to be used on the From: line.
  -u subject      Sets the email subject to be used from the Subject line.
  toaddress       Where you want the email sent to.
";
exit;
}

die "no recipients to send mail to\n" if ($#ARGV < 0);
my @emailbody = <STDIN>;

# send the message
my $message = Net::LMTP->new($opts{'s'}, $opts{'p'}) || die "can't talk to server $opts{'s'}\n";
$message->mail($opts{'f'});
$message->to(@ARGV) || die "failed to send to the recipients " . join(",", @ARGV) . ": $!";
$message->data();
$message->datasend("To: " . join(", ", @ARGV) . "\n");
$message->datasend(@emailbody);
$message->dataend();
$message->quit;
Rick
On 11 Jul 2016, at 17:36, Rick Romero rick@havokmon.com wrote:
Are you guys using LMTP to deliver from your MX server to the mailbox server?
Local delivery on the destination server is LMTP but the transport between MX and destination server is just plain SMTP.
I could try and revert to dovecot-lda and see what that does?
Cheers remko
Quoting Remko Lodder remko@freebsd.org:
On 11 Jul 2016, at 17:36, Rick Romero rick@havokmon.com wrote:
Are you guys using LMTP to deliver from your MX server to the mailbox server?
Local delivery on the destination server is LMTP but the transport between MX and destination server is just plain SMTP.
I could try and revert to dovecot-lda and see what that does?
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Out of curiosity, why do you use SMTP from the MX to the destination server instead of LMTP?
Hi Rick,
Local delivery on the destination server is LMTP but the transport between MX and destination server is just plain SMTP.
I could try and revert to dovecot-lda and see what that does?
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Out of curiousity, why do you use SMTP from the MX to the destination server instead of LMTP?
It was already set up that way :-). I do not see a direct reason to change it, but I will test it at some point :)
On Monday, July 11, 2016 10:53:05 AM Rick Romero wrote:
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Where did you come across that requirement? I do not recall that.
Out of curiousity, why do you use SMTP from the MX to the destination server instead of LMTP?
My reason is that qmail does not support it. I am not sure if I will migrate to exim or postfix. It seems others have inquired about LMTP with qmail, so there might be something out there.
-- William L. Thomson Jr. Obsidian-Studios, Inc. http://www.obsidian-studios.com
Quoting "William L. Thomson Jr." wlt-ml@o-sinc.com:
On Monday, July 11, 2016 10:53:05 AM Rick Romero wrote:
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Where did you come across that requirement? I do not recall that.
Hmmm I can't seem to find any reference to it. Maybe it was from the old blog -
http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html
My understanding/assumption is that LDA delivers and updates indexes. I assume using LMTP delivers, updates indexes and kicks off a quick sync.
Out of curiosity, why do you use SMTP from the MX to the destination server instead of LMTP?
My reason is because qmail does not support that. I am not sure if I will migrate to exim or postfix. Seems others have inquired about LMTP with qmail, might be something out there.
I use qmail as well - that's why I wrote/hacked the LMTP script :) Basically, my last step (if no .qmail exists) is 'pipe to dovecot deliver' - I need to change that to 'pipe to this LMTP script'. The script allows you to specify a hostname to deliver to, so that you can dynamically deliver to the primary server for each user, assuming you're already doing that with a director instance.
So the theory goes. I've been chipping away at pieces of this for years, and I want to get all my data replicated before I actually start testing again. The LMTP script is 6 months old and I haven't done anything beyond basic testing with it yet :/
If it weren't for all the procmail stuff I've put in over the years I'd already be done. *sigh*
Rick
On Monday, July 11, 2016 12:46:50 PM Rick Romero wrote:
Quoting "William L. Thomson Jr." wlt-ml@o-sinc.com:
On Monday, July 11, 2016 10:53:05 AM Rick Romero wrote:
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Where did you come across that requirement? I do not recall that.
Hmmm I can't seem to find any reference to it. Maybe it was from the old blog - http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html My understanding/assumption is that LDA delivers and updates indexes. I assume using LMTP delivers, updates indexes and kicks off a quick sync.
That is what I read as well, when others said NFS would not work. I seem to have missed the part you mentioned. Likely skimmed rather than read over a cup of tea... Maybe that is why I have syncing issues. I will go back and reread. I could also change one end to not be on NFS and see if that helps; not sure I can do both to really rule that in or out. I likely need to change both to be 100% sure, unless one side being on NFS helps show the problem.
I use qmail as well - that's why I wrote/hacked the LMTP script :)
That is good to know, I might play around with it in that case. Not to mention that there are others still using qmail. With the patch for IPv6, I am not sure I really need to replace qmail. I have ASSP in front of qmail, and it tends to do more of the modern things qmail does not. But that's OT for this list.
Basically, my last step (if no .qmail exists) is 'pipe to dovecot deliver' - I need to change that to 'pipe to this LMTP script'. The script allows you to specify a hostname to deliver to, so that you can dynamically deliver to the primary server for each user, assuming you're already doing that with a director instance.
I do not have a primary server, as I want both to be the same so it does not matter which is used, or if either has an issue and goes away. But I might be able to achieve the same by setting a primary. Having a primary would likely fix most syncing issues, but mostly by having users check email on the same server it is arriving on. Other syncing issues might still remain.
So the theory is. I've been hitting on pieces of this for years, and I want to get all my data replicated before actually I start testing again.. The LMTP script is 6 months old and I haven't done anything beyond basic testing with it yet :/
I have my replication setup in production, and for the most part there are no problems apart from delayed emails at times and duplicates at others. As long as I can run the manual sync command, it band-aids the replication problems.
If it weren't for all the procmail stuff I've put in over the years I'd already be done. *sigh*
I never got hooked on that, but I do know it's quite powerful. I likely need to keep more, and do more, server side. I pull most mail to the client and filter there. Though procmail can do more than filter.
-- William L. Thomson Jr. Obsidian-Studios, Inc. http://www.obsidian-studios.com
Quoting Rick Romero rick@havokmon.com:
Quoting "William L. Thomson Jr." wlt-ml@o-sinc.com:
On Monday, July 11, 2016 10:53:05 AM Rick Romero wrote:
I don't think that'll help. From what I understand, LMTP is required for replication on delivery.
Where did you come across that requirement? I do not recall that.
Hmmm I can't seem to find any reference to it. Maybe it was from the old blog -
http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html
My understanding/assumption is that LDA delivers and updates indexes. I assume using LMTP delivers, updates indexes and kicks off a quick sync.
Out of curiosity, why do you use SMTP from the MX to the destination server instead of LMTP?
My reason is because qmail does not support that. I am not sure if I will migrate to exim or postfix. Seems others have inquired about LMTP with qmail, might be something out there.
I use qmail as well - that's why I wrote/hacked the LMTP script :)
I've gotten replication working with both LDA and LMTP, though I believe the LDA replication I've seen actually comes from the IMAP notify/replication plugin. I've only done some preliminary testing. LMTP replication is immediate when I use my LMTP perl script instead of procmail or vdelivermail. Otherwise, as I said about LDA above, I'm not sure how dovecot would know the indexes changed from the front-end NFS-mounted MX.
In any case, what I got stuck on was mail_plugins. The Replication page seems to refer only to the global plugin settings, but you also have to add the plugins to each protocol block. That finally worked.
protocol imap {
imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
mail_max_userip_connections = 25
mail_plugins = " quota zlib stats notify replication imap_zlib quota imap_quota NOTIFY REPLICATION"
}
protocol pop3 {
mail_max_userip_connections = 25
mail_plugins = quota NOTIFY REPLICATION
pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
pop3_uidl_format = %08Xu%08Xv
}
protocol lda {
mail_plugins = sieve REPLICATION NOTIFY
userdb {
args = /usr/local/etc/dovecot/dovecot-sql.conf
driver = sql
name =
}
}
protocol lmtp {
info_log_path = /var/log/dovecot-lmtp.log
mail_plugins = sieve quota REPLICATION NOTIFY
userdb {
args = /usr/local/etc/dovecot/dovecot-sql.conf
driver = sql
name =
}
}
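The point about per-protocol mail_plugins can be checked mechanically. As far as I know, a mail_plugins line inside a protocol block replaces the global value unless it includes $mail_plugins, so a block that lists only its own plugins silently drops notify and replication. The sketch below is hypothetical (the function name `protocols_missing_replication` and the sample text are made up for the example); it scans `doveconf -n` style text and reports protocol blocks whose own mail_plugins line lacks either plugin. Names are compared case-insensitively so that the upper-case spelling in the config above is accepted, which is a leniency of this sketch and not a statement about how Dovecot treats case.

```python
import re

REQUIRED = {"notify", "replication"}

# Hypothetical sample; not taken from anyone's real configuration.
SAMPLE = """\
mail_plugins = " quota notify replication"
protocol imap {
  mail_plugins = " quota notify replication imap_quota"
}
protocol lmtp {
  mail_plugins = sieve quota
  userdb {
    driver = sql
  }
}
protocol lda {
  mail_plugins = $mail_plugins sieve
}
"""


def protocols_missing_replication(conf):
    """Map protocol name -> required plugins absent from its mail_plugins."""
    missing = {}
    protocol = None  # name of the protocol block being scanned, if any
    depth = 0        # brace nesting depth inside that block
    for raw in conf.splitlines():
        line = raw.strip()
        opened = re.fullmatch(r"protocol\s+(\S+)\s*\{", line)
        if opened and depth == 0:
            protocol, depth = opened.group(1), 1
            continue
        if protocol is None:
            continue  # global settings are not inspected here
        if line.endswith("{"):
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 0:
                protocol = None
        elif depth == 1 and line.startswith("mail_plugins"):
            value = line.partition("=")[2].strip().strip('"')
            plugins = {p.lower() for p in value.split()}
            # $mail_plugins pulls in the global list, so nothing is lost.
            if "$mail_plugins" not in plugins:
                absent = REQUIRED - plugins
                if absent:
                    missing[protocol] = absent
    return missing


print(protocols_missing_replication(SAMPLE))
```

In the sample only the lmtp block is reported. The configuration posted at the start of the thread has no protocol blocks at all, so its global mail_plugins line should apply to every protocol and this particular pitfall would not seem to explain the delay seen there.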
Rick
On Monday, July 11, 2016 10:36:13 AM Rick Romero wrote:
Are you guys using LMTP to deliver from your MX server to the mailbox server?
I am not at this time. My use of NFS is for other reasons. All services (SMTP, POP, and IMAP) are on the same system.
I have not tried it without NFS to see whether that is causing the problem. I do not believe it to be an NFS issue, but it might be.
-- William L. Thomson Jr. Obsidian-Studios, Inc. http://www.obsidian-studios.com
On 11 Jul 2016, at 17:21, William L. Thomson Jr. wlt-ml@o-sinc.com wrote:
You are not alone!
Hello,
Now that’s a relief!
One of the things that I described and observed is that serverB does not seem to see the email (or at least nothing connects an email being received and stored on the mail server with the services noticing it and notifying the other end). With tcpdump there is no traffic at all until there is a sync the other way around.
As said, both systems are identical in hardware setup and use puppet to obtain their configuration, which is the same for both hosts (except the IP addresses and hostname).
But since there are at least two of us, we might have better luck getting some help with this. I currently have no idea where to look or how to investigate this properly.
Any pointers from the list are welcome!
Cheers Remko
On Monday, July 11, 2016 05:38:20 PM Remko Lodder wrote:
On 11 Jul 2016, at 17:21, William L. Thomson Jr. wlt-ml@o-sinc.com wrote:
You are not alone!
Hello,
Now that’s a relief!
Maybe if I had a solution, but I guess knowing others suffer the same can be relieving.
One of the things that I described and observed is that it seems that serverB is not seeing the email (or at least there is no connection that when an email is send and stored on the mailserver that the services see them and notify the other end). With tcpdump there is no traffic at all, until there is a sync the other way around.
I really have not had a chance to debug this. I was under the impression one side thought it had synced, since both sides tend to show a fast sync, but it's the full sync I have been curious about. When I run the manual command, it seems to do a full sync. It is also not clear to me whether emails are supposed to be on both servers or whether one holds references to emails on the other. When I do a manual sync via the command line, it seems to make both have the same emails, but with different file names.
I think the manual syncing triggers another issue, with duplicate emails. Someone else started a topic on duplicate emails from dsync, which I suffer from as well when I try to force syncing, or at times as a result of syncing. That too I have not had a chance to debug.
As said both systems are identical in hardware setup and use puppet to obtain their configuration, which is the same for both hosts (except the IP adresses and hostname);
Same here, I literally cloned my 2nd one, as both are VMs. I use Ansible to make them identical configuration-wise. The only thing that differs is the data: email that arrives on one or the other.
But since we are with at least two, we might have better luck in getting some help with this. I currently do not have an idea on where to look and how to investigate this properly.
It seems there might have been a few regressions, maybe, or hopefully. Things seemed to get better and/or go away entirely for a month or so after a past update. I commented about that on the list. It seems to have regressed with 2.2.24, though. I haven't upgraded to 2.2.25 yet. That might have other regressions, or maybe fixes; I am not sure.
Any pointers from the list are welcome!
Beyond running the manual sync via the command line, I am not sure at this time. The manual sync via CLI seemed to stop working a few updates back.
Just as I typed that, I went to run the command again so I could get errors to pass along, and it worked. I know I tried to run it the other day and it failed, with something about being unable to look up a UID or switch to the users. I had cron running it every 15 minutes to force things to sync. I stopped when I started getting error emails every time it ran.
I think the error is similar to the use case for the dsync wrapper script for root mentioned on the Replication page. When I get the error, it seems root has a problem changing to another UID, which seems to be what the script handles: wrapping users for root. http://wiki.dovecot.org/Replication
It is just odd that it works sometimes and not others. I thought it stopped working during an update. Now I think it is related to the syncing. Maybe when syncing is not working, running that command will give me the errors. Not sure if it will shed any light on syncing. At least I know it is not related to an update or regression. I will see about reproducing the manual sync errors, and check whether regular syncing is broken at the same time.
Beyond that, I am open to any input from the list as well... Though I need to do my part and try to debug a bit more.
-- William L. Thomson Jr. Obsidian-Studios, Inc. http://www.obsidian-studios.com