Replication not working - GUIDs conflict - will be merged later
Hi everyone,
I have a weird problem with replication that I'm unable to solve.
A new account is synchronized from an external provider via imapsync. The mails end up on my backend1. I see that the folder structure is immediately replicated to backend2.
However, a lot of mails are missing and "doveadm replicator status" also states that something failed:
username          priority  fast sync  full sync  success sync  failed
mail@example.com  none      00:02:36   00:39:47   -             y
There are no error log entries regarding this user at all.
After some research I tried to start the replication manually with "doveadm -D backup -u mail@example.com -d tcp:x.y.z.11" and finally I got an error message:
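Jul 31 13:55:37 doveadm(mail@example.com)<11341>: Debug: brain M: Mailbox INBOX: local=e74590221b6ce6620d29000024583f4e/0/1, remote=20a9ce2b1c6ce662244e0000baba0ddd/0/1: GUIDs conflict - will be merged later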
As a matter of fact, the mails in the inbox are the ones that are missing on backend2.
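To see at a glance which mailboxes are affected, the debug output of the manual "doveadm -D backup" run can be filtered for these conflict lines. The following is a minimal sketch; the function name guid_conflicts is made up for illustration, and the "/0/1" suffixes after each identifier are treated as opaque because the thread does not say what they mean.

```sh
# Read dsync debug output on stdin and print, for every
# "GUIDs conflict" line: <mailbox> <local id> <remote id>
# Lines that do not match the pattern are ignored.
guid_conflicts() {
  sed -n -E 's/.*Mailbox (.+): local=([0-9a-f]+)[^,]*, remote=([0-9a-f]+)[^:]*: GUIDs conflict.*/\1 \2 \3/p'
}

# Possible usage (debug output goes to stderr or the log, depending on setup):
#   doveadm -D backup -u mail@example.com -d tcp:x.y.z.11 2>&1 | guid_conflicts
```

If the two identifiers differ for INBOX only, that matches the observation that only the INBOX mails are missing on backend2.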
I always end up with this error, no matter what I do:
- I completely removed the folder structure on backend2.
- I removed all database entries.
- I removed this user from the replication, ran imapsync again followed by "doveadm -D backup ..."
I also removed fts-flatcurve to rule this out as a cause of error.
I have no idea what to do. This configuration worked for months. I got this error with both 2.3.17.1 and 2.3.19.1.
Any hints would be highly appreciated.
Regards Patrick
# 2.3.19.1 (9b53102964): /usr/local/etc/dovecot/dovecot.conf
# Pigeonhole version 0.5.19 (4eae2f79)
auth_master_user_separator = *
auth_mechanisms = plain login
doveadm_password = # hidden, use -P to show it
doveadm_port = 12345
listen = x.y.z.12
lmtp_save_to_detail_mailbox = yes
log_debug = category=fts-flatcurve
log_path = /var/log/dovecot.log
mail_debug = yes
mail_home = /srv/mail/%Ld/%Ln
mail_location = maildir:~/Maildir
mail_plugins = quota notify replication
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext editheader
metric imapc_traffic {
  fields = bytes_in bytes_out
  filter = event=imap_command_finished
  group_by = user
}
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Spamverdacht {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Trash {
    auto = subscribe
    special_use = \Trash
  }
  prefix = INBOX/
  separator = /
  type = private
}
passdb {
  args = /usr/local/etc/dovecot/master-user
  driver = passwd-file
  master = yes
  pass = yes
}
passdb {
  args = /usr/local/etc/dovecot/dovecot-sql.conf.ext
  driver = sql
}
plugin {
  mail_replica = tcp:x.y.z.11
  quota = count:User quota
  quota_grace = 25M
  quota_rule2 = INBOX/Trash:storage=+100M
  quota_status_nouser = DUNNO
  quota_status_success = DUNNO
  quota_vsizes = yes
  quota_warning = storage=100%% quota-full %u
  quota_warning2 = storage=95%% quota-warning 95 %u
  quota_warning3 = storage=80%% quota-warning 80 %u
  quota_warning4 = -storage=100%% quota-ok %u
  sieve = ~/.dovecot.sieve
  sieve_after = /usr/local/etc/dovecot/sieve/sieve_after.sieve
  sieve_default = /usr/local/etc/dovecot/sieve/default.sieve
  sieve_dir = ~/sieve
  sieve_extensions = +editheader
}
protocols = imap pop3 lmtp sieve
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}
service auth {
  unix_listener auth-userdb {
    group = vmail
    mode = 0666
    user = vmail
  }
}
service config {
  unix_listener config {
    mode = 0600
    user = vmail
  }
}
service doveadm {
  inet_listener {
    port = 12345
  }
}
service imap-login {
  process_min_avail = 2
  service_count = 0
}
service lmtp {
  executable = lmtp -L
  inet_listener lmtp {
    address = x.y.z.12
    port = 24
  }
  process_min_avail = 20
}
service managesieve-login {
  inet_listener sieve {
    port = 4190
  }
}
service quota-full {
  executable = script /usr/local/etc/dovecot/quota_full.sh
  unix_listener quota-full {
    user = vmail
  }
  user = root
}
service quota-ok {
  executable = script /usr/local/etc/dovecot/quota_ok.sh
  unix_listener quota-ok {
    user = vmail
  }
  user = root
}
service quota-status {
  client_limit = 1
  executable = quota-status -p postfix
  inet_listener {
    port = 12340
  }
}
service quota-warning {
  executable = script /usr/local/etc/dovecot/quota_warning.sh
  unix_listener quota-warning {
    user = vmail
  }
  user = root
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
}
ssl = required
ssl_cert =
OK, this is speculation, but I understand the issue at a programming level.
What needs to be understood is that IMAP UIDs and GUIDs are relative to the host server the email is coming from.
This is generally not an issue with replication on Cyrus or Dovecot, because delivery and replication are handled by the same set of servers (i.e. the same UIDs, GUIDs etc. are generated as things happen).
Example of replicated data:
-rw-------  1 vmail  vmail  uarch  185K Jul 29 09:30 1659101404.M875201P20192.mail19.scom.ca,S=189252,W=192431:2,S
-rw-------  1 vmail  vmail  uarch  1.5K Jul 29 09:53 1659102818.M268117P41331.mail18.scom.ca,S=1583,W=1639:2,S
-rw-------  1 vmail  vmail  uarch  1.0M Jul 29 12:52 1659113530.M841469P58214.mail18.scom.ca,S=1095861,W=1113817:2,S
-rw-------  1 vmail  vmail  uarch  210K Jul 29 13:15 1659114913.M958008P31982.mail19.scom.ca,S=215405,W=219216:2,S
You will note that the originating server is in the mail file name (mail19 and mail18 in my case).
This is how Dovecot sorts out the UIDs etc. on the fly (I think).
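As a small illustration of what those file names carry, here is a sketch that splits one into its parts. The function name maildir_parts is made up for this example. The S= and W= fields are the size values Dovecot stores in the name; as far as I know the UID mapping itself lives in the dovecot-uidlist file rather than in the file name, so the host part only shows which server wrote the file.

```sh
# Split a Dovecot-style maildir file name into its parts.
# Prints: <delivery timestamp> <writing host> <S= value> <W= value>
maildir_parts() (
  name=$1
  base=${name%%,*}    # strip everything from the first comma on
  ts=${base%%.*}      # leading Unix timestamp
  rest=${base#*.}     # drop the timestamp field
  host=${rest#*.}     # drop the M...P... field, leaving the host name
  size=$(printf '%s\n' "$name" | sed -n -E 's/.*,S=([0-9]+).*/\1/p')
  vsize=$(printf '%s\n' "$name" | sed -n -E 's/.*,W=([0-9]+).*/\1/p')
  printf '%s %s %s %s\n' "$ts" "$host" "$size" "$vsize"
)
```

Running it over a directory listing on each backend shows quickly which server originally stored each message.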
If I have read this correctly, you are syncing from an external IMAP server that carries its own UIDs, GUIDs etc., which will be different.
Since you say you are using imapsync, I assume you are using the Unix version:
# imapsync
Name:
    imapsync - Email IMAP tool for syncing, copying, migrating and archiving email mailboxes between two imap servers, one way, and without duplicates.
Version:
    This documentation refers to Imapsync $Revision: 1.977 $
If so, look at the option
--useuid:
Use UIDs instead of headers as a criterion to recognize messages. Option --usecache is then implied unless --nousecache is used.
and at --logfile (i.e. write a log file when connecting to the external account); it might help with any errors being generated (run imapsync in debug mode to get full detail).
Basically, --useuid deals with sometimes getting a different UID back from the original server.
I run into this issue more with POP3, as it returns the ID list starting at 1 (for example) instead of the actual UID of the email on the server.
UIDs will force a proper sync (IMAP or POP3), because the server will always return the same UID for a given message, and UIDs increment forward inside the account.
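A sketch of what such an imapsync call could look like is below. The helper only prints the command line so it can be reviewed first. The function name imapsync_cmd, the password file paths and the log file name are placeholders I made up; the options themselves (--useuid, --logfile, --debug, --dry, --passfile1/2) are regular imapsync options.

```sh
# Print (do not run) an imapsync command line with the options discussed above.
# Arguments: <source host> <source user> <target host> <target user>
# Password files and the log file name are hypothetical placeholders.
imapsync_cmd() (
  src_host=$1 src_user=$2 dst_host=$3 dst_user=$4
  printf '%s ' imapsync \
    --host1 "$src_host" --user1 "$src_user" --passfile1 /root/secret1 \
    --host2 "$dst_host" --user2 "$dst_user" --passfile2 /root/secret2 \
    --useuid --logfile "imapsync-${dst_user}.log" --debug --dry
  printf '\n'
)

# Example with made-up host names:
#   imapsync_cmd imap.provider.example mail@example.com backend1.example mail@example.com
```

With --dry imapsync only reports what it would do; drop that option once the output looks right.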
If so, then imapsync should be sorting this out when syncing the IMAP accounts (i.e. creating new UIDs, GUIDs etc.)?
So, assuming the above is happening, the next question is: are you using replication that is fully set up between the two servers, or are you doing manual replication (i.e. running the doveadm command to do the sync)?
(You mentioned using the backup command, which would kind of work, but full replication applies the changes on the fly and should work.)
If you are running manual replication, you should consider going to live replication; it will sort things out as the IMAP folders sync (or it should).
The next thing to consider: some replication issues were fixed in 2.3.19. Are you running the same Dovecot version on both servers?
I handle a ton of email, reporting etc. and find that replication works well on Dovecot 2.3.19 between both of my mail servers, i.e. it does not matter which one receives the email, it gets sorted out. If there is an error, replication will sort it out on the next sync run of the replication process running in the background.
You can set all of the retries etc. for replication in the config files.
The "will be merged later" is probably indicating that Dovecot will sort things out in the background (i.e. a reindex etc.), but that puts extra stress on the server(s). I used to get the merge or duplicate UIDs/GUIDs on Cyrus, and it would try to sort it out on the fly. This would occur when one replicated server was offline and I forced a sync update after bringing it back online: both servers had received emails into the same account from separate sources, so the same UID was assigned to two different messages, one on each server (FYI).
With Cyrus, a rebuild was the only way to sort this out.
Dovecot seems way more resilient in this department.
Again, I expect a full replication setup would sort these issues out, as each server would handle things as they happen and adjust UIDs and GUIDs accordingly.
Happy Sunday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services http://www.scom.ca 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
Replication is fully set up between the two backend servers and worked like a charm for years.
My manual replication was just a desperate attempt.
-- Westenberg + Kueppers GbR Spanische Schanzen 37 ---- Buero Koeln ---- 47495 Rheinberg pwestenberg@wk-serv.de Tel.: +49 (0)2843 90369-06 http://www.wk-serv.de Fax : +49 (0)2843 90369-07 Gesellschafter: Sebastian Kueppers & Patrick Westenberg
Very interesting new insights:
When I use imapsync and let it synchronize mails from INBOX to INBOX/testfolder, the automatic replication works fine. All mails are synchronized between my two backends.
When I move the mails to the INBOX (doveadm move -u mail@example.com INBOX mailbox INBOX/testfolder all), these mails are lost on the replica! They are neither in INBOX, nor in INBOX/testfolder
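One way to confirm exactly which messages are lost is to dump the message GUIDs of a mailbox on each backend and compare the two lists. The sketch below assumes each file holds one GUID per line, for example saved from "doveadm -f tab fetch -u mail@example.com guid mailbox INBOX all" on each server; the function name missing_on_replica and the file names are made up.

```sh
# Print every line of the first file (backend1 list) that does not appear
# in the second file (backend2 list). Works even if the second file is empty.
missing_on_replica() {
  awk -v replica="$2" '
    BEGIN { while ((getline line < replica) > 0) seen[line] = 1 }
    !($0 in seen)
  ' "$1"
}

# Possible usage, after copying both dumps to one machine:
#   missing_on_replica backend1.guids backend2.guids
```

Running this once before and once after the "doveadm move" would show whether the moved messages ever reach backend2 at all.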
Regards Patrick
OK, thanks for the updates.
Long story short, I went through a bunch of replication issues when I was first setting up Dovecot.
Most of them were fixed in 2.3.19, and I have not seen any issues since.
In general, I had to turn on debugging (mail_debug = yes)
and filter syslog by "replication".
Through some work I discovered that any account with more than about 300 physical folders would not replicate (the physical size of the mailbox had nothing to do with it, just the folder count) and would fail without an error. That is why I asked about the version; I believe this was an issue in 2.3.18 and earlier.
When a replication sync failed, the logs did say that replication was requested for <email account>, but the sync would fail without logging why. Replication timeouts were recorded, however.
I ended up patching the C code in the replicator myself to get more detail on how far a replication sync would go (i.e. I added a bunch of logging code to track the issues better).
Maybe look at folder counts? It was only affecting 5 of my customers, but the issue was a pain to find.
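A quick way to check the folder count for one account is to count the lines of a mailbox listing. This is only a sketch: the function name check_folder_count is made up, and the default limit of 300 is just the rough figure from my experience above, not a documented Dovecot limit.

```sh
# Read one mailbox name per line on stdin and report whether the count
# exceeds the given limit (default 300, a rough figure, not a hard rule).
check_folder_count() (
  limit=${1:-300}
  count=$(wc -l | tr -d '[:space:]')
  if [ "$count" -gt "$limit" ]; then
    printf 'WARN %s folders (limit %s)\n' "$count" "$limit"
  else
    printf 'OK %s folders\n' "$count"
  fi
)

# Possible usage:
#   doveadm mailbox list -u mail@example.com | check_folder_count 300
```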
That being said, I had to write some scripts to show user replication and what was outstanding between the servers (I had to run them on both servers separately to accurately get the replication status in both directions).
I now run these scripts in the background every 5 minutes to make sure replication on both sides is in sync.
Also, I found that TCP replication without SSL worked best. TCP over SSL had timing errors; SSL was not required in my case, as the two servers are wired directly together (i.e. no security exposure).
If your two servers are at two different sites, consider a small VPN setup.
I like using GRE for this, as it is tied to static IP addresses on both sides and you can assign a 10.x.x.x (or whatever) address to communicate between the servers.
Also, using the doveadm ssh scripts introduced permission issues between the two servers' file systems (even though they were identical).
Basically, I tried everything!
TCP without SSL just seemed to work best.
sync.status:

doveadm replicator status
echo ' '
doveadm replicator dsync-status | grep -v 'Not connected'

which outputs:

# sync.status
Queued 'sync' requests        0
Queued 'high' requests        0
Queued 'low' requests         0
Queued 'failed' requests      0
Queued 'full resync' requests 0
Waiting 'failed' requests     0
Total number of known users   269
Oh, and another major thing was getting replication to select the users from the database properly.
I use PostgreSQL:
#iterate_query = SELECT user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
iterate_query = SELECT "username" as user, domain FROM email_users WHERE status = True and alias_flag = False
Note that my DB setup uses status = True for an active user and alias_flag = False to exclude alias email redirects inside Postfix (FYI); you can ignore these depending on how your database is set up.
Dovecot is very intelligent: if a mailbox gets activity on one server, it won't replicate to the other server if that mailbox is not returned in the sync users list (but it will set up / activate replication on the server that received the email). This took a bit to figure out as well.
# cat sync.users
doveadm replicator status '*' | grep ' y'

This should only pick up the accounts whose replication has failed, I think.
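The grep above also matches any line that merely contains " y" somewhere. A slightly stricter sketch checks only the last column and prints the commands that would queue a full resync for each failed user. The function names failed_users and requeue_cmds are made up; "doveadm replicator replicate -f" is the regular doveadm subcommand for requesting a full sync.

```sh
# Read `doveadm replicator status '*'` output on stdin and print the
# user names whose last column ("failed") is "y".
failed_users() {
  awk '$NF == "y" { print $1 }'
}

# Same input; print (do not run) one full-resync request per failed user.
requeue_cmds() {
  failed_users | while IFS= read -r user; do
    printf 'doveadm replicator replicate -f %s\n' "$user"
  done
}

# Possible usage:
#   doveadm replicator status '*' | requeue_cmds
```

Printing the commands first makes it easy to review them before piping the output to sh.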
Try running

# doveadm user '*' | wc
     269     269    5244

on both servers. The account count (269 in my case) should be the same on both.
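If the counts differ, comparing the actual user lists shows which accounts one side is missing. A minimal sketch, assuming the output of "doveadm user '*'" from each server has been saved to a file with one user per line; the function name diff_user_lists and the file names are made up.

```sh
# Compare two user lists (one user name per line) and label every name
# that appears in only one of them. Prints nothing if the lists match.
diff_user_lists() (
  a=$(mktemp) && b=$(mktemp) || exit 1
  LC_ALL=C sort -u "$1" > "$a"
  LC_ALL=C sort -u "$2" > "$b"
  # comm -3 prints names unique to the first file unindented and names
  # unique to the second file behind one tab.
  LC_ALL=C comm -3 "$a" "$b" |
    awk -F '\t' '{ if ($1 != "") print "only on first: " $1; else print "only on second: " $2 }'
  rm -f "$a" "$b"
)

# Possible usage:
#   diff_user_lists users.mail18 users.mail19
```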
Here is my dovecot.conf; maybe it will help.
The replication configs (other than the server IP address) must be the same on both sides.
# cat dovecot.conf
# 2.3.14 (cee3cbc0d): /usr/local/etc/dovecot/dovecot.conf
# OS: FreeBSD 12.1-RELEASE amd64
# Hostname: mail18.scom.ca

auth_debug = no
auth_debug_passwords = no

default_process_limit = 16384

mail_debug = no

#lock_method = dotlock
#mail_max_lock_timeout = 300s

#mbox_read_locks = dotlock
#mbox_write_locks = dotlock

mmap_disable = yes
dotlock_use_excl = no
mail_fsync = always
mail_nfs_storage = no
mail_nfs_index = no

auth_mechanisms = plain login
auth_verbose = yes
base_dir = /data/dovecot/run/
debug_log_path = syslog
disable_plaintext_auth = no
dsync_features = empty-header-workaround

info_log_path = syslog
login_greeting = SCOM.CA Internet Services Inc. - Dovecot ready
login_log_format_elements = user=<%u> method=%m rip=%r lip=%l mpid=%e %c

mail_location = maildir:~/

mail_plugins = " virtual notify replication fts fts_lucene "
mail_prefetch_count = 20

protocols = imap pop3 lmtp sieve

protocol lmtp {
  mail_plugins = $mail_plugins sieve
  postmaster_address = monitor@scom.ca
}

service lmtp {
  process_limit=1000
  vsz_limit = 512m
  client_limit=1
  unix_listener /usr/home/postfix.local/private/dovecot-lmtp {
    group = postfix
    mode = 0600
    user = postfix
  }
}

protocol lda {
  mail_plugins = $mail_plugins sieve
}

service lda {
  process_limit=1000
  vsz_limit = 512m
}

service imap {
  process_limit=4096
  vsz_limit = 2g
  client_limit=1
}

service pop3 {
  process_limit=1000
  vsz_limit = 512m
  client_limit=1
}

namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox Trash {
    auto = subscribe
    special_use = \Trash
  }
  prefix =
  separator = /
}

passdb {
  args = /usr/local/etc/dovecot/dovecot-pgsql.conf
  driver = sql
}

doveadm_port = 12345
doveadm_password = secretxxxx

service doveadm {
  process_limit = 0
  process_min_avail = 0
  idle_kill = 0
  client_limit = 1
  user = vmail
  inet_listener {
    port = 12345
  }
}

service config {
  unix_listener config {
    user = vmail
  }
}

dsync_remote_cmd = ssh -l%{login} %{host} doveadm dsync-server -u%u
#dsync_remote_cmd = doveadm sync -d -u%u

replication_dsync_parameters = -d -N -l 300 -U

plugin {
  mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
  mail_log_fields = uid, box, msgid, from, subject, size, vsize, flags
  push_notification_driver = dlog

  sieve = file:~/sieve;active=~/sieve/.dovecot.sieve
  #sieve = ~/.dovecot.sieve
  sieve_duplicate_default_period = 1h
  sieve_duplicate_max_period = 1h
  sieve_extensions = +duplicate +notify +imapflags +vacation-seconds
  sieve_global_dir = /usr/local/etc/dovecot/sieve
  sieve_before = /usr/local/etc/dovecot/sieve/duplicates.sieve

  mail_replica = tcp:10.221.0.19:12345
  #mail_replica = remote:vmail@10.221.0.19
  #replication_sync_timeout = 2

  fts = lucene
  fts_lucene = whitespace_chars=@.
  fts_autoindex = yes
  fts_languages = en
}

#sieve_extensions = vnd.dovecot.duplicate

#sieve_plugins = vnd.dovecot.duplicate

service anvil {
  process_limit = 1
  client_limit=5000
  vsz_limit = 512m
  unix_listener anvil {
    group = vmail
    mode = 0666
  }
}

service indexer-worker {
  vsz_limit = 2g
}

service auth {
  process_limit = 1
  client_limit=5000
  vsz_limit = 1g
  unix_listener auth-userdb {
    mode = 0660
    user = vmail
    group = vmail
  }
  unix_listener /var/spool/postfix/private/auth {
    mode = 0666
  }
}

service stats {
  process_limit = 1000
  vsz_limit = 1g
  unix_listener stats-reader {
    group = vmail
    mode = 0666
  }
  unix_listener stats-writer {
    group = vmail
    mode = 0666
  }
}

userdb {
  args = /usr/local/etc/dovecot/dovecot-pgsql.conf
  driver = sql
}

protocol imap {
  mail_max_userip_connections = 50
  mail_plugins = $mail_plugins notify replication
}

protocol pop3 {
  mail_max_userip_connections = 50
  mail_plugins = $mail_plugins notify replication
}

protocol imaps {
  mail_max_userip_connections = 25
  mail_plugins = $mail_plugins notify replication
}

protocol pop3s {
  mail_max_userip_connections = 25
  mail_plugins = $mail_plugins notify replication
}

service managesieve-login {
  process_limit = 1000
  vsz_limit = 1g
  inet_listener sieve {
    port = 4190
  }
}

verbose_proctitle = yes

replication_max_conns = 100

replication_full_sync_interval = 1d

service replicator {
  client_limit = 0
  drop_priv_before_exec = no
  idle_kill = 4294967295s
  process_limit = 1
  process_min_avail = 0
  service_count = 0
  vsz_limit = 8g
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
  vsz_limit = 8192M
}

service aggregator {
  process_limit = 1000
  #vsz_limit = 1g
  fifo_listener replication-notify-fifo {
    user = vmail
    group = vmail
    mode = 0666
  }
}

service pop3-login {
  process_limit = 1000
  client_limit = 100
  vsz_limit = 512m
}

service imap-urlauth-login {
  process_limit = 1000
  client_limit = 1000
  vsz_limit = 1g
}

service imap-login {
  process_limit=1000
  client_limit = 1000
  vsz_limit = 1g
}

protocol sieve {
  managesieve_implementation_string = Dovecot Pigeonhole
  managesieve_max_line_length = 65536
}

#Addition ssl config
!include sni.conf
# cat dovecot-pgsql.conf
driver = pgsql
connect = host=localhost port=5433 dbname= user= password=
default_pass_scheme = PLAIN
password_query = SELECT username as user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
user_query = SELECT home, uid, gid FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
#iterate_query = SELECT user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
iterate_query = SELECT "username" as user, domain FROM email_users WHERE status = True and alias_flag = False
Please note that the queries above return a full email address (that is how I do it), e.g. paul@scom.ca
# cat sni.conf
#sni.conf
ssl = yes
verbose_ssl = yes
ssl_dh =
#Default *.scom.ca
ssl_key =
local_name .scom.ca {
  ssl_key = /programs/common/getssl.cert -c *.scom.ca -q yes
  ssl_cert = /programs/common/getssl.cert -c *.scom.ca -q yes
  ssl_ca = /programs/common/getssl.cert -c *.scom.ca -q yes
}
local_name mail.clancyca.com {
  ssl_key = /programs/common/getssl.cert -c mail.clancyca.com -q yes
  ssl_cert = /programs/common/getssl.cert -c mail.clancyca.com -q yes
  ssl_ca = /programs/common/getssl.cert -c mail.clancyca.com -q yes
}
local_name mail.paulkudla.net {
  ssl_key = /programs/common/getssl.cert -c mail.paulkudla.net -q yes
  ssl_cert = /programs/common/getssl.cert -c mail.paulkudla.net -q yes
  ssl_ca = /programs/common/getssl.cert -c mail.paulkudla.net -q yes
}
local_name secure.clancyca.com {
  ssl_key = /programs/common/getssl.cert -c secure.clancyca.com -q yes
  ssl_cert = /programs/common/getssl.cert -c secure.clancyca.com -q yes
  ssl_ca = /programs/common/getssl.cert -c secure.clancyca.com -q yes
}
local_name mail.ekst.ca {
  ssl_key = /programs/common/getssl.cert -c mail.ekst.ca -q yes
  ssl_cert = /programs/common/getssl.cert -c mail.ekst.ca -q yes
  ssl_ca = /programs/common/getssl.cert -c mail.ekst.ca -q yes
}
local_name mail.hamletdevelopments.ca {
  ssl_key = /programs/common/getssl.cert -c mail.hamletdevelopments.ca -q yes
  ssl_cert = /programs/common/getssl.cert -c mail.hamletdevelopments.ca -q yes
  ssl_ca = /programs/common/getssl.cert -c mail.hamletdevelopments.ca -q yes
}
Note that the sni.conf above pulls the certs in from a db.
Another thought: are you running duplicate suppression? I am not sure how that would work when using imapsync (i.e. I have to assume that a lot of emails would carry the same info when you run a sync).
Duplicate suppression seems to pick up on job numbers, To, From, etc. to decide whether an email is a duplicate. Maybe this is also an issue.
# cat duplicates.sieve
require "duplicate"; # for dovecot >= 2.2.18
if duplicate { discard; stop; }
Happy Monday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services http://www.scom.ca 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
On 8/1/2022 5:15 AM, Patrick Westenberg wrote:
Very interesting new insights:
When I use imapsync and let it synchronize mails from INBOX to INBOX/testfolder, the automatic replication works fine. All mails are synchronized between my two backends.
When I move the mails to the INBOX (doveadm move -u mail@example.com INBOX mailbox INBOX/testfolder all), these mails are lost on the replica! They are neither in INBOX, nor in INBOX/testfolder
Regards Patrick
Hi,
every now and then I have the same problem on our servers. Currently, I'm running Dovecot 2.3.19.1 as well, but I upgraded directly from 2.3.16 due to other issues with the versions in between.
Last time I observed a de-sync due to a GUID change, it looked as if the user had moved a folder around in their mailbox. And indeed, the output of "doveadm mailbox status -u someuser guid '*'" listed different GUIDs. Dovecot actually logged some errors for this case:
Dovecot log from replica1:
Jul 27 12:06:08 replica1 dovecot[3431]: doveadm(someuser)<10206><s1aFMQ8O4WLeJwAAyQQkNg>: Error: Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes path/to/folder and path/to/folder-temp-1 - giving a new GUID b0053e390f0ee162de270000c9042436 to path/to/folder
Jul 27 12:06:08 replica1 dovecot[3431]: doveadm(someuser)<10208><fgWCCRAO4WLgJwAAyQQkNg>: Error: Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes path/to/folder and path/to/folder-temp-1 - giving a new GUID 5823fe0d100ee162e0270000c9042436 to path/to/folder
Dovecot log from replica2:
Jul 27 12:06:04 replica2 dovecot[47018]: doveadm(someuser)<2239>: Warning: Failed to do incremental sync for mailbox path/to/folder, retry with a full sync (uidnext 1 < 13)
Jul 27 12:06:04 replica2 dovecot[47018]: doveadm(someuser)<2241><ix0uKQwO4WLBCAAADKIhQg>: Error: Duplicate mailbox GUID 0ccaab01079031620e1e00000ca22142 for mailboxes path/to/folder and some/folder - giving a new GUID 78c9dc2c0c0ee162c10800000ca22142 to path/to/folder
At that time, only replica2 was accepting imap connections. In this particular case, Dovecot eventually managed to get things back in sync after way over 24h, but I also had users out of sync for multiple days. Running 'doveadm -Dv sync -u someuser -d' manually gave me the same error message, but didn't change anything.
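For anyone who wants to scan their logs for this pattern, here is a minimal Python sketch that pulls the old GUID, the two mailbox names and the new GUID out of a "Duplicate mailbox GUID" line. The regular expression is modelled only on the log lines quoted in this thread and is an assumption, not a documented Dovecot log format; the function name is made up for illustration.

```python
import re

# Pattern modelled on the "Duplicate mailbox GUID" lines quoted in this thread
# (an assumption; other Dovecot versions may word the message differently).
DUP_GUID_RE = re.compile(
    r"Duplicate mailbox GUID (?P<old>[0-9a-f]{32}) "
    r"for mailboxes (?P<box_a>.+?) and (?P<box_b>.+?) "
    r"- giving a new GUID (?P<new>[0-9a-f]{32}) to (?P<target>.+)$"
)


def parse_duplicate_guid(line):
    """Return a dict describing a duplicate-GUID log line, or None if no match."""
    match = DUP_GUID_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Jul 27 12:06:08 replica1 dovecot[3431]: "
    "doveadm(someuser)<10206><s1aFMQ8O4WLeJwAAyQQkNg>: Error: "
    "Duplicate mailbox GUID 78c9dc2c0c0ee162c10800000ca22142 for mailboxes "
    "path/to/folder and path/to/folder-temp-1 - giving a new GUID "
    "b0053e390f0ee162de270000c9042436 to path/to/folder"
)

info = parse_duplicate_guid(sample)
print(info["old"], "->", info["new"], "on", info["target"])
```

Feeding each log line through parse_duplicate_guid() and collecting the non-None results should give a quick list of folders whose GUID was reassigned.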
Other things I've observed:
- it's not limited to a fixed set of users (unlike the too-many-folders-thing with Dovecot 2.3.1[78])
- it's not limited to newly created users, but also affects users that have been in sync for months/years
- it's not limited to mailboxes with lots of imap operations going on
- it's not specific to very large or very small mailboxes (although I've only seen it for folders with a small number of mails in them)
- in most cases, Dovecot doesn't log any errors
- it does seem to be related to something an imap client can trigger
As of now, my "fix" is to
- make sure that one of the replicas has all mails for that folder (we're using maildir, so I can just rsync the individual mails/folders)
- create a full copy of the complete folder as backup
- remove the user from replication
- 'doveadm mailbox delete' the folder on one replica to get rid of one of the conflicting guids (one time, Dovecot replicated the deletion despite removing the user from replication, so the backup came in handy)
- alternatively, you might be fine by deleting the folder's index files
- add the user back to replication
- let dsync replicate the user -> fixed
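Before deleting anything, it may help to see exactly which folders disagree. The following Python sketch compares two mailbox-to-GUID mappings, one per replica. It assumes you have already collected the mappings yourself (for example from the 'doveadm mailbox status ... guid' output mentioned above); parsing that output is deliberately left out, and the function name and the sample data are hypothetical.

```python
def guid_mismatches(replica_a, replica_b):
    """Return {mailbox: (guid_a, guid_b)} for mailboxes whose GUIDs differ.

    A mailbox that exists on only one side is reported with None
    for the side where it is missing.
    """
    mismatches = {}
    for name in sorted(set(replica_a) | set(replica_b)):
        guid_a = replica_a.get(name)
        guid_b = replica_b.get(name)
        if guid_a != guid_b:
            mismatches[name] = (guid_a, guid_b)
    return mismatches


# Illustrative sample data only; the GUID strings are borrowed from the
# log lines in this thread and do not describe a real mailbox state.
replica1 = {
    "INBOX": "0ccaab01079031620e1e00000ca22142",
    "path/to/folder": "b0053e390f0ee162de270000c9042436",
}
replica2 = {
    "INBOX": "0ccaab01079031620e1e00000ca22142",
    "path/to/folder": "78c9dc2c0c0ee162c10800000ca22142",
}

for mailbox, (guid_a, guid_b) in guid_mismatches(replica1, replica2).items():
    print(mailbox, guid_a, guid_b)
```

Only the folders this reports would need the delete-and-resync treatment; everything else can stay untouched.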
It's not a very convenient way to resolve this, but maybe it helps. Any better solutions are greatly appreciated!
Best Sebastian
OK, I went through this as well a bit.
There is a replication full sync variable (I am having trouble finding it).
24h is the default, but I might have rebuilt Dovecot with this default modified.
After I got things working, I put everything back to the default code.
Yep, I did.
From dovecot-2.3.19/src/replication, see:
aggregator/replicator-connection.c:#define MAX_INBUF_SIZE 1024
aggregator/replicator-connection.c:#define REPLICATOR_MEMBUF_MAX_SIZE 1024*1024
aggregator/replicator-connection.c: conn->queue[i] = buffer_create_dynamic(default_pool, 1024);
Binary file replicator/replicator-brain.o matches
replicator/replicator-settings.c: .replication_full_sync_interval = 60*60*24,
replicator/notify-connection.c:#define MAX_INBUF_SIZE (1024*64)
Binary file replicator/doveadm-connection.o matches
Binary file replicator/.libs/replicator matches
replicator/replicator-brain.c: pool = pool_alloconly_create("replication brain", 1024);
replicator/replicator-queue.c: queue->user_queue = priorityq_init(user_priority_cmp, 1024);
replicator/replicator-queue.c: hash_table_create(&queue->user_hash, default_pool, 1024,
Binary file replicator/notify-connection.o matches
Binary file replicator/dsync-client.o matches
I do not believe there is a settable variable in dovecot.conf?
I could be wrong.
The actual code containing the variable is below; change it and recompile everything, and that should/might help.
replicator/replicator-settings.c: .replication_full_sync_interval = 60*60*24,
Change the 24 to something more practical?
Note that 60*60*24 is math (i.e. the number of seconds between full syncs), so do not change 24 to 24h, for example.
Do this on both servers.
Note that how much stress a full sync interval puts on the server depends on how much physical mail you have in the mbox.
Note that the full resync interval syncs both accounts from scratch.
Also note that 6 hrs is not a bad place to start?
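To make the arithmetic concrete, here is a small Python sketch of what the compiled-in expression works out to and what a 6-hour interval would be in the same unit (seconds). The helper name is made up for illustration. Also worth noting: the config dump earlier in this thread lists replication_full_sync_interval = 1d as a plain setting, so the value can apparently be set in dovecot.conf as well, which would avoid recompiling.

```python
def hours_to_seconds(hours):
    """Convert hours to seconds, the unit used by the 60*60*24 expression."""
    return hours * 60 * 60


# The compiled-in default quoted above: 60*60*24 seconds, i.e. one day.
default_interval = 60 * 60 * 24

# The 6-hour starting point suggested above, expressed in seconds.
six_hour_interval = hours_to_seconds(6)

print(default_interval)   # 86400
print(six_hour_interval)  # 21600
```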
The replicator service will deal with this in the background.
There are also other hard-set variables (like, I believe, 15m for the retry interval after a bad sync?).
You will need to dig through the replicator code to find these.
Happy Tuesday !!! Thanks - paul
Paul Kudla
Hello
Am 02.08.22 um 20:24 schrieb Gerald Galster:
(we're using maildir, so I can just rsync the individual mails/folders)
I'm curious if anybody experienced this issue using mdbox. As far as I remember it's better suited for replication as filenames and location do not change on disk (index only).
Yes, we occasionally got those errors with the mdbox format too. Usually the user had renamed or moved one of the system folders like INBOX, Sent, etc. Dovecot then recreated the folder faster than it replicated the change to the standby machine, so it cannot create the same folder again on the replica.
This will not be fixed automatically. You have to check that all emails are on the original machine, then remove that account from the replica and sync.
Kind regards, Christian Mack
-- Christian Mack Universität Konstanz Kommunikations-, Informations-, Medienzentrum (KIM) Abteilung IT-Dienste Forschung, Lehre, Infrastruktur 78457 Konstanz +49 7531 88-4416
Here are some logs, maybe related to my problem.
Aug 23 20:43:57 Panic: doveadm(mail@example.com)<1355655><rdvZDu0fBWOHrxQAuroN3Q>: file dsync-brain-mailbox.c: line 883 (dsync_brain_slave_recv_mailbox): assertion failed: (memcmp(dsync_box->mailbox_guid, local_dsync_box.mailbox_guid, sizeof(dsync_box->mailbox_guid)) == 0)
Aug 23 20:43:57 Error: doveadm(mail@example.com)<1355655><rdvZDu0fBWOHrxQAuroN3Q>: Raw backtrace: /usr/local/lib/dovecot/libdovecot.so.0(backtrace_append+0x42) [0x7f3c4c30e562] -> /usr/local/lib/dovecot/libdovecot.so.0(backtrace_get+0x1e) [0x7f3c4c30e67e] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x1022db) [0x7f3c4c31b2db] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x102371) [0x7f3c4c31b371] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x55589) [0x7f3c4c26e589] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f680618] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f67de26] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f67e483] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f68fe1f] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x69) [0x7f3c4c331509] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x132) [0x7f3c4c332bf2] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x50) [0x7f3c4c3315b0] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x40) [0x7f3c4c331770] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f662335] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f664ba5] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f665c5a] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f675bb1] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f679ffa] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x69) [0x7f3c4c331509] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x132) [0x7f3c4c332bf2] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x50) [0x7f3c4c3315b0] -> /usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x40) [0x7f3c4c331770] -> /usr/local/lib/dovecot/libdovecot.so.0(master_service_run+0x13) [0x7f3c4c2a4333] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f655162] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f3c4bf30d0a] -> dovecot/doveadm-server 172.17.1.12 mail@example.com slave_recv_mailbox [0x56099f6551ea]
participants (5)
- Christian Mack
- Gerald Galster
- Patrick Westenberg
- Paul Kudla (SCOM.CA Internet Services Inc.)
- Sebastian Marske