Sieve Script Replication Glitches (Report #2)
Hi,
I've observed some odd behaviour with dsync replication between two hosts, specifically to do with sieve script replication.
In short, I have two hosts which replicate in a master-master type setup where almost all of the reads and writes happen to just one of the two hosts.
They are both running 2.2.devel (9dc6403), which is close to the latest 2.2 -git. Pigeonhole is running master-0.4. This is on x86_64 Gentoo.
Normal mail replication between Maildirs for all users works fine; however, it appears that something recently committed to the code has broken sieve script replication between the two. I am sure this did once work. Replication uses a tcps: connection.
Sieve scripts on the lesser-used host are days or weeks out of date compared with the main host, and they don't seem to re-replicate, even if the rules don't exist at all on the replica.
The symptoms and effects look to be the same as this (unanswered) post from December:
http://dovecot.org/list/dovecot/2015-December/102690.html
I am not sure how to view the transaction log files, but I am seeing the same symptoms, i.e. no live replication, and on the lesser-used host almost all the scripts were old and some had the 1970 date on them.
Even after forcing a [dsync replication replicate '*'] the scripts are not replicated. As it stands now there are no sieve scripts on one of the two members and the system seems unable to replicate by itself.
Secondly, I am also seeing some doubled up outputs if I run 'doveadm sieve list -A':
thunderstorm reuben # doveadm sieve list -A
reuben rules ACTIVE
liam rules ACTIVE
kaylene rules ACTIVE
reuben rules ACTIVE
liam rules ACTIVE
kaylene rules ACTIVE
...
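To confirm that entries really are listed twice rather than eyeballing the output, the listing can be saved to a file and checked for repeated lines. The following is only a minimal sketch: the function name `find_duplicate_entries` is hypothetical, and it assumes the saved output has one `user script status` entry per line, which may not match the exact column layout that doveadm prints.

```python
from collections import Counter


def find_duplicate_entries(listing: str) -> dict[str, int]:
    """Return each listing line that occurs more than once, with its count."""
    # Ignore blank lines and surrounding whitespace.
    lines = [line.strip() for line in listing.splitlines() if line.strip()]
    counts = Counter(lines)
    return {line: n for line, n in counts.items() if n > 1}


# Sample input modelled on the output quoted above (one entry per line is an assumption).
sample = """\
reuben rules ACTIVE
liam rules ACTIVE
kaylene rules ACTIVE
reuben rules ACTIVE
liam rules ACTIVE
kaylene rules ACTIVE
"""

for entry, count in find_duplicate_entries(sample).items():
    print(f"{entry!r} listed {count} times")
```

Running the same check on both hosts would show whether the doubling is specific to one of them.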
Has anyone else experienced the replication problem? Are sieve scripts actually replicating in live time for other 2.2.24/2.2.25 users as well? For me I didn't notice this till I went looking so I wonder if other people are experiencing this but just not aware of it yet...?
Reuben
Reuben Farrelly <reuben-dovecot@reub.net> wrote:
I've observed some odd behaviour with dsync replication between two hosts, specifically to do with sieve script replication.
[…]
Has anyone else experienced the replication problem? Are sieve scripts actually replicating in live time for other 2.2.24/2.2.25 users as well? For me I didn't notice this till I went looking so I wonder if other people are experiencing this but just not aware of it yet...?
Welcome to the club: http://dovecot.org/list/dovecot/2016-July/105014.html
If I am not mistaken you are the fourth now reporting this issue. No solution yet.
Regards, Michael
On 7/31/2016 at 4:27 AM, Reuben Farrelly wrote:
Hi,
I've observed some odd behaviour with dsync replication between two hosts, specifically to do with sieve script replication.
[…]
I will look at this more soon.
Regards,
Stephan.
On 1/08/2016 10:01 AM, Stephan Bosch wrote:
On 7/31/2016 at 4:27 AM, Reuben Farrelly wrote:
Hi,
I've observed some odd behaviour with dsync replication between two hosts, specifically to do with sieve script replication.
Has anyone else experienced the replication problem? Are sieve scripts actually replicating in live time for other 2.2.24/2.2.25 users as well? For me I didn't notice this till I went looking so I wonder if other people are experiencing this but just not aware of it yet...?
I will look at this more soon.
Regards,
Stephan.
Some further information.
On the primary host:
thunderstorm home # ls -al */sieve/rules.sieve
-rw------- 1 user1 user1 3570 Jul 31 11:45 user1/sieve/rules.sieve
-rw------- 1 user2 user2 175 Mar 15 2014 user2/sieve/rules.sieve
-rw------- 1 user3 user3 725 Jul 31 09:32 user3/sieve/rules.sieve
-rw------- 1 user4 user4 0 Jan 1 1970 user4/sieve/rules.sieve
-rw------- 1 user5 user5 0 Jan 1 1970 user5/sieve/rules.sieve
-rw-r--r-- 1 user6 user6 3719 Jul 31 11:24 user6/sieve/rules.sieve
thunderstorm home #
On the secondary host:
lightning home # ls -al */sieve/rules.sieve
-rw------- 1 user1 user1 3570 Jan 1 1970 user1/sieve/rules.sieve
-rw------- 1 user2 user2 175 Mar 14 2014 user2/sieve/rules.sieve
-rw------- 1 user3 user3 725 Jul 31 07:32 user3/sieve/rules.sieve
-rw------- 1 user4 user4 0 Jan 1 1970 user4/sieve/rules.sieve
-rw-r--r-- 1 user5 user5 0 Jan 1 1970 user5/sieve/rules.sieve
-rw-r--r-- 1 user6 user6 3719 Jan 1 1970 user6/sieve/rules.sieve
lightning home #
In other words, the rules did eventually get propagated across, and based on the file sizes they are complete.
But there is obviously something amiss with handling of dates (which in turn may relate to how the system determines that the file on each server is up to date or not, I guess). In this case the two systems are in different timezones - the primary is GMT+10 and the secondary GMT+8.
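One way to list the affected scripts without reading `ls` output by eye is to check each file's modification time directly. This is a rough sketch, not anything from Dovecot or Pigeonhole: the function name `scripts_with_epoch_mtime` is hypothetical, and the default pattern simply mirrors the `*/sieve/rules.sieve` layout used in the listings above.

```python
import glob
import os


def scripts_with_epoch_mtime(home_root: str, pattern: str = "*/sieve/rules.sieve") -> list[str]:
    """Return matching script paths whose modification time is Unix time 0."""
    hits = []
    for path in sorted(glob.glob(os.path.join(home_root, pattern))):
        # st_mtime is seconds since the Unix epoch (UTC), so this check
        # gives the same answer regardless of the host's local timezone.
        if int(os.stat(path).st_mtime) == 0:
            hits.append(path)
    return hits
```

Calling `scripts_with_epoch_mtime(".")` from the directory that holds the user homes on each host would give two lists that can be compared. Because the comparison is against the raw epoch value, the GMT+10 versus GMT+8 difference between the hosts does not affect the result.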
Also the status of active users is not always replicated either. On one host the output of 'doveadm sieve list -A' shows my own account as ACTIVE but the other host shows all users - except for my account - as being active, and the sieve script for my account is not being replicated.
The other interesting thing is the output of:
doveadm sieve list -A
While (as I said above) the output of this command is doubled up on the main host, it is not doubled up on the secondary host.
Reuben
On 8/1/2016 at 3:37 AM, Reuben Farrelly wrote:
On 1/08/2016 10:01 AM, Stephan Bosch wrote:
On 7/31/2016 at 4:27 AM, Reuben Farrelly wrote:
[…]
In other words, the rules did eventually get propagated across, and based on the file sizes they are complete.
But there is obviously something amiss with handling of dates (which in turn may relate to how the system determines that the file on each server is up to date or not, I guess). In this case the two systems are in different timezones - the primary is GMT+10 and the secondary GMT+8.
Also the status of active users is not always replicated either. On one host the output of 'doveadm sieve list -A' shows my own account as ACTIVE but the other host shows all users - except for my account - as being active, and the sieve script for my account is not being replicated.
This should fix the file timestamps getting set at unix time_t 0:
https://github.com/dovecot/pigeonhole/commit/af91dd3f2d78da752292dce27f9e76d...
I haven't been able to replicate the situation where this occurs though, since my current replication setup is very simple.
I need to extend my replication setup to test this more thoroughly.
So, please test this at your end first.
Regards,
Stephan.
On 24/08/2016 10:58 AM, Stephan Bosch wrote:
On 8/1/2016 at 3:37 AM, Reuben Farrelly wrote:
In other words, the rules did eventually get propagated across, and based on the file sizes they are complete.
But there is obviously something amiss with handling of dates (which in turn may relate to how the system determines that the file on each server is up to date or not, I guess). In this case the two systems are in different timezones - the primary is GMT+10 and the secondary GMT+8.
Also the status of active users is not always replicated either. On one host the output of 'doveadm sieve list -A' shows my own account as ACTIVE but the other host shows all users - except for my account - as being active, and the sieve script for my account is not being replicated.
This should fix the file timestamps getting set at unix time_t 0:
https://github.com/dovecot/pigeonhole/commit/af91dd3f2d78da752292dce27f9e76d...
I haven't been able to replicate the situation where this occurs though, since my current replication setup is very simple.
I need to extend my replication setup to test this more thoroughly.
So, please test this at your end first.
Regards,
Stephan.
Thanks Stephan. I have re-tested and the dates now all look to be correct on the replicated scripts. We can cross that off as fixed now.
There is still a problem with the scripts not being replicated though between replicated hosts. They do eventually catch up many hours later. I don't know what the trigger is for them updating but it's not triggered by delivery attempts (as every time a delivery was attempted the secondary complained about the missing sieve script).
Thanks, Reuben
Hey guys,
I was gonna report this issue too. New script FILES get replicated right away but changes to an existing file are only replicated with a full sync (looks like this is every 24h by default).
My assumption is this happens bc there’s no index file for sieve scripts.
Cheers, Jean-Luc
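To see which existing scripts are stale on the replica before the next full sync catches up, the script contents on the two sides can be compared by checksum. The sketch below is only an illustration under stated assumptions: both trees must be readable from one machine (for example, a copy of the replica's home directories placed in a scratch directory), and the names `file_digest` and `diverged_scripts` are hypothetical helpers, not part of Dovecot.

```python
import hashlib
import os


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def diverged_scripts(primary_root: str, replica_root: str, suffix: str = ".sieve") -> list[str]:
    """Return relative paths of scripts that are missing or different under replica_root."""
    diverged = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, primary_root)
            dst = os.path.join(replica_root, rel)
            # A script counts as diverged if the replica lacks it or its content differs.
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                diverged.append(rel)
    return sorted(diverged)
```

Comparing content rather than timestamps avoids being misled by the mtime problem discussed earlier in the thread.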
On 8-9-2016 at 0:40, Jean-Luc Wasmer wrote:
Hey guys,
I was gonna report this issue too. New script FILES get replicated right away but changes to an existing file are only replicated with a full sync (looks like this is every 24h by default).
My assumption is this happens bc there’s no index file for sieve scripts.
Looking at this is on my list. Will do that soon.
Regards,
Stephan.
Hi,
Could you guys send us your current configuration (output from dovecot -n)?
First of all, we would like to compare it to a configuration-related problem we've seen in the wild. That is a bit of a long shot. That issue revolves around the target username not always being the same for one physical user (e.g. when aliases are involved).
In any case, the configuration may be useful for reproducing the problem at our end.
Regards,
Stephan.
On 7/31/2016 at 4:27 AM, Reuben Farrelly wrote:
Hi,
I've observed some odd behaviour with dsync replication between two hosts, specifically to do with sieve script replication.
[…]
Even after forcing a [dsync replication replicate '*'] the scripts are not replicated. As it stands now there are no sieve scripts on one of the two members and the system seems unable to replicate by itself.
The following bugs were fixed recently:
https://github.com/dovecot/core/commit/b4adb461ce12bf578d2d70806b205cf3cbf1a...
https://github.com/dovecot/core/commit/27ccbb0f36e07141785db94557afb63a2aa9e...
I wonder whether this also applies to your problem.
Regards,
Stephan.
participants (4)
- Jean-Luc Wasmer
- Michael Grimm
- Reuben Farrelly
- Stephan Bosch