I have two Dovecot mail servers that replicate to each other. Sometimes there are delays in the synchronization, and I notice that the mail log has entries like this:
Error: dsync(spokane): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)
Ten minutes seems like a long time to sit there waiting with nothing happening. Is there a way to reduce this timeout so that I don't have so many replication connections just sitting around doing nothing?
(Of course, a way to prevent the I/O stalls would be great too, but with my limited upload bandwidth, they may be unavoidable.)
Andy
I forgot to mention in my original message that I'm running Dovecot 2.3.21 (47349e2482).
It seems like the stalls are more likely to happen when the type of sync is "incremental" rather than "normal" or "full". (I'm inclined to think they only happen for incremental syncs, but I'm not sure.)
Andy
On Friday, January 19, 2024 9:26:29 AM PST, Andy Balholm wrote:
I have two Dovecot mail servers that replicate to each other. ...
Can you try with doveadm -D and send the log?
Aki
On 20/01/2024 19:51 EET Andy Balholm andy@balholm.com wrote:
I forgot to mention in my original message that I'm running Dovecot 2.3.21 (47349e2482).
It seems like the stalls are more likely to happen ...
dovecot mailing list -- dovecot@dovecot.org To unsubscribe send an email to dovecot-leave@dovecot.org
I'm not sure how to do that, because I'm doing automatic replication, not running doveadm sync manually. I tried adding -D to replication_dsync_parameters, but that gave me an error, because the -D was in the wrong place on the command line. (It should be doveadm -D sync, but it was ending up as something like doveadm sync -D.)
Andy
On Sunday, January 21, 2024 7:36:13 AM PST, Aki Tuomi wrote:
Can you try with doveadm -D and send the log?
Aki
You could try running it manually from the CLI:
doveadm -D sync <rest of the options>
Aki
On 22/01/2024 20:32 EET Andy Balholm andy@balholm.com wrote:
I'm not sure how to do that, because I'm doing automatic replication, not running doveadm sync manually. I tried adding -D to replication_dsync_parameters, ...
Is there a way to find out the exact command line that the replicator is using to invoke doveadm sync?
Andy
On Monday, January 22, 2024 10:55:12 AM PST, Aki Tuomi wrote:
You could try running it manually from the CLI:
doveadm -D sync <rest of the options>
Aki
doveconf replication_dsync_parameters
then you can do
doveadm sync -u <username>
Aki
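Putting the suggestions in this thread together, a manual debug run could be assembled roughly as sketched below. This is only a sketch: build_sync_cmd is a hypothetical helper name, user@example.com is a placeholder, and the parameter string "-d -N -l 30 -U" is an assumed example value rather than anything read from a real server. The script only prints the command instead of running it, so the result can be checked before it touches any mailbox.

```shell
#!/bin/sh
# Build (but do not run) a manual dsync command with debug logging enabled.
# build_sync_cmd is a hypothetical helper, not part of Dovecot.
#   $1 = username to sync
#   $2 = value of replication_dsync_parameters
build_sync_cmd() {
    user=$1
    params=$2
    # -D is a global doveadm option, so it goes before the "sync" subcommand,
    # not after it (which is the mistake described earlier in the thread).
    printf 'doveadm -D sync -u %s %s\n' "$user" "$params"
}

# Assumed example value; on a real server it would come from:
#   doveconf -h replication_dsync_parameters
build_sync_cmd "user@example.com" "-d -N -l 30 -U"
```

On a real server the printed command can then be run by hand, with its debug output saved so it can be sent to the list.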
On 22/01/2024 21:05 EET Andy Balholm andy@balholm.com wrote:
Is there a way to find out the exact command line that the replicator is using to invoke doveadm sync?
Andy
That lets me run sync jobs and get verbose output, but I haven't managed to get a manually started sync to stall.
Only a small fraction of the sync jobs stall, even though at any given time there are usually several stalled jobs running, because the stalled jobs take much longer than the ones that complete normally.
Andy
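A rough way to put a number on how often the stalls happen, assuming the mail log is a plain text file, is to count the lines containing the error text quoted at the start of this thread. The helper name count_stalls and the temporary sample file are made up for illustration; the real log path depends on the syslog setup.

```shell
#!/bin/sh
# count_stalls is a hypothetical helper: it prints how many lines in a log
# file contain the dsync stall error.
#   $1 = path to the mail log
count_stalls() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps "set -e" scripts from aborting.
    grep -c 'I/O has stalled' "$1" || true
}

# Demo on a throwaway file: one stall line copied from the thread plus one
# placeholder line that should not be counted.
log=$(mktemp)
printf '%s\n' \
    'Error: dsync(spokane): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)' \
    'placeholder: unrelated log line' > "$log"
count_stalls "$log"
rm -f "$log"
```

Comparing that count with the total number of sync jobs over the same period would show what fraction of them stall.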
On Monday, January 22, 2024 11:22:14 AM PST, Aki Tuomi wrote:
doveconf replication_dsync_parameters
then you can do
doveadm sync -u <username>
Aki
Hi!
We received some more information about this. Are you by chance running these from some scheduler? It seems that if Dovecot is logging, some schedulers can start blocking on log writes to stdout/stderr, which can lead to this problem.
I wonder if this could be the case for you?
Aki
I'm not using any scheduler. It's just being activated by the standard replication-notify mechanism in Dovecot.
Andy
On Thursday, March 28, 2024 12:56:18 AM PDT, Aki Tuomi wrote:
Hi!
We received some more information about this. Are you by chance running these from some scheduler? ...
participants (2)
- Aki Tuomi
- Andy Balholm