Hello,
I have a cluster of two Dovecot nodes with Dovecot replication between them. The setup works fine, and now I'm looking for a way to monitor the users so that I get notified when replication fails for a user for a longer time and I have to trigger the replication manually. Most of the time, when I see a replication failure, Dovecot's self-healing repairs it within 10 minutes at most.
I have tried different ways of querying "doveadm replicator status '*'", searching for failed users, and then raising an alarm if one of fast sync, full sync or success sync exceeds a threshold. But none of these combinations seems to work if I only want to be alerted when I have to fix the replication manually.
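To illustrate the kind of check described above, here is a minimal Python sketch. It is not a confirmed solution. It assumes the six-column output of "doveadm replicator status '*'" (username, priority, fast sync, full sync, success sync, failed), with ages printed as HH:MM:SS and the failed column shown as "y" or "-". The function names and the 10-minute threshold are hypothetical, and the column layout may differ between Dovecot versions.

```python
import subprocess

# Assumed grace period: give Dovecot's self-healing 10 minutes first.
THRESHOLD_SECONDS = 10 * 60


def age_to_seconds(value):
    """Convert an 'HH:MM:SS' age into seconds; return None for '-'."""
    if value == "-":
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def stale_failed_users(status_output, threshold=THRESHOLD_SECONDS):
    """Return users flagged as failed whose last successful sync is
    older than the threshold (or who have never synced successfully)."""
    users = []
    for line in status_output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 6:  # assumed layout; ignore anything else
            continue
        username, _priority, _fast, _full, success, failed = fields
        if failed != "y":
            continue
        age = age_to_seconds(success)
        if age is None or age > threshold:
            users.append(username)
    return users


if __name__ == "__main__":
    output = subprocess.run(
        ["doveadm", "replicator", "status", "*"],
        capture_output=True, text=True, check=True,
    ).stdout
    for user in stale_failed_users(output):
        print(user)
```

The sketch combines the failed flag with the age of the last successful sync, so a failure that is repaired within the grace period would not be reported. Whether this combination actually isolates the cases that need manual intervention is exactly the open question.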
Can someone tell me what I have to query to get only the users whose replication has been failing for a longer time (10+ minutes) and has to be fixed manually?
Thank you.
Oliver