Re: Monitoring Dovecot Replication

MK dovecot-ml at mk.de
Fri Feb 19 09:49:12 EET 2021


Hello David,

that's exactly what I want to know. Building a script to check this is not the problem.
For my first check I started with "doveadm replicator status", searched for "Waiting 'failed' requests", and raised
a failure if the value was > 0. But with this in my monitoring I get a lot of alarms that are cleared at the next poll.
For example: OpenNMS polls this NRPE check, which looks at the value described. If there are one or more "Waiting 'failed' requests",
it raises an alarm. Five minutes later (the next OpenNMS poll) the "Waiting 'failed' requests" are back at 0 because Dovecot has fixed
the failed users by itself. So I get a lot of alarms that are cleared 5-10 min after they appear in the monitoring, without
anyone doing anything.
I'm looking for a way to get the users out of the system for whom Dovecot could not resolve a failure by itself,
because that is what I want to alert on, so that I can take a look and fix it.
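
To make the idea concrete, here is a minimal sketch of such a check, not a tested plugin. It remembers when each user was first seen as failed and only reports users that have stayed failed for longer than 10 minutes. Everything named here is an assumption for illustration: the function names, the JSON state file, and above all the input format (one user per line, username in the first column, a last column that is "y" for failed users and "-" otherwise). Please verify that against the per-user output of your Dovecot version before relying on it. The exit codes follow the usual Nagios/NRPE plugin convention (0 = OK, 2 = CRITICAL).

```python
import json
import time
from pathlib import Path

THRESHOLD_SECONDS = 10 * 60  # only alert on failures older than 10 minutes


def parse_failed_users(status_output):
    """Return the set of users whose last column is 'y'.

    ASSUMPTION: one user per line, username first, and a trailing
    'failed' column that is 'y' for failed users and '-' otherwise.
    A header line is ignored automatically because it does not end in 'y'.
    """
    failed = set()
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-1] == "y":
            failed.add(fields[0])
    return failed


def update_first_seen(first_seen, failed_now, now):
    """Keep the old timestamp for users that are still failed, add newly
    failed users with the current time, and forget recovered users."""
    return {user: first_seen.get(user, now) for user in failed_now}


def stuck_users(first_seen, now, threshold=THRESHOLD_SECONDS):
    """Users that have been failed for at least `threshold` seconds."""
    return sorted(u for u, t in first_seen.items() if now - t >= threshold)


def run_check(status_output, state_file, now=None):
    """Return (exit_code, message) for one poll."""
    now = time.time() if now is None else now
    path = Path(state_file)
    previous = json.loads(path.read_text()) if path.exists() else {}
    current = update_first_seen(previous, parse_failed_users(status_output), now)
    path.write_text(json.dumps(current))  # state for the next poll
    stuck = stuck_users(current, now)
    if stuck:
        return 2, "CRITICAL: replication failed for more than 10 min: " + ", ".join(stuck)
    return 0, "OK: %d user(s) currently failed, none past the threshold" % len(current)
```

With a 5 min poll interval, a user first seen as failed at poll 1 is still below the threshold at poll 2 and is reported at poll 3 if the failure has not cleared by then. A small wrapper would pass the per-user status text to run_check(), print the message, and exit with the returned code.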

Regards,
Oliver

-----Original Message-----
From: dovecot [mailto:dovecot-bounces at dovecot.org] On Behalf Of David Morsberger
Sent: Thursday, 18 February 2021 23:17
To: MK
Cc: dovecot at dovecot.org
Subject: Re: Monitoring Dovecot Replication

Oliver,

What’s your observable event that indicates replication has failed or is behind? Log message? Different file checksums?

David 

> On Feb 18, 2021, at 10:54 AM, MK <dovecot-ml at mk.de> wrote:
> 
> Hello Andrea,
> 
> thanks for sharing your script with the community.
> 
> But I think your script does not solve my problem. I have already tried monitoring failed replication with the output of
> "doveadm replicator status". In my opinion there is nothing in this output, or in any other status output I found, that shows me the
> users that have been failed for a longer time and where the replication process does not resolve the failure by itself.
> I'm looking for something that raises an alarm if Dovecot could not fix a replication by itself
> after > 10 min. In my experience most replication failures are fixed by Dovecot automatically
> in under 10 min, because Dovecot starts another attempt every 5 min.
> Or do you have logic outside this script, maybe in Check_MK, that knows when a user has been out of replication
> for more than 10 min, or something like that? So far I don't understand how this works for you as a way of monitoring
> replication.
> 
> To help you understand my side better: we are using OpenNMS to monitor our servers, and in this case I would use an
> NRPE check on the cluster to monitor this. OpenNMS polls this check every 5 min, and if it returns a failure
> I get an alarm. Maybe this helps a little to understand my problem.
> 
> Regards,
> Oliver
> 
> -----Original Message-----
> From: dovecot [mailto:dovecot-bounces at dovecot.org] On Behalf Of Andrea Gabellini
> Sent: Monday, 15 February 2021 11:04
> To: Steven Varco; dovecot at dovecot.org
> Subject: Re: Monitoring Dovecot Replication
> 
> Hello,
> 
> here is my script. I'm not a professional programmer... ;-)
> 
> Andrea
> 
> On 12/02/21 17:53, Steven Varco wrote:
>> Hi Andrea
>> 
>> It would be great if you could post that here, as I (and possibly others) would also be interested. :)
>> 
>> thanks,
>> Steven
>> 
> 
> -- 
> __________________________
> hAS ANYONE SEEN MY cAPSLOCK KEY?
> __________________________
> 
> TIM San Marino S.p.A.
> Andrea Gabellini
> Engineering R&D
> TIM San Marino S.p.A. - https://www.telecomitalia.sm
> Via Ventotto Luglio, 212 - Piano -2
> 47893 - Borgo Maggiore - Republic of San Marino
> Tel: (+378) 0549 886237
> Fax: (+378) 0549 886188
> 
> 
> 


