[Dovecot] director monitoring?

Kelsey Cummings kgc at corp.sonic.net
Thu Jun 2 22:29:10 EEST 2011


On Thu, Jun 02, 2011 at 10:37:23AM +0200, Cor Bosman wrote:
> We use a setup as seen on http://grab.by/agCb for about 30.000 simultaneous(!) imap connections. 

This might as well be a diagram of my network, although, if I remember
correctly, you're running quite a few more NetApp clusters than I am. ;)

> We have 2 Foundry loadbalancers. They check the health of the directors. We have 3 directors, and each one runs Brandon's poolmon script (https://github.com/brandond/poolmon). This script removes real servers out of the director pool. The dovecot imap servers are monitored with nagios just to tell us when they're down. 

I'm using a hacked-up version of poolmon.  The only important changes
are that it actually logs into the real server rather than just making
a connection to it, that it has heuristics to prevent the real servers
from flapping, and that scan_host now has a timeout, so if a real
server blocks after the connection is established the check won't hang
indefinitely.
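
To make those three changes concrete, here is a rough Python sketch of
the same ideas. It is not the poolmon code (poolmon is a separate script
and none of these names come from it): login_check, FlapDamper and the
threshold values are all hypothetical, and the timeout= argument to
imaplib.IMAP4 assumes Python 3.9 or later.

```python
import imaplib


def login_check(host, user, password, port=143, timeout=10.0):
    """Return True only if a full IMAP login succeeds within `timeout`.

    The timeout is set on the socket, so it applies to the connect and
    to later reads; a server that accepts the connection and then
    blocks cannot hang the check forever.
    """
    try:
        conn = imaplib.IMAP4(host, port, timeout=timeout)  # Python 3.9+
    except (OSError, imaplib.IMAP4.error):
        return False
    try:
        conn.login(user, password)
        ok = True
    except (OSError, imaplib.IMAP4.error):
        ok = False
    try:
        conn.logout()
    except (OSError, imaplib.IMAP4.error):
        pass
    return ok


class FlapDamper:
    """Change a server's state only after several consecutive results
    that contradict the current state (hypothetical thresholds)."""

    def __init__(self, fail_threshold=3, ok_threshold=2):
        self.fail_threshold = fail_threshold
        self.ok_threshold = ok_threshold
        self.up = True
        self._streak = 0  # consecutive results disagreeing with self.up

    def update(self, check_ok):
        """Feed one check result; return the damped up/down state."""
        if check_ok == self.up:
            self._streak = 0
            return self.up
        self._streak += 1
        needed = self.fail_threshold if self.up else self.ok_threshold
        if self._streak >= needed:
            self.up = not self.up
            self._streak = 0
        return self.up
```

In use, each polling pass would call something like
damper.update(login_check(host, user, password)) per real server and
only add or remove the server from the director pool when the returned
state changes, so a single failed or slow login does not cause a flap.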

> This setup has been absolutely rock solid for us. I have not touched the whole system since november and we have not seen any more corruption of meta data, which is the whole reason for the directors.  Kudos to Timo for fixing this difficult problem.

That is always good to hear!

I'd be a lot happier if I were able to monitor the directors and make
sure that they are connected and correctly synced with each other, even
if only as protection against human error rather than anticipated
software failure.
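
As a sketch of what such a check might compare, the Python below takes
each director's view of the backend pool and reports any backend on
which the directors disagree. Everything here is an assumption for
illustration: find_disagreements is a hypothetical name, and the input
mapping is presumed to have been collected separately (for example by
parsing the backend list each director reports), which this sketch does
not attempt.

```python
def find_disagreements(views):
    """Compare per-director views of the backend pool.

    `views` maps a director name to {backend_ip: vhost_count}.
    Returns {backend_ip: {director_name: vhost_count_or_None}} for
    every backend the directors do not all agree on; None means that
    director does not list the backend at all.
    """
    all_backends = set()
    for backends in views.values():
        all_backends.update(backends)

    problems = {}
    for ip in sorted(all_backends):
        seen = {name: view.get(ip) for name, view in views.items()}
        if len(set(seen.values())) > 1:
            problems[ip] = seen
    return problems
```

An empty result means every director lists the same backends with the
same vhost counts; anything else is worth an alert, whether it came
from a software fault or from someone changing only one director by
hand.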

-- 
Kelsey Cummings - kgc at corp.sonic.net      sonic.net, inc.
System Architect                          2260 Apollo Way
707.522.1000                              Santa Rosa, CA 95407

