[Dovecot] Dovecot + DRBD/GFS mailstore

Mario Antonio support at webjogger.net
Sat Sep 26 01:18:35 EEST 2009


Eric Jon Rostetter wrote:
> Quoting Mario Antonio <support at webjogger.net>:
>
>> How does the system behave when you shut down one server and bring it
>> back later? (Are you using an IP load balancer/heartbeat etc.?)
>
> I'm just using RHCS with GFS over DRBD.  DRBD and LVM are started by
> the system (not managed by the cluster) and everything else (including
> GFS) is managed by RHCS.  So there is no load balancer, and nothing
> external to RHCS like heartbeat et al. (There is a two-node active/passive
> firewall cluster in front of these that acts as a traffic director, but it
> isn't concerned with load balancing, and is a separate stand-alone cluster
> from the one running DRBD and GFS.)
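
For anyone following along: a DRBD resource stanza for this kind of
active/active (primary/primary) setup looks roughly like the sketch below,
using DRBD 8.x syntax. The resource name, backing disks, and addresses are
made up for illustration; only the node names (mailer1/mailer2) come from
the setup described above.

```
# /etc/drbd.conf -- illustrative sketch, not Eric's actual config
resource r0 {
  protocol C;                       # synchronous replication
  startup {
    become-primary-on both;         # needed for active/active
    wfc-timeout 120;
  }
  net {
    allow-two-primaries;            # required for GFS on both nodes
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;       # two primaries diverged: admin decides
  }
  on mailer1 {
    device    /dev/drbd0;
    disk      /dev/sda3;            # assumed backing device
    address   192.168.1.1:7788;     # assumed replication address
    meta-disk internal;
  }
  on mailer2 {
    device    /dev/drbd0;
    disk      /dev/sda3;
    address   192.168.1.2:7788;
    meta-disk internal;
  }
}
```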
>
> The DRBD+GFS cluster is a simple 3 node RHCS cluster.  Two nodes (mailer1
> and mailer2) run DRBD+GFS (active/active), while the 3rd node (webmail1)
> does not (just local ext3 file systems).  I may add more nodes in the
> future if needed, but so far this is sufficient for my needs.  The third
> node is nice as it prevents cluster (not DRBD) split-brain situations, and
> allows me to maintain real quorum when I need to reboot a node, etc.
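
The three-node layout above maps onto a RHCS /etc/cluster/cluster.conf
along these lines. The cluster name, device, and mountpoint are
illustrative, and fencing is omitted for brevity; only the node names come
from the description above.

```xml
<?xml version="1.0"?>
<cluster name="mailcluster" config_version="1">
  <clusternodes>
    <clusternode name="mailer1" nodeid="1" votes="1"/>
    <clusternode name="mailer2" nodeid="2" votes="1"/>
    <clusternode name="webmail1" nodeid="3" votes="1"/>
  </clusternodes>
  <cman/>
  <!-- fencedevices and per-node fence sections omitted for brevity -->
  <rm>
    <service name="mail" autostart="1">
      <clusterfs name="mailstore" fstype="gfs"
                 device="/dev/vg_mail/lv_mail" mountpoint="/var/mail"/>
    </service>
  </rm>
</cluster>
```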
>
> BTW, they are all running CentOS 5.3 (started on RHEL, moved to CentOS
> which I actually find easier to use for DRBD/GFS/etc than RHEL).
>
> If I do an orderly shutdown of the node, it all works fine.  All
> services fail-over at the shutdown to the remaining node without a hitch.
>
> At startup, they almost always migrate back automatically, and if not I
> can migrate them back later by hand.  The reason they don't always migrate
> back at startup seems to be that if the node is down too long, then DRBD
> takes a while to sync back up, and this can prevent LVM and GFS from
> starting at boot, which means of course the services can't migrate back.
> (I don't have DRBD and LVM under cluster control, so if they don't start
> at boot, I need to manually fix them.)
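
When DRBD, LVM, and GFS fail to start at boot like this, the manual
recovery on the affected node is typically along the following lines. The
resource name r0 is an assumption, and <service> is a placeholder for the
actual rgmanager service name.

```shell
cat /proc/drbd                  # check connect/sync state first
drbdadm up r0                   # attach the disk and connect to the peer
drbdadm primary r0              # promote once DRBD allows it
vgchange -ay                    # activate the LVM volumes on top of DRBD
service gfs start               # mount the GFS filesystems from /etc/fstab
clusvcadm -r <service> -m mailer1   # relocate services back by hand
```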
>
> If I 'crash' a node (kill the power, reboot it via a hardware stonith
> card, etc.) sometimes it doesn't go so smoothly and I need to manually
> intervene.  Often it will all come up fine, but sometimes DRBD won't come
> up as primary/primary, and I'll need to fix it by hand.  Or sometimes DRBD
> will come up, but the LVM or GFS won't (like above).  So often I have to
> manually fix things.
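
If DRBD refuses to come back primary/primary after a crash, the usual
hand-fix (again assuming a resource named r0 and DRBD 8.x syntax) is to
promote the stuck node manually, or, in a genuine data split-brain, to pick
a victim node and discard its changes:

```shell
cat /proc/drbd                # look for Connected and UpToDate/UpToDate
drbdadm primary r0            # promote the node that stayed Secondary

# In a real split-brain, DRBD refuses to reconnect; pick a victim
# node whose changes you are willing to throw away:
drbdadm secondary r0                      # on the victim
drbdadm -- --discard-my-data connect r0   # on the victim
drbdadm connect r0                        # on the survivor, if needed
```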
>
> But the good news is that in any case (shutdown, crash, etc.) the cluster
> is always up and running, since only one node is down...  So my services
> are always available, though maybe slower when a node isn't participating
> properly.  Not the best situation, but certainly I'm able to live with it.
>
> My main goal was to be able to do orderly shutdowns, and that works great.
> That way I can update kernels, tweak hardware (e.g., add RAM or upgrade
> disks), etc. with no real service interruption.  So I'm not as worried
> about the "crash" situation, since it happens so much less often than the
> orderly shutdown, which was my main concern.
>
> In any case, after many shutdowns and crashes and bad software upgrades
> and such, I've not lost any data or anything like that.  Overall I'm
> very happy.  Sure, I could be a bit happier with the recovery after
> a crash, but I'm tickled with the way it works the rest of the time,
> and it is a large improvement over my old setup.
>
>> Regards,
>>
>> Mario Antonio
>
Great!
Any good documentation on building RHCS with GFS over DRBD? (Or just the
Red Hat web site?)
Just curious: which Dovecot version are you using? Which webmail system?
Postfix or Exim? And is the user database in MySQL or LDAP?

M.A.

