[Dovecot] Dovecot + DRBD/GFS mailstore
Hi guys,
I'm looking at the possibility of running a pair of servers with Dovecot LDA/imap/pop3 using internal drives with DRBD and GFS (or other clustered FS) for the mail storage and ext3 for the root drive.
I'm currently using maildrop for delivery and Dovecot imap/pop3 with the stores over NFS. I'm looking for better performance but still keeping the HA element I have now with shared storage over NFS.
Has anyone had experience with a setup like the one I'm suggesting? What was performance like with Dovecot using GFS?
Thanks Guy
-- Don't just do something...sit there!
Guy wrote:
[...]
I tested DRBD with OCFS2, which works nicely, and also did some tests with Linux-HA.
Perhaps this helps:
http://www.idimmu.net/2008/01/08/High-availability-with-LVS-using-LVSadmin http://www.drbd.org/users-guide/ch-gfs.html https://blog.devnu11.net/2008/04/ha-mit-debian-lenny-drbd8-ocfs2-heartbeat-p...
I couldn't get GFS to run on Ubuntu (maybe my fault).
I haven't tested a Dovecot and Postfix setup in the HA/load-balancing cluster yet (but OCFS2 was reported to work nicely with Dovecot deliver, as far as I remember). I only used an Apache HTTP service for testing, but that worked nicely in a VMware test environment with 4 servers (2 redundant load balancers and 2 HA servers).
-- Best Regards
MfG Robert Schetterer
Germany/Munich/Bavaria
To update an old thread...
I'm looking at the possibility of running a pair of servers with Dovecot LDA/imap/pop3 using internal drives with DRBD and GFS (or other clustered FS) for the mail storage and ext3 for the root drive.
I'm in testing right now with this setup. Two Dell PE 2900 servers (quad core @ 2.3 GHz, 8 GB RAM, RAID 10 for the GFS+DRBD disk, RAID 1 for the ext3 disks). Running DRBD as a master/master setup. [...] So far it is early testing. 63 users, but only about 12 of those are "power users". The performance has been really good so far, but as I say, not many users yet.
Well, as of yesterday, I've gone "live" with this setup with about 1K users. We're averaging about 150 to 200 concurrent sessions (higher during certain daytime hours, lower at night, etc.).
It's slightly slower with 1K users than with 63 users (of course), but so far it is proving very stable and reasonably fast.
Most of the time it is performing faster than my old system under similar load, though there are rare "stalls" of webmail IMAP operations (connect, get data, and disconnect session) where it might take about 5 to 10 seconds to complete. I'm thinking it is a locking issue, but I'm not sure. The average time for such a webmail operation is 0 to 2 seconds (which is reasonable, based on the message/mailbox size; we're using mbox here, so we have some 2 GB to 3 GB mbox files with large messages in them, etc.).
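(For reference, and purely as an illustrative sketch rather than a statement of what this cluster actually runs, the Dovecot 1.x settings that govern mbox/index locking look like this; on GFS, fcntl locks are arbitrated by the cluster's distributed lock manager, which is why locking is the usual suspect for stalls like the above:)

    # dovecot.conf (1.x) -- illustrative values, not necessarily this cluster's settings
    lock_method = fcntl                # index file locking; fcntl goes through GFS's DLM
    mbox_read_locks = fcntl            # locks taken when reading an mbox
    mbox_write_locks = dotlock fcntl   # locks taken when writing an mbox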
Anyway, the point is that doing a cluster like this is very reasonable from a cluster/stability point of view. The jury is still out on performance, but I should know soon, since I've now got a "significant" number of users hitting it.
My gut feeling is that there will be some slow connections from time to time, probably due to locking, but that overall it will scale better under load and not die when a spammer attacks us or we otherwise get flooded...
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
Eric,
Thanks for the update ...
How does the system behave when you shut down one server and bring it back later? (Are you using an IP load balancer, Heartbeat, etc.?)
Regards,
Mario Antonio
Eric Jon Rostetter wrote:
[...]
Quoting Mario Antonio <support@webjogger.net>:
How does the system behave when you shut down one server and bring it back later? (Are you using an IP load balancer, Heartbeat, etc.?)
I'm just using RHCS with GFS over DRBD. DRBD and LVM are started by the system (not managed by the cluster), and everything else (including GFS) is managed by RHCS. So there is no load balancer, and nothing external to RHCS like Heartbeat et al. (There is a two-node active/passive firewall cluster in front of these that acts as a traffic director, but it isn't concerned with load balancing, and it is a separate stand-alone cluster from the one running DRBD and GFS.)
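(Concretely, on CentOS 5 that split between "started by init" and "managed by RHCS" looks roughly like the chkconfig settings below; this is a sketch of the idea, not a dump of the actual machines:)

    # started by the system at boot, outside cluster control
    chkconfig drbd on
    # (LVM on top of DRBD is also brought up locally, e.g. via clvmd or boot scripts)
    # cluster infrastructure: membership, fencing, and the service manager
    chkconfig cman on
    chkconfig rgmanager on
    # GFS mounts are defined in cluster.conf and handled by rgmanager,
    # so the stock init script stays off
    chkconfig gfs off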
The DRBD+GFS cluster is a simple 3-node RHCS cluster. Two nodes (mailer1 and mailer2) run DRBD+GFS (active/active), while the third node (webmail1) does not (it has just local ext3 file systems). I may add more nodes in the future if needed, but so far this is sufficient for my needs. The third node is nice because it prevents cluster (not DRBD) split-brain situations and lets me maintain real quorum when I need to reboot a node, etc.
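(A minimal cluster.conf skeleton for that kind of 3-node layout would look something like the following; the node names match the clustat output quoted further down, but everything else here is illustrative:)

    <?xml version="1.0"?>
    <cluster name="mailer" config_version="1">
      <clusternodes>
        <clusternode name="mailer1-hb.localdomain"  nodeid="1"/>
        <clusternode name="mailer2-hb.localdomain"  nodeid="2"/>
        <clusternode name="webmail1-hb.localdomain" nodeid="3"/>
      </clusternodes>
      <!-- with three 1-vote nodes, quorum is 2, so any single node can be down -->
      <!-- fence devices and the rgmanager <rm> service/failover definitions go here -->
    </cluster>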
BTW, they are all running CentOS 5.3 (started on RHEL, moved to CentOS which I actually find easier to use for DRBD/GFS/etc than RHEL).
If I do an orderly shutdown of a node, it all works fine. All services fail over at shutdown to the remaining node without a hitch.
At startup, the services almost always migrate back automatically, and if not, I can migrate them back by hand later. When they don't migrate back automatically, it seems to be because the node was down long enough that DRBD takes a while to resync, which can keep LVM and GFS from starting at boot, which in turn means the services can't migrate back. (I don't have DRBD and LVM under cluster control, so if they don't start at boot, I need to fix them manually.)
If I 'crash' a node (kill the power, reboot it via a hardware STONITH card, etc.), it sometimes doesn't go so smoothly and I need to intervene manually. Often everything comes back up fine, but sometimes DRBD won't come up as primary/primary and I'll need to fix it by hand, or DRBD will come up but LVM or GFS won't (as above). So I sometimes have to fix things manually after a crash.
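(For what it's worth, the usual DRBD 8 hand-fix after a split brain goes roughly like this; "r0" is just a placeholder resource name, and the discard step is run only on the node whose changes you are willing to throw away:)

    # on the node whose data should be discarded (the split-brain "victim"):
    drbdadm secondary r0
    drbdadm -- --discard-my-data connect r0
    # on the surviving node, if it has also dropped the connection:
    drbdadm connect r0
    # once resync finishes, promote both sides again for primary/primary:
    drbdadm primary r0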
But the good news is that in any case (shutdown, crash, etc) the cluster is always up and running, since only one node is down... So my services are always available, though maybe slower when a node isn't participating properly. Not the best situation, but certainly I'm able to live with it.
My main goal was to be able to do orderly shutdowns, and that works great. That way I can update kernels, tweak hardware (e.g., add RAM or upgrade disks), etc. with no real service interruption. So I'm not as worried about the "crash" situation, since it happens so much less often than the orderly shutdown, which was my main concern.
In any case, after many shutdowns and crashes and bad software upgrades and such, I've not lost any data or anything like that. Overall I'm very happy. Sure, I could be a bit happier with the recovery after a crash, but I'm tickled with the way it works the rest of the time, and it is a large improvement over my old setup.
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
Eric Jon Rostetter wrote:
[...]
Great! Is there any good documentation on building RHCS with GFS over DRBD? (Or just the Red Hat web site?) Just curious: which Dovecot version are you using? Which webmail system? Postfix or Exim? And is the user database on MySQL or LDAP?
M.A.
Quoting Mario Antonio <support@webjogger.net>:
Is there any good documentation on building RHCS with GFS over DRBD? (Or just the Red Hat web site?)
I've got my internal docs, which I could be talked into sharing... Other than that, the Red Hat docs and the DRBD docs are the best sources. There's not a lot out there.
Just curious: which Dovecot version are you using? Which webmail system? Postfix or Exim? And is the user database on MySQL or LDAP?
I started testing with 1.1.11 and then moved to 1.1.18, both of which worked fine with no problems noticed. Then we got a new high-level boss who wanted shared folders, so I went to 1.2.4, which is where I am now. I'm not sure the version matters.
My DRBD+GFS layout is:
/cluster_data holds configuration files, etc.
/var/spool/mail holds the mbox inboxes.
/var/dovecot holds Dovecot indexes, control files, ACL files, etc.
/var/log/dovecot holds logs for all mail programs (so I can see the logs for any node from any cluster node).
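(In Dovecot 1.x terms, pointing the indexes and control files at those shared paths is a one-line mail_location; the exact value below is a guess at the layout, not copied from the real config:)

    # dovecot.conf (1.x) -- illustrative, paths assumed from the layout above
    mail_location = mbox:~/mail:INBOX=/var/spool/mail/%u:INDEX=/var/dovecot/index/%u:CONTROL=/var/dovecot/control/%u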
Webmail is Horde/IMP with PostgreSQL, the MTA is MailScanner with Sendmail, and the user database is LDAP (authentication used to go through PAM, but now goes directly to LDAP).
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
Quoting Guy <wyldfury@gmail.com>:
I'm looking at the possibility of running a pair of servers with Dovecot LDA/imap/pop3 using internal drives with DRBD and GFS (or other clustered FS) for the mail storage and ext3 for the root drive.
I'm in testing right now with this setup. Two Dell PE 2900 servers (quad core @ 2.3 GHz, 8 GB RAM, RAID 10 for the GFS+DRBD disk, RAID 1 for the ext3 disks). Running DRBD as a master/master setup.
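(Master/master is dual-primary in DRBD terms, which needs roughly the following in drbd.conf; this is a DRBD 8.x sketch with made-up device, disk, and address values, not the actual resource definition:)

    resource mailstore {
      protocol C;                          # synchronous replication; required for dual-primary
      startup { become-primary-on both; }
      net {
        allow-two-primaries;
        # split-brain policies worth setting explicitly for dual-primary:
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
      }
      on mailer1 { device /dev/drbd0; disk /dev/sda3; address 10.0.0.1:7788; meta-disk internal; }
      on mailer2 { device /dev/drbd0; disk /dev/sda3; address 10.0.0.2:7788; meta-disk internal; }
    }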
I added a third node for webmail (Dell PE 2650), but it doesn't do the DRBD or GFS. It is there mostly to make a 3-node cluster versus 2-node cluster, to avoid split-brain type situations. And of course to do the webmail. :)
Using MailScanner as the MTA, dovecot for pop/imap, mailman for mailing lists, Horde/IMP/etc for webmail. All held together with RHCS on CentOS 5.3.
All services run on only one node at a time, with failover... This may or may not help with GFS lock contention (not for /var/spool/mail, since it is always accessed from both nodes at once, but yes for dovecot indexes since they are only ever accessed on one node at a time, etc). This is probably where performance will really be decided (GFS lock contention).
Cluster Status for mailer @ Mon Aug 24 10:27:12 2009
Member Status: Quorate

Member Name                  ID   Status
mailer1-hb.localdomain        1   Online, rgmanager
mailer2-hb.localdomain        2   Online, Local, rgmanager
webmail1-hb.localdomain       3   Online, rgmanager

Service Name          Owner (Last)              State
service:Apache        mailer1-hb.localdomain    started
service:Dovecot       mailer1-hb.localdomain    started
service:MailMan       mailer2-hb.localdomain    started
service:MailScanner   mailer2-hb.localdomain    started
service:VIP-MAIL      mailer1-hb.localdomain    started
service:VIP-SMTP      mailer2-hb.localdomain    started
service:WebMail       webmail1-hb.localdomain   started
Has anyone had experience with a setup like the one I'm suggesting? What was performance like with Dovecot using GFS?
So far it is early testing: 63 users, but only about 12 of those are "power users". The performance has been really good so far, but as I say, not many users yet.
My GFS is sharing the mail log files (via syslog-ng, what would otherwise be /var/log/maillog), the dovecot index files, the /var/spool/mail/ mbox spool (yes, I use mbox), and "shared" configuration files for the two nodes (mailman data, MailScanner/Sendmail configs, dovecot config, clamav/spamd config, procmail config, apache config, ssl certificates, etc).
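(The syslog-ng side of that is basically just routing the mail facility to a file on the GFS mount, along these lines; the source name and path are assumptions, not taken from the real config:)

    # syslog-ng.conf excerpt -- illustrative
    filter f_mail { facility(mail); };
    destination d_shared_mail { file("/var/log/dovecot/maillog"); };
    log { source(s_sys); filter(f_mail); destination(d_shared_mail); };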
If interested, I can let you know about performance once I know more...
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
Last time I checked, the free version of DRBD only supports 2 nodes. The paid version supports 16 nodes. This doesn't mean that you cannot export the storage via an NFS or SMB/CIFS mount point, only that the DRBD replication will only happen between 2 nodes. If a third node is supported in the free version, it would be for quorum only. They might have changed this since, but I doubt it.
On Aug 24, 2009, at 10:51 AM, Eric Jon Rostetter wrote:
[...]
Quoting Romer Ventura <rventura@h-st.com>:
Last time I checked, the free version of DRBD only supports 2 nodes.
Correct. But RHCS supports more, and works best with an odd number of nodes (to prevent cluster splits, etc.).
The paid version supports 16 nodes.
I think there are some limits on that too... Like two read-write nodes, plus more failover nodes? Not sure though.
This doesn't mean that you cannot export the storage via an NFS or SMB/CIFS mount point, only that
Or, since I'm using GFS, via gnbd or such also.
the DRBD replication will only happen between 2 nodes. If a third node is supported in the free version, it would be for quorum only.
I'm using the third node only with RHCS, not with DRBD. The webmail node needs no actual access to the file storage; it does everything via IMAP calls to Dovecot.
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
participants (5)
- Eric Jon Rostetter
- Guy
- Mario Antonio
- Robert Schetterer
- Romer Ventura