On 1/9/2012 8:08 AM, Sven Hartge wrote:
Stan Hoeppner <stan@hardwarefreak.com> wrote:
The quota for students is 1GiB here. If I provide each of my 4 nodes with 500GiB of storage space, this gives me 2TiB now, which should be sufficient. If a node fills, I increase its storage space. Only if it fills too fast might I have to rebalance users.
That should work.
And I never wanted to place the users based on their current size. I knew this was not going to work because of the reasons you mentioned.
I just want to hash their username and use this as a function to distribute the users, keeping it simple and stupid.
My apologies Sven. I just re-read your first messages and you did mention this method.
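For anyone else following the thread, the idea as I understand it is just a stable hash of the username taken modulo the node count. A rough sketch in Python (the node names and the choice of hash here are my own assumptions for illustration, not Sven's actual setup):

    import hashlib

    NODES = ["mail01", "mail02", "mail03", "mail04"]  # hypothetical node names

    def node_for_user(username):
        # Hash the normalized username with a digest that is stable across
        # hosts and restarts (Python's built-in hash() is not).
        digest = hashlib.md5(username.lower().encode("utf-8")).hexdigest()
        # Map the hash value onto one of the storage nodes.
        return NODES[int(digest, 16) % len(NODES)]

    # The same user always lands on the same node, and a large user
    # population spreads roughly evenly, about 1/4 per node here.

The nice property is that the mapping is stateless, so any frontend can compute it, and if one of the four nodes goes down only the roughly 25% of users hashed to it are affected.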
Yes, I know. But right now, if I lose my one and only mail storage server, all users' mailboxes will be offline until I either a) repair the server, b) move the disks to my identical backup system (or move the backup system to the location of the failed one), or c) start the backup system and lose all mail not rsynced since the last rsync run.
True. 3/4 of users remaining online is much better than none. :)
It is not easy designing a mail system without a SPoF which still performs under load.
And many other systems for that matter.
For example, at one time I had a DRBD (active/passive) setup between the two storage systems. This allowed me to start my standby system without losing (nearly) any mail, but it was awfully slow and sluggish.
Eric Rostetter at the University of Texas at Austin has reported good performance with his twin Dovecot DRBD cluster, though in his case he's doing active/active DRBD with GFS2 on top, so no failover is needed. DRBD is obviously not an option for your current needs.
- You will consume more SAN volumes and LUNs. Most arrays have a fixed number of each. May or may not be an issue.
Not really an issue here. The SAN is exclusive for the VMware cluster, so most LUNs are quite big (1TiB to 2TiB) but there are not many of them.
I figured this wouldn't be a problem. I'm just trying to be thorough, mentioning anything I can think of that might be an issue.
The more I think about your planned architecture, the more it reminds me of a "shared nothing" database cluster: even a relatively small one can outrun a well-tuned mainframe, especially on decision support/data mining workloads (TPC-H).
As long as you're prepared for the extra administration, which you obviously are, this setup will yield better performance than the NFS setup I recommended. Performance may not be quite as good as with 4 physical hosts using local storage, but you haven't mentioned the details of your SAN storage or the current load on it, so I can't say with any certainty. If the controller currently has plenty of spare IOPS, the performance difference should be minimal. And using the SAN allows automatic restart of a VM if a physical node dies.
As with Phil, I'm anxious to see how well it works in production. When you send an update, please CC me directly, as sometimes I don't read all the list mail.
I hope my participation was helpful to you Sven, even if only to a small degree. Best of luck with the implementation.
-- Stan