Stan Hoeppner <stan@hardwarefreak.com> wrote:
On 1/8/2012 9:39 AM, Sven Hartge wrote:
Memory size. I am a bit hesitant to deploy a VM with 16GB of RAM. My cluster nodes each have 48GB, so no problem on this side though.
Shouldn't be a problem if you're going to spread the load over 2 to 4 cluster nodes. 16/2 = 8GB per VM, 16/4 = 4GB per Dovecot VM. This, assuming you are able to evenly spread user load.
I think I will be able to do that. If I divide my users by using a hash like MD5 or SHA1 over their username, this should give me an even distribution.
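A minimal sketch of that hashing idea, under stated assumptions: the function names, the "userNNNNN" dummy account names and the node count of 4 are made up for illustration and are not taken from any real setup. It hashes the username with SHA1 and takes the result modulo the number of nodes.

```python
import hashlib


def node_for_user(username: str, num_nodes: int) -> int:
    """Map a username to a node index in [0, num_nodes) using SHA1."""
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    # Interpret the 20-byte digest as one big integer and reduce it.
    return int.from_bytes(digest, "big") % num_nodes


def distribution(usernames, num_nodes: int) -> list:
    """Count how many of the given usernames land on each node."""
    counts = [0] * num_nodes
    for name in usernames:
        counts[node_for_user(name, num_nodes)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical dummy accounts, 10,000 of them as in the thread.
    users = [f"user{i:05d}" for i in range(10_000)]
    print(distribution(users, 4))
```

One caveat with a plain modulo: changing the number of nodes later remaps most users. If the node is computed once at account activation and then stored with the account, the hash only governs initial placement and existing users stay where they are.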
So, this reads like my idea in the first place.
Only you place all the mails on the NFS server, whereas my idea was to just share the shared folders from a central point and keep the normal user dirs local to the different nodes, thus reducing network impact for the way more common user access.
To be quite honest, after thinking this through a bit, many traditional advantages of a single shared mail store start to disappear. Whether you use NFS or a clusterFS, or 'local' disk (RDMs), all IO goes to the same array, so the traditional IO load balancing advantage disappears. The other main advantage, replacing a dead hardware node, simply mapping the LUNs to the new one and booting it up, also disappears due to VMware's unique abilities, including vmotion. Efficient use of storage isn't an issue as you can just as easily slice off a small LUN to each of 2/4 Dovecot VMs as a larger one to the NFS VM.
Yes. Plus I can much more easily increase a LUN's size, if the need arises.
So the only disadvantages I see are with the 'local' disk RDM mailstore location: 'manual' connection/mailbox/size balancing, all increasing administrator burden.
Well, I don't see size balancing as a problem since I can increase the size of the disk for a node very easily.
Load should be fairly even, if I distribute the 10,000 users across the nodes. Even if there is a slight imbalance, the systems should have enough power to smooth that out. I could measure the load every user creates and use that as a distribution key, but I believe this to be a wee bit over-engineered for my scenario.
Initial placement of a new user will be automatic, during the activation of the account, so no administrative burden there.
It seems my initial idea was not so bad after all ;) Now I "just" need to build a little test setup, put some dummy users on it and see if anything bad happens while accessing the shared folders, and how the system reacts should the shared folder server be down.
2.3GHz for most VMware nodes.
How many total cores per VMware node (all sockets)?
8
You got the numbers wrong. And I got a word wrong ;)
Should have read "900GB _of_ 1300GB used".
My bad. I misunderstood.
Here are the memory statistics at 14:30:
             total       used       free     shared    buffers     cached
Mem:         12046      11199        847          0         88       7926
-/+ buffers/cache:       3185       8861
Swap:         5718         10       5707
So not much wiggle room left.
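For reference, the "-/+ buffers/cache" row follows directly from the "Mem:" row. A small sketch of the arithmetic, using the numbers quoted above (the helper names are made up for illustration, and the values are taken to be MiB as printed by "free -m", which is an assumption):

```python
def parse_free_mem_line(line: str) -> dict:
    """Parse the 'Mem:' row of old-style `free` output into a dict."""
    names = ["total", "used", "free", "shared", "buffers", "cached"]
    fields = line.split()
    # fields[0] is the 'Mem:' label; the six numbers follow it.
    return dict(zip(names, (int(value) for value in fields[1:7])))


def without_buffers_cache(mem: dict) -> tuple:
    """Return (used, free) with buffers and page cache counted as free."""
    reclaimable = mem["buffers"] + mem["cached"]
    return mem["used"] - reclaimable, mem["free"] + reclaimable


mem = parse_free_mem_line("Mem: 12046 11199 847 0 88 7926")
used_by_apps, free_with_cache = without_buffers_cache(mem)
print(used_by_apps, free_with_cache)  # 3185 8861
```

Most of the "used" figure here is page cache, which on a mail store is doing real work, so shrinking it would likely show up as extra disk IO.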
And that one is retiring anyway as you state below. So do you have plenty of space on your VMware SAN arrays? If not can you add disks or do you need another array chassis?
The SAN has plenty of space. Over 70TiB at this time, with another 70TiB having just arrived and waiting to be connected.
This is a Transtec Provigo 610, a 24-disk enclosure: 12 disks with 150GB (7,200 rpm) each for the main mail storage in RAID6 and another 10 disks with 150GB (5,400 rpm) for a backup LUN. I rsnapshot my /home daily onto this local backup (20 days of retention), because it is easier to restore from than firing up Bacula, which has the longer retention time of 90 days. But most users need a restore of mails from $yesterday or $the_day_before.
And your current iSCSI SAN array(s) backing the VMware farm? Total disks? Is it monolithic, or do you have multiple array chassis from one or multiple vendors?
The iSCSI storage nodes (HP P4500) use 600GB 6Gb/s SAS disks at 15k rpm, 12 disks per node, configured in 2 RAID5 sets with 6 disks each.
But this is internal to each storage node, which are kind of a blackbox and have to be treated as such.
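As a rough back-of-the-envelope figure for one such node: RAID5 gives up one disk per set to parity. The sketch below only covers that node-internal layer; it ignores formatting overhead, spares, and whatever replication the storage cluster applies between nodes, all of which would lower the usable number. The function name is made up for illustration.

```python
def raid5_usable_gb(disks_per_set: int, disk_gb: int, sets: int = 1) -> int:
    """Usable capacity of `sets` RAID5 sets: one disk per set holds parity."""
    if disks_per_set < 3:
        raise ValueError("RAID5 needs at least 3 disks per set")
    return sets * (disks_per_set - 1) * disk_gb


# 2 RAID5 sets of 6 x 600GB disks per storage node, as described above.
print(raid5_usable_gb(6, 600, sets=2))  # 6000
```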
The HP P4500 is a bit unique, since it does not consist of a head node with storage arrays connected to it, but of individual storage nodes forming a self-balancing iSCSI cluster. (The nodes consist of DL320s G2.)
So far, I had no performance or other problems with this setup and it scales quite nicely, as you <marketing> buy as you grow </marketing>.
And again, price was also a factor: deploying an FC-SAN would have cost us more than three times what the iSCSI solution did, because the latter is "just" Ethernet, while the former would have needed a lot more totally new components.
Well, it was either Parallel-SCSI or FC back then, as far as I can remember. The price difference between the U320 version and the FC one was not so big and I wanted to avoid having to route those big SCSI-U320 cables through my racks.
Can't blame you there. I take it you hadn't built the iSCSI SAN yet at that point?
No, at that time (2005/2006) nobody thought of a SAN. That is a fairly "new" idea here, first implemented for the VMware cluster in 2008.
Then we bought new hardware (the one previous to the current one), this time with more RAM, better RAID controller, smarter disk setup. We outgrew this one really fast and a disk upgrade was not possible; it lasted only 2 years.
Did you need more space or more spindles?
More space. IMAP usage became more prominent, which caused a steep rise in space needed on the mail storage server. But 74GiB SCA drives were expensive and 130GiB SCA drives were not available at that time.
And this is why I kind of hold this upgrade back until dovecot 2.1 is released, as it has some optimizations here.
Sounds like it's going to be a bit more than an 'upgrade'. ;)
Well, yes. It is more a re-implementation than an upgrade.
I have plenty of space for 2U systems and already use DL385 G7s. I am not fixed on Intel or AMD, I'll gladly use whichever is the best fit for a given job.
Just out of curiosity do you have any Power or SPARC systems, or all x86?
Central IT here uses only x86-based systems these days. There were some Sun SPARC systems, but both have been decommissioned. New SPARC hardware is just too expensive for our scale. And if you want to use virtualization, you can either use only SPARC systems and partition them or use x86-based systems. And then there is the need to virtualize Windows, so x86 is the only option.
Most bigger universities in Germany make nearly exclusive use of SPARC systems, but they have had a central IT with big iron (IBM, HP, etc.) since back in the 1960s, so naturally they continue on that path.
Grüße, Sven.
-- Sigmentation fault. Core dumped.