On 1/9/2012 7:48 AM, Sven Hartge wrote:
It seems my initial idea was not so bad after all ;)
Yeah, but you didn't know how "not so bad" it really was until you had me analyze it, flesh it out, and confirm it. ;)
Now I "just" need o built a little test setup, put some dummy users on it and see, if anything bad happens while accessing the shared folders and how the reaction of the system is, should the shared folder server be down.
It won't be down. Because instead of using NFS you're going to use GFS2 for the shared folder LUN so each user accesses the shared folders locally just as they do their mailbox. Pat yourself on the back Sven, you just eliminated a SPOF. ;)
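The rough shape of it, in case it helps (just a sketch, assuming a working cluster stack with DLM underneath; the cluster name, filesystem name, device path, mount point and journal count are placeholders for whatever fits your setup):

  # create the filesystem once, with one journal per node that will mount it
  mkfs.gfs2 -p lock_dlm -t mailcluster:sharedfolders -j 4 /dev/mapper/shared_lun

  # then every node mounts the LUN locally instead of going through NFS
  mount -t gfs2 /dev/mapper/shared_lun /srv/shared-folders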
How many total cores per VMware node (all sockets)?
8
Fairly beefy. Dual socket quad core Xeons I'd guess.
Here are the memory statistics at 14:30:
             total       used       free     shared    buffers     cached
Mem:         12046      11199        847          0         88       7926
-/+ buffers/cache:       3185       8861
Swap:         5718         10       5707
That doesn't look too bad. How many IMAP user connections at that time? Is that a high average or low for that day? The RAM numbers in isolation only paint a partial picture...
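For a quick number to pair with the free output, something like this gives a rough count of established IMAP connections at the moment you sample (a sketch; add port 993 if you also run imaps, adjust to taste):

  netstat -tn | grep ':143 ' | grep -c ESTABLISHED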
The SAN has plenty of space. Over 70TiB at this time, with another 70TiB having just arrived and waiting to be connected.
140TB of 15k storage. Wow, you're so underprivileged. ;)
The iSCSI storage nodes (HP P4500) use 600GB SAS6 at 15k rpm with 12 disks per node, configured in 2 RAID5 sets with 6 disks each.
But this is internal to each storage node; the nodes are kind of a black box and have to be treated as such.
I cringe every time I hear 'black box'...
The HP P4500 is a bit unique, since it does not consist of a head node with storage arrays connected to it, but of individual storage nodes forming a self-balancing iSCSI cluster. (The nodes consist of DL320s G2.)
The 'black box' is Lefthand Networks' SAN/iQ software stack. I wasn't that impressed with it when I read about it 8 or so years ago. IIRC, load balancing across cluster nodes is accomplished by resending host packets from a receiving node to another node after performing special sauce calculations regarding cluster load. Hence the need, apparently, for a full-power, hot-running, multi-core x86 CPU instead of an embedded low-power/low-wattage CPU such as MIPS, PPC, the i960-descended IOP3xx, or even the Atom if they must stick with x86 binaries. If this choice were merely due to economy of scale of their server boards, they could have gone with a single-socket board instead of the dual, which would have saved money. So this choice of a dual-socket Xeon board wasn't strictly based on cost or ease of manufacture.
Many/most purpose-built SAN arrays on the market don't use full-power x86 chips, but embedded RISC chips, to cut cost, power draw, and heat generation. These RISC chips are typically in-order designs, lack branch prediction and register renaming logic, and have tiny caches. This is because block moving code handles streams of data and doesn't typically branch nor have many conditionals. For streaming apps, data caches simply get in the way, although an instruction cache is beneficial. HP's choice of full-power CPUs with such features suggests branching, conditional code is used, which makes sense when running algorithms that attempt to calculate the least busy node.
Thus, this 'least busy node' calculation and packet shipping adds non-trivial latency to host SCSI IO command completion, compared to traditional FC/iSCSI SAN arrays or DAS, which has implications for high IOPS workloads, especially those making heavy use of FSYNC, such as SMTP and IMAP servers. FSYNC performance may not be an issue if the controller instantly acks FSYNC before data hits platter, but then you may run into bigger problems, as you have no guarantee the data actually hit the disk. Or, you may not run into perceptible performance issues at all given the number of P4500s you have and the proportionally light IO load of your 10K mail users. Sheer horsepower alone may prove sufficient.
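If you want a rough feel for how the array handles synchronous writes before running a full mail benchmark, a crude dd test along these lines can be telling (just a sketch; the target path is a placeholder, and oflag=dsync only approximates the fsync pattern of an SMTP/IMAP server):

  # 1000 x 4KiB synchronous writes; note the elapsed time and throughput dd reports
  dd if=/dev/zero of=/mnt/testlun/fsync-test bs=4k count=1000 oflag=dsync
  rm /mnt/testlun/fsync-test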
Just in case, it may prove beneficial to fire up ImapTest or some other synthetic mail workload generator to see if array response times are acceptable under heavy mail loads.
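Something along these lines should do it (a sketch only; the host, the testuser%d naming scheme, the source mbox file, and the client/duration numbers are placeholders to adapt to your test setup):

  # 50 concurrent clients pounding test accounts for 5 minutes,
  # using messages from the given mbox file as append fodder
  imaptest host=10.0.0.10 port=143 user=testuser%d pass=testpass \
      mbox=dovecot-crlf clients=50 secs=300

Watch the array's latency/IOPS counters while it runs, not just the imaptest summary.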
So far I have had no performance or other problems with this setup, and it scales quite nicely, as you <marketing> buy as you grow </marketing>.
I'm glad the Lefthand units are working well for you so far. Are you hitting the arrays with any high random IOPS workloads as of yet?
And again, price was also a factor: deploying an FC-SAN would have cost us more than three times what the iSCSI solution did, because the latter is "just" Ethernet, while the former would have needed a lot of totally new components.
I guess that depends on the features you need, such as PIT backups, remote replication, etc. I expanded a small FC SAN about 5 years ago for the same cost as an iSCSI array, simply due to the fact that the least expensive _quality_ unit with a good reputation happened to have both iSCSI and FC ports included. It was a 1U 8x500GB Nexsan Satablade, their smallest unit (since discontinued). Ran about $8K USD IIRC. Nexsan continues to offer excellent products.
For anyone interested in high density high performance FC+iSCSI SAN arrays at a midrange price, add Nexsan to your vendor research list: http://www.nexsan.com
No, at that time (2005/2006) nobody thought of a SAN. That is a fairly "new" idea here, first implemented for the VMware cluster in 2008.
You must have slower adoption on that side of the pond. As I just mentioned, I was expanding an already existing small FC SAN in 2006 that had been in place since 2004 IIRC. And this was at a small private 6-12 school with enrollment of about 500. iSCSI SANs took off like a rocket in the States around 06/07, in tandem with VMware ESX going viral here.
More space. The IMAP usage became more prominent, which caused a steep rise in the space needed on the mail storage server. But 74GiB SCA drives were expensive and 130GiB SCA drives were not available at that time.
With 144TB of HP Lefthand 15K SAS drives it appears you're no longer having trouble funding storage purchases. ;)
And this is why I kind of hold this upgrade back until dovecot 2.1 is released, as it has some optimizations here.
Sounds like it's going to be a bit more than an 'upgrade'. ;)
Well, yes. It is more a re-implementation than an upgrade.
It actually sounds like fun. To me anyway. ;) I love this stuff.
Central IT here these days only uses x86-based systems. There were some Sun SPARC systems, but both have been decommissioned. New SPARC hardware is just too expensive for our scale. And if you want to use virtualization, you can either use only SPARC systems and partition them or use x86-based systems. And then there is the need to virtualize Windows, so x86 is the only option.
Definitely a trend for a while now.
Most bigger universities in Germany make nearly exclusive use of SPARC systems, but they have had a central IT with big iron (IBM, HP, etc.) since back in the 1960s, so naturally they continue on that path.
Siemens/Fujitsu machines or SUN machines? I've been under the impression that Fujitsu sold more SPARC boxen in Europe, or at least Germany, than SUN did, due to the Siemens partnership. I could be wrong here.
-- Stan