Wayne Thursby put forth on 2/15/2010 11:42 PM:
Hello everyone,
Note the domain in my email addy, Wayne. ;)
I have been looking at the Dell EqualLogic stuff and it seems to provide what we need. I can get most of the information I need from the rep, but I wonder if anyone has any experience with high performance requirements on these kinds of storage.
EqualLogic has nice iSCSI SAN storage arrays with internal multiple-snapshot ability and whatnot, but IMHO they're way overpriced for what you get.
I'd like to continue running my current hardware as the primary mail server, but provide some kind of failover using the SAN. The primary usage of the SAN will be to make our 2TB document store highly available. I'm wondering what kind of options I might have in the way of piggybacking some email failover on this kind of hardware without sacrificing the performance I'm currently enjoying.
Give me the specs on your current SAN setup and I'll give you some good options.
- What/how many FC switches do you have, Brocade, Qlogic, etc?
- What make/model is your current SAN array controller(s), what disk config?
Is it possible to go with a virtual machine mounted on iSCSI acting as a backup mail server? How would I sync the two, NBD+MD? Any experience doing this with maildirs? I wonder about the performance.
This isn't the way to go about it. You already have an FC SAN and VMware ESX. ESX+SAN is _THE_ way to do HA/failover with Vmotion. I haven't used it since ESX 3, but I must say, there is no better solution available on the planet. It's nearly perfect.
Can it be as simple as attaching my MD-1000's second controller to the SAN magic box via SAS and pressing the Easy button?
No. The MD-1000 is direct attached storage, i.e. dumb storage, and you have it configured with a hardware RAID controller in a single host. You can't share it with another host. To share storage arrays between/among multiple hosts requires an intelligent controller in the array chassis doing the RAID, multiple host port connections (FC, SCSI, iSCSI), and a cluster filesystem on the hosts to coordinate shared fs access and file locking. This is exactly what ESX does with multiple ESX hosts and a SAN array.
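A quick illustration of the locking point: plain POSIX locks live in the local host's kernel, so two hosts attached to the same block device can never see each other's locks, and that coordination is exactly what a cluster filesystem adds. A trivial Python sketch, purely illustrative and nothing ESX-specific:

#!/usr/bin/env python
# Trivial illustration: flock() coordinates processes on ONE host only.
# The lock state lives in the local kernel, so a second host attached to the
# same shared block device would never see it -- hence cluster filesystems.
import fcntl, os, sys

LOCKFILE = "/tmp/shared_resource.lock"   # stand-in for shared-storage metadata

fd = os.open(LOCKFILE, os.O_CREAT | os.O_RDWR, 0o600)
try:
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print("got exclusive lock; other processes on THIS host now block or fail")
    # ... do work on the shared resource here ...
except IOError:
    print("another process on this host already holds the lock")
    sys.exit(1)
finally:
    os.close(fd)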
Is it as expensive as running my primary mailserver mounted from the SAN via Fiber Channel? Will that get me under 30ms latency?
I'm not sure what you mean by "expensive" in this context. Also, the latency will depend on the SAN storage array(s) and the FC network. From experience, it is typically extremely low, adding an extra few milliseconds to disk access time, from less than 1ms under light load to maybe 3-5ms for a loaded, good quality SAN array. This is also somewhat dependent on the number of FC switch hops between the ESX hosts and the SAN array box--the more switch hops in the chain, the greater the FC network latency. That said, the greatest latency is going to be introduced by the SAN storage controllers (those smart circuit boards inside the SAN disk boxen that perform the RAID and FC input/output functions).
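If you want hard numbers instead of my anecdotes once you have a LUN mounted, a quick-and-dirty probe like this will show what synchronous write latency looks like from inside the guest. It's only a sketch; the path is a placeholder, and the numbers include filesystem overhead, not just the SAN/FC portion:

#!/usr/bin/env python
# Rough latency probe -- a minimal sketch, not a benchmark. Point PATH at a
# scratch file on the filesystem you care about (hypothetical path below).
import os, time

PATH = "/var/vmail/latency_probe.tmp"   # placeholder test location
COUNT = 200
BLOCK = b"\0" * 4096                    # 4KB, roughly a small maildir write

fd = os.open(PATH, os.O_CREAT | os.O_WRONLY | os.O_SYNC, 0o600)
samples = []
try:
    for _ in range(COUNT):
        t0 = time.time()
        os.write(fd, BLOCK)             # O_SYNC forces the write to stable storage
        samples.append((time.time() - t0) * 1000.0)   # milliseconds
finally:
    os.close(fd)
    os.unlink(PATH)

samples.sort()
print("synchronous 4KB writes: avg %.2f ms, 95th pct %.2f ms, worst %.2f ms"
      % (sum(samples) / len(samples), samples[int(len(samples) * 0.95)], samples[-1]))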
To give you an idea of the performance you can get from an FC SAN and a couple of decent storage arrays, I architected and implemented a small FC SAN for a 500 user private school. I had 7 blade servers, 2 ESX, 4 Citrix, and one Exchange server, none with local disk. Everything booted and ran from SAN storage, the VMs and all their data, the Exchange server and its store, the Citrix blades, everything.
We had about 20 VMs running across the two ESX blades and I could vmotion just about any VM guest server, IN REAL TIME, from one ESX blade server to the other, in less than 5 seconds. Client network requests were never interrupted. Vmotion is freak'n amazing technology. Anyway, our total CPU and SAN load over the entire infrastructure averaged about 20% utilization. The VMs included two AD DCs, an MS SQL server, Windows file/print servers, myriad SuSE VMs, one running a 400GB iFolder datastore (think network file shares on steroids, fully synchronized roaming laptop filesystems sync'd in real time over the network or internet to the iFolder data store), a Novell ZEN server for SuSE Linux workstation push/pull updates and laptop imaging, a Moodle PHP/MySQL based course management system with a 50GB db, a Debian syslog collector, etc, etc.

Most of the VMs' boot disk images resided on an IBM FAStT600 array with 14 x 73GB 15Krpm disks in two RAID5 arrays, one with 6 disks, one with 7, and one hot spare, only 128MB write cache. The 4 Citrix blades used FAStT LUNs for their local disks, and the Exchange server booted from a FAStT LUN and had its DB stored on a FAStT LUN. All other data storage, including that of all the VMs, resided on a Nexsan Satablade SAN storage array consisting of 8 x 500GB 7.2Krpm disks configured in a RAID5 set, no spares, 512MB write cache.

The Bladecenter had an inbuilt 2 port FC switch. I uplinked these two ports via ISL to an 8 port Qlogic 2Gb FC switch. I had one 2Gb FC link from the FAStT into the Qlogic switch and two 2Gb links from the Satablade into the switch. For 500 users and every disk access going to these two SAN arrays, the hardware was actually overkill for current needs. But it had plenty of headroom for spikes and future growth, in terms of throughput, latency, and storage capacity.
I ran an entire 500 user environment, all systems, all applications, on two relatively low end FC SAN boxen, and you're concerned about the performance of a single SMTP/IMAP mail server over a SAN? I don't think you need to worry about performance, as long as everything is set up correctly. ;)
To do this properly, you'll need a second Dell server with an FC HBA, an FC HBA for the existing server, and the ESX vmotion and HA options, which I'm not sure are available for ESXi. You may have to upgrade to ESX, which as you know has some pricey licensing. But it's worth the cost just for vmotion/HA.
You'll export a SAN LUN of sufficient size (500GB-1TB) to cover the IMAP store needs from one of the SAN storage arrays to the WWNs of the HBAs in both of your two ESX hosts, and you'll add that LUN to the ESX storage pool as a raw LUN. Do NOT make it a VMFS volume. It's going to be huge, and you're only storing data on it, not virtual machines. VMFS volumes are for virtual machine storage, not data storage. Performance will suffer if you put large data in VMFS volumes. I cannot stress this enough.

For HA and vmotion to work, you'll also need to export a small SAN LUN (20GB) to both ESX hosts' FC WWNs, format it as a VMFS, and move the Postfix/Dovecot virtual machine to that ESX VMFS volume. I'm assuming you have the Postfix spool in the same VMFS volume as the boot and root filesystems. This will allow both ESX hosts to boot and run the VM, which enables vmotion and HA. (I sincerely hope you don't currently have the VM files and the data store for your current IMAP store all in a single VMFS volume. That's a horrible ESX implementation and will make this migration a bear due to all the data shuffling you'll have to do between partitions/filesystems, and the fact that you'll probably have to shut down the server during the file moves.)
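As a quick sanity check from inside the guest, something like this will tell you whether the mail store currently shares a block device with the root filesystem, i.e. how much data shuffling you're in for. The paths are my guesses; adjust them to your layout:

#!/usr/bin/env python
# Minimal sketch: check whether the mail store lives on the same block device
# as the VM's root filesystem. Paths are assumptions, not your actual layout.
import os

ROOT = "/"
MAIL_STORE = "/var/vmail"   # hypothetical maildir location

root_dev = os.stat(ROOT).st_dev
mail_dev = os.stat(MAIL_STORE).st_dev

if root_dev == mail_dev:
    print("mail store shares the root filesystem/device -- expect data shuffling")
else:
    print("mail store is on its own device (%s vs %s) -- migration is simpler"
          % (hex(mail_dev), hex(root_dev)))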
You may need to add a soft zone to your FC switches containing the WWNs of the ESX host HBAs and the WWN(s) of the SAN storage array ports you're exporting the LUNs through before you'll see the exposed LUNs on the arrays. Once you have ESX, vmotion, and HA running on both ESX machines, all you have to do is enable HA failover for the Postfix/Dovecot VM. If the ESX host on which it's running, or the VM guest itself, dies for any reason, the guest will be auto-restarted within seconds on the other ESX host.

This happens near instantaneously, and transparently, because both hosts have "local disk" access to the same .vmdk files and raw data LUN on the SAN arrays. Clients probably won't even see an error during the failover, as IMAP clients reconnect and log in automatically. The name and IP address of the server stay the same; the underlying server itself, all its config and spool files, metadata files, everything is identical to before the crash. It's just running on a different physical ESX machine. This capability, more than anything else, is what makes VMware ESX worth the licensing costs. Absolutely seamless fault recovery. If an organization can afford it (can any not?) it's the only way to go for x86 based systems.
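If you want to see exactly what clients experience when you test a failover, a throwaway probe along these lines will log the outage window. Host and test account are placeholders, obviously; this just hammers the IMAP login path once a second:

#!/usr/bin/env python
# Failover probe sketch: log in to the IMAP server repeatedly and record any
# window where connections fail, to see what real clients would experience.
import imaplib, socket, time

HOST = "imap.example.com"            # placeholder
USER, PASSWORD = "probe", "secret"   # placeholder test account

socket.setdefaulttimeout(5)          # don't hang forever on a dead host
outage_start = None

while True:
    try:
        conn = imaplib.IMAP4_SSL(HOST)
        conn.login(USER, PASSWORD)
        conn.logout()
        if outage_start is not None:
            print("service back after %.1f seconds" % (time.time() - outage_start))
            outage_start = None
    except Exception as err:
        if outage_start is None:
            outage_start = time.time()
            print("connection failed: %s" % err)
    time.sleep(1)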
I welcome any suggestions the group may have.
Unfortunately for this ESX HA architecture, your current MD-1000 isn't reusable. Direct attached storage will never work for any workable/functional HA setup. If I were you, after you migrate your VMs to the SAN as I describe above, and obviously after you're comfortable that all went as planned, I'd direct attach the MD-1000 to another server and use it as a near-line network backup server or for some other meaningful purpose.
If your current FC SAN storage array doesn't have enough spare capacity (performance/space), you can get a suitable unit from Nexsan and other manufacturers for $10-15K in a single controller version. I personally recommend this:
http://www.nexsan.com/sataboy.php
http://www.sandirect.com/product_info.php?cPath=171_208_363&products_id=1434
Get a unit with the FC+iSCSI controller, 14 x 500GB drives, and 1GB cache. This is a standard product configuration at Sandirect. Configure a 13 drive RAID5 array with one spare.

You may be balking at 7.2Krpm SATA drives and a RAID5 setup. I can tell you from experience with Nexsan's 8 drive Satablade unit (now discontinued), with "only" 512MB cache and an 8 drive RAID5, that you won't come close to hitting any performance limits with your current load. This setup would likely carry 4-6 times your current load before introducing latency. Nexsan uses a PowerPC 64 chip on this controller and a very efficient RAID parity algorithm. The performance hit due to parity calculations going from RAID10 to RAID5 is about 10%, but you gain that back because your stripe width is 13 instead of 7, assuming you use all 14 drives for the RAID10 with no spares--if you use spares you must have two, since RAID10 requires an even number of disks, making your stripe width 6. Thus, with spares for each, the RAID5 stripe width is more than _double_ the RAID10 width.
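If you want the arithmetic spelled out, here's a trivial back-of-the-envelope sketch of the two layouts I just described. My assumptions, not vendor math:

#!/usr/bin/env python
# Stripe width / usable capacity comparison for a 14-drive shelf, following
# the RAID5-with-one-spare vs RAID10-with-spares layouts discussed above.
DRIVES, SIZE_GB, SPARES = 14, 500, 1

# RAID5: one spare, parity rotated across the remaining members
raid5_members = DRIVES - SPARES                 # 13 drives in the array
raid5_stripe = raid5_members                    # stripe spans all 13 members
raid5_usable = (raid5_members - 1) * SIZE_GB    # parity costs one drive's worth

# RAID10: needs an even member count, so spares come in pairs
raid10_members = DRIVES - 2                     # 12 drives if you keep spares
raid10_stripe = raid10_members // 2             # mirroring halves the stripe width
raid10_usable = raid10_stripe * SIZE_GB

print("RAID5 : stripe width %d, usable %d GB" % (raid5_stripe, raid5_usable))
print("RAID10: stripe width %d, usable %d GB" % (raid10_stripe, raid10_usable))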
The web GUI admin interface is fantastic and simple. Configure the RAID5 array, then create the initial volumes you'll export as LUNs to your two ESX hosts' WWNs. Only connect one FC port to the FC switch, and expose volumes to both ESX hosts out the same port with the same LUN numbers. This is critical: both ESX hosts need to see the same info or you will break things. If you balk again, thinking a single 2Gb/4Gb FC link won't be fast enough, you'd be wrong.

In addition, if you want to use both ports, you must have dual FC adapters in each ESX host, and you must expose all LUNs out BOTH Sataboy ports to both FC WWNs on each ESX host. You then have to set up ESX FC multipathing, which IIRC carries yet another licensing fee, although I'm not positive on that. To add insult to injury, AFAIK Nexsan isn't an ESX certified SAN vendor, so if you run into problems getting the multipathing to work, VMware techs probably won't help you. As of 2006 they weren't certified; they might be today, not sure. All their gear is fully FC compliant, but I guess they never felt like paying the "VMware tax".
This is probably way more information than the list needs, less than optimally written/organized, and probably a shade OT. I'd be more than glad to continue this off list with anyone interested in FC SAN stuff. I've got some overly aggressive spam filters, so if I block a direct email, hit postmaster@ my domain and I'll see it.
-- Stan