Wayne Thursby put forth on 2/16/2010 9:42 AM:
I was planning on using EqualLogic because the devices seem competent, and we already have an account with Dell. Also, being on VMware's HCL is important as we have a support contract with them.
Using the standard GbE ports on the servers won't work, for two basic reasons:
This would require using the ESX software iSCSI initiator, which isn't up to the task: it sucks up too many CPU cycles under intense disk workloads, stealing those cycles from the guests and their applications, which, coincidentally, are the very things causing the big disk I/O workload in the first place.
GbE iSCSI has a maximum raw data rate of 125MB/s, roughly 100MB/s after TCP overhead. That's less than the streaming throughput of a single 15K rpm SAS disk, and you'll have 14 of those in the array. That spells Bottleneck with a capital B, roughly a 14:1 bottleneck. It's just not suitable for anything but low-demand file transfers or very low-transaction databases.
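If you want to sanity-check that ratio yourself, here's a quick back-of-envelope sketch. The ~100MB/s sustained streaming per drive is just an assumption I'm plugging in, not a measured number; real drives and workloads vary:

# Rough link-vs-array bandwidth comparison, all figures approximate.
DRIVES = 14
MB_PER_DRIVE = 100       # assumed sustained streaming MB/s per 15K SAS drive
GBE_USABLE_MB = 100      # ~1Gb/s minus TCP/iSCSI overhead
FC4_USABLE_MB = 400      # 4Gb FC payload rate, per direction

array_streaming = DRIVES * MB_PER_DRIVE
print(f"Aggregate array streaming: ~{array_streaming} MB/s")
print(f"1GbE iSCSI bottleneck    : ~{array_streaming / GBE_USABLE_MB:.1f}:1")
print(f"4Gb FC bottleneck        : ~{array_streaming / FC4_USABLE_MB:.1f}:1")

Run that and you get ~1400MB/s of spindles behind a 100MB/s pipe, the 14:1 I mentioned; put the same spindles behind a 400MB/s FC link and it drops to about 3.5:1.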
Here's good news. I just looked, and most of Nexsan's SAN arrays are now VMware Ready certified, including all the ones I talk about here:
http://alliances.vmware.com/public_html/catalog/PublicCatalog.php
Here's where I think you misunderstood me. I have no SAN at the moment. I'm running a monolithic Postfix/Dovecot virtual machine on an ESXi host, a Dell 2950 directly attached via SAS to a Dell MD-1000 disk array. We have no Fiber Channel anything, so going that route would require purchasing a full complement of cards and switches.
Yes, I did misunderstand. My apologies. The way you worded your previous post led me to believe your organization had a small SAN being used for other things, and that you were consolidating some other applications onto that SAN storage and were thinking of moving some of this VMware stuff onto it. I'm clear now that this isn't the case.
I did, however, fully understand what your current ESXi SMTP/IMAP server platform is and what you want to achieve moving forward.
Is it as expensive as running my primary mailserver mounted from the SAN via Fiber Channel? Will that get me under 30ms latency?
Without actually testing the iSCSI solution I can't state the latency. But there is no doubt latency is going to be an order of magnitude higher with GbE iSCSI than with 4Gb FC, especially under high load. Make that 2-3 orders of magnitude higher if using software initiators. I can tell you that the round-trip latency of an FC block request from HBA through a Qlogic switch to a Nexsan array and back will be less than 10ms, and over 90% of that latency is the disk head seeks and reads, which you'll obviously have with any SAN. The magic is the low protocol overhead of FC. With 1GbE iSCSI, half or more of the total latency will be in the Ethernet network and TCP processing.
I'm not sure what you mean by "expensive" in this context.
Simply that purchasing FC cards and switches adds to the cost, whereas we already have GbE for iSCSI.
As I stated above, 1GbE with a software initiator is woefully inadequate for your needs. Using 1GbE iSCSI HBAs would help slightly, 10-20% maybe, but you're still shackled to a maximum 100MB/s data rate. Again, that's slower than a single 15K SAS drive. That's not enough bandwidth for your workload, if I understand it correctly.
I ran an entire 500 user environment, all systems, all applications, on two relatively low end FC SAN boxen, and you're concerned about the performance of a single SMTP/IMAP mail server over a SAN? I don't think you need to worry about performance, as long as everything is set up correctly. ;)
I hope that is correct, thank you for sharing your experiences. I inherited a mail system that had capable hardware but was crippled by bad sysadmin-ing, so I'm trying to make sure I'm going down the right path here.
You're welcome. There is no "hope" involved. It's just fact. These Nexsan controllers with big caches and fast disks can easily pump 50K random IOPS to cache and 2,500+ through to disk. They really are beasts. You would have to put 5-10X your current workload, including full body searches, through one of these Nexsan units before you'd come close to seeing any lag due to controller or disk bottlenecking.
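For a rough sanity check on that 2,500+ figure, here's the back-of-envelope version. The ~180 random IOPS per 15K drive is an assumed typical spec-sheet number, not anything from Nexsan's docs:

# Back-of-envelope spindle IOPS for a 14-drive 15K SAS array.
DRIVES = 14
IOPS_PER_15K_DRIVE = 180   # assumed typical random IOPS for one 15K SAS drive

print(f"~{DRIVES * IOPS_PER_15K_DRIVE} random IOPS across {DRIVES} spindles")
# 14 * 180 = 2,520, right in line with the 2,500+ through-to-disk figure.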
My main concern is that when Dovecot runs a body search on an inbox with 14,000 emails in it, the rest of the users don't experience any performance degradation. This works beautifully in my current setup, however the MD-1000 is not supported by VMware, doesn't do vMotion, etc, etc. It sounds like I have nothing to worry about if I go with Fiber Channel; any idea about iSCSI?
Like I said, you'd have to go with 10GbE iSCSI with HBAs and a 10GbE switch to meet your needs. 1GbE software-initiator iSCSI will probably fall over with your workload, and your users will very likely see the latency effects. And, as I said, because of that your costs will be far greater than with the FC solution I've outlined.
My current disk layout is as follows:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.5G  4.2G  4.8G  47% /
/dev/sdb1             199G  134G   55G  71% /var/vmail
/dev/sdc1              20G   13G  6.8G  65% /var/sdc1
/dev/sdd1            1012M   20M  941M   3% /var/spool/postfix
/dev/sda1 is a regular VMware disk. The other three are independent persistent disks so that I can snapshot/restore the VM without destroying the queue or stored email.
It's been a while since I worked with the VMware ESX GUI. Suffice it to say that each LUN you expose on the Nexsan will appear to ESX as a big SCSI disk, which you can format as VMFS to store guests, or assign as a raw LUN ("raw device mapping" I think is the official VMware jargon) to a particular guest. You've probably got more ESX experience at this point than I do; at the very least your experience is fresh, and mine is stale, back from the 3.0 days. I recall there were a couple of "gotchas", where if you chose one type of configuration for a LUN (disk) you couldn't use some of the advanced backup/snapshot features, so there were trade-offs to make. Man, it's been so long, lol. Read the best practices and all the VMware info you can find on using Fiber Channel SANs with ESX, and avoid any gotchas WRT HA and snapshots.
You certainly clarified a number of things for me by detailing your past setup. I suppose I should clarify exactly what the current plan is.
We are migrating a number of other services to some kind of an HA setup using VMware and vMotion; that much has been decided. My primary decision centers around choosing either iSCSI or Fiber Channel. We have *no* Fiber Channel infrastructure at the moment, so this would add significantly to the price of our setup (at least 2 cards + a switch).
Nah, they're cheap, I'd say maybe $4K total. Let's see...
http://www.cdw.com/shop/products/default.aspx?EDC=1836712 http://www.qlogic.com/SiteCollectionDocuments/Education_and_Resource/Datashe...
http://www.cdw.com/shop/products/default.aspx?EDC=926795 http://download.qlogic.com/datasheet/42737/Datasheet%20-%20QLE2440%20%5BD%5D... http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/SearchByProduct.aspx?ProductCategory=39&Product=937&Os=167
http://www.cdw.com/shop/products/default.aspx?EDC=1021715
Get 4 SFP LC transceivers (always have a spare). You'll populate 3 switch ports with these, plugging the ESX servers into two of them and FC port 0 on the Nexsan into the other. With these products you'll have end-to-end 4 Gb/s links: 400MB/s in each direction, 800MB/s aggregate full duplex, per switch link.
So, let's see how close my guesstimate was:
1 x QLogic SANbox 3810, 8 x 8/4/2 Gb/s FC switch    $1,880
2 x QLogic SANblade QLE2440 host bus adapter         $  790 ea.
4 x IBM 4Gbps SW SFP Transceiver                     $  140 ea.
                                             Total:  $4,020
Yep, about $4K. I underestimated by $20, but then again, CDW is far from the cheapest vendor; I used them as an example because I knew they carried all this stuff. They carry all the Nexsan arrays as well, but unfortunately, just like everyone else, for SAN products in this price range you have to call to get a quote. Get yourself quotes from CDW and SANDirect.com on these standard factory configurations:
Nexsan SASBoy:  2 FC, 2 iSCSI, 2GB cache, 14 x 300GB 15K SAS drives
Nexsan SATABoy: 2 FC, 2 iSCSI, 1GB cache, 14 x 500GB 7.2K SATA drives
The first will give you more performance than you can imagine, and will allow for 10 years of performance growth, though at 4.2TB raw you may run out of space before 10 years. That depends on whether you store digital X-rays, etc. on it; these arrays would really shine in that application, BTW. Nexsan has won multiple performance awards for streaming, and their random I/O is fantastic as well.
The other applications we are virtualizing are nowhere near as disk I/O intensive as our email server, so I feel confident that an iSCSI SAN would meet all performance requirements for everything *except* the email server.
One key point that you are failing to realize is that the advanced storage and backup features of ESX itself demand high-bandwidth, low-latency access to the SAN storage arrays: snapshots, backup, etc. VMware snapshots will fill FC links to capacity until completed, unless you lower their priority (not sure if that's possible). Anyway, if you want or need to use any of ESX's advanced capabilities, 1GbE iSCSI isn't going to cut it. We had 2Gb FC, and some operations I performed had to be done at night or on weekends because they filled the SAN pipes. You may run into that even with 4Gb FC. And if you do, you can pat yourself on the back for going FC, as 1GbE iSCSI would take over 4 times as long to complete the same storage operation. :)
I'm really looking for a way to get some kind of redundancy/failover for Postfix/Dovecot using just iSCSI, without killing the performance I'm getting from direct-attached storage, but it sounds like you're saying I need FC.
To maintain the level of I/O performance you currently have, but in a SAN environment that allows the VMware magic, you will require either an FC SAN or a 10GbE iSCSI SAN. The 10GbE iSCSI solution will probably cost almost twice as much in total, will be more difficult to set up and troubleshoot, and will deliver no more, and likely less, total performance than the 4Gb FC solution.
Well, I've got the rest of my virtual infrastructure/SAN already figured out, so my questions are centering around providing redundancy for Dovecot/maildirs. I think you've answered all of my hardware questions (ya' freak). It really seems like Fiber Channel is the way to go if I want to have HA maildirs.
It's not just the maildirs you're making HA but the entire Linux guest server, or all your VM guests if you want. All ESX servers connected to shared SAN storage can start and run any VM guest in the environment that resides on those SAN LUNs, and can access any raw device mappings (raw LUNs) associated with a VM. This is also what makes vMotion possible. It's incredible technology, really. Once you start getting a good grasp of what VMware ESX, vMotion, HA, snapshots, etc. can really do for you, you'll start buying additional machines and ESX licenses, and you'll end up consolidating every possible standalone server you have onto VMware. The single largest overriding reason for this is single point backup and disaster recovery.
With consolidated backup, and a large enough tape library system, it's possible to do a complete nightly backup of your entire VMware environment including all data on the SAN array(s), and rotate the entire set of tapes off site for catastrophic event recovery, for things such as fire, earthquake, flood, etc. In the immediate aftermath, you can acquire one big machine with appropriate HBAs, an identical SAN array, switch, tape library, etc., and restore the entire system in less than 24 hours, bringing up only critical VMs until you're able to get more new machines in and set up. The beauty of ESX is that there is nothing to restore onto all the ESX hosts. All you do is a fresh install of ESX and configure it to see the SAN LUNs. You can have a copy of the ESX host configuration files sitting on the SAN, and thus in the DR backup.
Normally this is done in a temporary data center colocation facility with internet access so at a minimum the principals within the organization (CEO, CFO, VPs, etc.) can get access to critical information to start rebuilding the organization. This is all basic business continuity 101 stuff, so I won't go into more detail. The key point is that with ESX, an FC SAN, a tape library and consolidated backup, the time to get an organization back up and running after a catastrophe is cut from possibly weeks to a couple of days, most of that time being spent working with insurance folk and waiting on the emergency replacement hardware to arrive.
There is no replacement for off-site tape except a hot-standby remote datacenter, and most can't afford that. Thus, one needs a high-performance, high-capacity tape library/silo. Doing consolidated backup of one's VMware environment requires fast access to the storage, and 1GbE iSCSI is not even close to adequate for this purpose. Case in point: say you have 4TB of VMs and data LUNs on your array. Even if you could get 100% of the GbE iSCSI bandwidth for consolidated backup--which you can't, because the VMs are going to be live at the time, and you can't get 100% out of GbE anyway due to TCP overhead--it would take about 11 hours to back up that 4TB, as it all has to come off the array via 1GbE iSCSI. With 4Gb FC you cut that time by a factor of 4, that 11 hours becoming a little under 3 hours. An 11-hour backup window is business-disruptive, and makes it difficult to properly manage an off-site backup procedure (which everyone should have).
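If you want to check that arithmetic yourself, here's the back-of-envelope version. The ~100MB/s and ~400MB/s usable rates are the same assumptions as earlier, not measured numbers:

# Backup window estimate for a full dump of the SAN, figures approximate.
DATA_TB = 4
DATA_MB = DATA_TB * 1_000_000          # 4TB expressed in MB (decimal units)

links = {
    "1GbE iSCSI (~100 MB/s usable)": 100,
    "4Gb FC (~400 MB/s usable)": 400,
}

for name, mb_per_sec in links.items():
    hours = DATA_MB / mb_per_sec / 3600
    print(f"{name}: ~{hours:.1f} hours to move {DATA_TB}TB")

That prints roughly 11.1 hours for 1GbE and 2.8 hours for 4Gb FC, which is where the "11 hours" and "a little under 3 hours" figures come from.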
I just don't know if I can justify the extra cost of an FC infrastructure just because a single service would benefit, especially if there's a hybrid solution possible, or if iSCSI would be sufficient; thus my questions for the list.
I covered the costs above, and again, FC beats iSCSI all around the block and on Sunday, unless EqualLogic has dropped their prices considerably since 2006. If by hybrid you mean a SAN array that supports both 4Gb FC and 1GbE iSCSI, all of the Nexsan units fit that bill, with two 4Gb FC and two 1GbE iSCSI ports per controller.
Sorry this is so freak'n long. Hardware is my passion, and I guess verbosity is my disease. Hope the info helps in one way or another. If nothing else it may put you to sleep faster than a boring book. ;)
-- Stan