[]'s
f.rique
On Sun, Jan 23, 2011 at 1:20 AM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
Henrique Fernandes put forth on 1/22/2011 2:59 PM:
About changing the EMC to RAID 10: we cannot do it because other people are using it, so we cannot change anything on the storage. The other LUNs I talked about are meant for web, but since we are testing we are allowed to use them.
You need to look at this: http://www.hardwarefreak.com/thin-provisioning.jpg
The exported LUNS need to be configured as independent of the underlying physical disk RAID level. You can reconfigure up to 16 disks of the CX4 as RAID 10, and export as many LUNS of any size as you like. I.e. you can keep the same exported LUNS you have now, although you may have to adjust their sizes a bit. What you gain by doing this is a doubling of random IOPS.
It will not be that easy, because the people who actually bought the storage need space, not performance. We are just using the storage because we needed it; it wasn't bought for us! (But this is not your problem.)
You will need to backup any real/live data on the current LUNS. You only have 16 disks in the CX4 correct? So reconfigure with 2 spares and 14 disks in a RAID 10. Then create the same LUNS you had before. I don't use EMC products so I don't know their terminology. But, this is usually called something like "virtual disks" or "virtual LUNS". The industry calls this "thin provisioning". You *need* to do this if you're going to support hundreds or more concurrent users. An effective stripe width of only 4 spindles, which is what you currently have with an 8 disk RAID10, isn't enough. That gives you only 600-1200 IOPS depending on the RPM of the disks in the array: 600 for 7.2k disks, and 1200 for 15k disks. With a 14 disk RAID10 you'll have IOPS of 1050 to 2100.
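The IOPS figures above can be reproduced with a rough sketch like the one below. It assumes what the numbers imply: that a RAID 10 delivers random IOPS in proportion to its effective spindles (half the disks), at roughly 150 IOPS per 7.2k disk and 300 per 15k disk (600 / 4 and 1200 / 4). The function and constant names are made up for illustration, and real arrays will differ with cache, workload mix and controller behaviour.

```python
# Per-disk random IOPS implied by the figures above (600 / 4 and 1200 / 4).
# These are rule-of-thumb values, not measurements of any particular drive.
IOPS_PER_DISK = {"7.2k": 150, "15k": 300}


def raid10_random_iops(total_disks, iops_per_disk):
    """Estimate random IOPS as (mirrored pairs) x (IOPS per disk)."""
    if total_disks < 2 or total_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks (at least 2)")
    effective_spindles = total_disks // 2  # each mirrored pair counts once
    return effective_spindles * iops_per_disk


for disks in (8, 14):
    low = raid10_random_iops(disks, IOPS_PER_DISK["7.2k"])
    high = raid10_random_iops(disks, IOPS_PER_DISK["15k"])
    print(f"{disks}-disk RAID 10: {low}-{high} IOPS")
```

With 16 disks and 2 kept as spares, 14 disks leave 7 effective spindles, which is where the 1050 to 2100 range comes from.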
We have more, I guess; I need to talk with the person who knows about the storage. I will look into thin provisioning.
Much of what you said we are trying to do, but we don't have the hardware.
Our Xen hosts have just two 1 GbE interfaces; we use one for external traffic and the other for the storage (our VMs are also on OCFS2 so that they can be migrated to any host).
In that case you _really really_ need more IOPS and throughput from the CX4.
The VM disks are on another LUN, but they don't suffer from performance issues. This is one of the reasons we think it is not a storage problem but a filesystem one.
You didn't answer my previous question regarding the physical connections from the CX4 to your ethernet switch. Do you have both ports connected, and do you have them link aggregated? If not, this is probably the single most important change you could make at this point. I seriously doubt that 100 MB/s is sufficient for the load you're putting on the CX4.
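For a sense of scale, here is a back-of-the-envelope sketch of the bandwidth in question. A 1 Gb/s link carries at most 125 MB/s of raw bits; the 80% usable fraction below is purely an assumption chosen to land on the ~100 MB/s figure, and the function name is hypothetical. Aggregated links are treated as a simple upper bound, since a single flow may not spread across both ports.

```python
def usable_mb_per_s(links=1, gbit_per_link=1.0, usable_fraction=0.8):
    """Rough upper bound on payload throughput in MB/s.

    usable_fraction is an assumed allowance for protocol overhead,
    not a measured value.
    """
    raw_mb_per_s = gbit_per_link * 1000 / 8  # 1 Gb/s = 125 MB/s raw
    return links * raw_mb_per_s * usable_fraction


print(f"1 port : ~{usable_mb_per_s(links=1):.0f} MB/s")
print(f"2 ports: ~{usable_mb_per_s(links=2):.0f} MB/s (aggregate, best case)")
```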
Only one cable! Nothing aggregated. There are 2 connections to the CX4, I guess, one for each SPARE; it is something about how the storage separates the LUNs, I am not sure why.
We have been analyzing the throughput and it is not that high, as I said. We are now having some problems with monitoring (it is handled by another sector where I work; none of that configuration belongs to me). But we did not see any port on the switch at more than half utilization!
Anyway, we are seriously considering putting the DLM on a dedicated network. We are going to study some way to do it with our hardware.
As I said a single decent quality 24 port GbE managed switch will carry the DLM and iSCSI traffic just fine, especially if all the devices can support jumbo frames. Cheap desktop/soho switches such as DLink are going to cripple your operation.
About jumbo frames: the person had some problems configuring them, so now I am not sure whether we are using jumbo frames or not. Would it be much better if we were using them?
I might look into the switch, but it is not that bad, I guess.
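One way to find out whether jumbo frames are actually in effect is sketched below, assuming Linux hosts. It reads the interface MTU from sysfs and builds an iputils `ping` command with fragmentation prohibited (`-M do`), so the ping only succeeds if every hop passes the full-size frame. The 9000-byte MTU and the helper names are assumptions for illustration; the interface name and target address have to be filled in for the real setup.

```python
from pathlib import Path

JUMBO_MTU = 9000      # common jumbo frame MTU; an assumption, check your switch
IPV4_HEADER = 20      # bytes, without options
ICMP_HEADER = 8       # bytes


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Return the MTU the kernel reports for a Linux network interface."""
    return int(Path(sysfs_root, interface, "mtu").read_text().strip())


def ping_payload_size(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def jumbo_ping_command(target, mtu=JUMBO_MTU):
    """Build a ping command that fails if any hop cannot pass `mtu` bytes."""
    size = str(ping_payload_size(mtu))
    return ["ping", "-M", "do", "-c", "3", "-s", size, target]
```

For example, `read_mtu("eth0")` returning 1500 would mean jumbo frames are not enabled on that host, and the command from `jumbo_ping_command(...)` can be run by hand against the storage address to test the whole path through the switch.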
I appreciate your help; I guess I learned a lot. (I did forward these emails to some of my bosses; I hope it will change something about the hardware, but who knows.)
Glad to be of assistance. Hope you get all the problems worked out. The only money you really need to spend is on a decent GbE managed switch with 16-24 ports.
Another thing to say: the email service is not very well tuned yet, but as we keep improving it we start to get more money to buy more stuff for the service. That is why we try to make it better with poor hardware. As I said, before, everything was on a physical desktop with 500 GB of disk. So we have already made a really big improvement.
It's still better than before, even with all the current performance problems? Well at least you're making some progress. :)
It is better, because now we have a decent webmail (Horde with DIMP enabled; before it was just IMP). Most people used to have POP configured because of the 200 MB quota, and few users used webmail. Now many more people use the webmail and IMAP because the quota is 1 GB. Is there any better free webmail you could point us to for testing?
Thanks a lot to all; I still appreciate any help. I am reading it all and trying to take the best of it!
Are you running OCFS in both the Xen guests and the Xen hosts? If so that may also be part of the performance problem. You need to look at ways to optimize (i.e. decrease) the OCFS metadata load. Can Xen export a filesystem up to the guest via some virtual mechanism, such as ESX presents virtual disks to a guest? If so you should do that.
Let me try to explain how it is configured.
Our 4 Xen hosts mount one disk exported over iSCSI from the CX4. Each virtual machine has a disk.iso (created with dd from /dev/zero) placed on the mounted CX4 volume, and this file is exported to Xen as the virtual machine's disk. The storage traffic has one Xen interface (eth0) to itself; the other interface, eth1, carries several VLANs. This same interface eth0 is exported to a virtual machine that mounts the CX4.
We are thinking of some way to improve performance by doing something with the Dovecot index files: a different LUN, or local storage on each host, or something like that. We don't know which one is best.
Right now we don't know what is causing the performance problems.
Oh, I forgot to tell you before: we have 3 nodes in OCFS2; 2 function as IMAP/POP/LDA, and the other one is just for Mailman. Before, when an email was sent to a list containing all addresses, the other 2 servers just stopped working and sat at about 90% I/O wait. The same problem happens if I go to any host and simply do a big rm -rf on a big email account. That is another thing that makes us think it is an OCFS2 problem.
Glad for your help!
-- Stan