So I read all the emails and everything.
But I am sorry to say that we are not able to do much of what you suggested.
About changing the EMC to RAID 10: we cannot do it because other people are using it, so we cannot change anything on the storage. Those other LUNs I talked about are meant for the web servers, but since we are testing we are allowed to use them.
Much of what you suggested we are trying to do, but we don't have the hardware.
Our Xen hosts have just two 1GbE interfaces; we use one for external traffic and the other for the storage (our VMs are also on OCFS2, so that they can be migrated to any host).
Anyway, we are seriously considering the idea of putting the DLM on a dedicated network. We are going to study some way to do it with our hardware.
I appreciate your help; I guess I learned a lot. (I forwarded these emails to some of my bosses. I hope it will change something about the hardware, but who knows.)
Another thing to say: the email system is not very well tuned yet, but as we keep improving it we start to get more money to buy more equipment for the service. That is why we try to make it better with poor hardware. As I said, before this everything was on a physical desktop with a 500GB disk, so we have already made a really big improvement.
Thanks a lot to all. I still appreciate any help. I am reading it all and trying to take the best of it!
[]'s
f.rique
On Fri, Jan 21, 2011 at 8:06 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
Henrique Fernandes put forth on 1/21/2011 12:53 PM:
We think it is OCFS2 and the size of the partition, because we can write a big file at an acceptable speed. But if we try to delete, create, or read lots of small files, the speed is horrible. We think it is a DLM problem in propagating the locks, etc.
It's not the size of the filesystem that's the problem. But it is an issue with the DLM, and with the small RAID 10 set. This is why I recommended putting DLM on its own dedicated network segment, same with the iSCSI traffic, and making sure you're running full duplex GbE all round. DLM doesn't require GbE bandwidth, but the latency of GbE is less than fast ethernet. I'm also assuming, since you didn't say, that you were running all your ethernet traffic over a single GbE port on each Xen host. That just doesn't scale when doing filesystem clustering. The traffic load is too great, unless you're idling all the time, in which case, why did you go OCFS? :)
Do you have any idea how to test the storage for maildir usage? We made a bash script that writes some directories and lots of files, and afterwards removes them, etc.
This only does you any good if you have instrumentation set up to capture metrics while you run your test. You'll need to run iostat on the host running the script tests, along with iftop, and any OCFS monitoring tools. You'll need to use the EMC software to gather IOPS and bandwidth metrics from the CX4 during the test. You'll also need to make sure your aggregate test data size is greater than 6GB, which is 2x the size of the cache in the CX4. You need to hit the disks, hard, not the cache.
The best "test" is to simply instrument your normal user load and collect the performance data I mentioned.
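For reference, a small-file test of the kind discussed above could look roughly like the Python sketch below. It is only an illustration, not the script Henrique actually used: the function names (small_file_test, files_needed), the directory layout, and the 8 KiB message size in the usage note are all assumptions. It times the create, read, and delete phases but collects no storage metrics itself, so iostat, iftop, and the EMC tools would still have to run alongside it. The read phase may also be served from the host page cache rather than from the array.

```python
import os
import shutil
import tempfile
import time


def files_needed(cache_bytes, file_size, factor=2):
    """How many files of file_size bytes are needed to exceed factor * cache_bytes."""
    return (factor * cache_bytes) // file_size + 1


def small_file_test(root, n_dirs, files_per_dir, file_size):
    """Create, read back, and delete many small files under root.

    All work happens in a fresh scratch directory inside root, so existing
    data in root is never touched. Returns per-phase timings in seconds and
    the number of bytes read back.
    """
    work = tempfile.mkdtemp(prefix="smallfile-test-", dir=root)
    payload = b"x" * file_size
    timings = {}

    # Phase 1: create many small files, forcing each one out to storage.
    start = time.perf_counter()
    for d in range(n_dirs):
        dir_path = os.path.join(work, f"dir{d:05d}")
        os.makedirs(dir_path)
        for f in range(files_per_dir):
            with open(os.path.join(dir_path, f"msg{f:06d}"), "wb") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())
    timings["create"] = time.perf_counter() - start

    # Phase 2: read every file back.
    start = time.perf_counter()
    total_read = 0
    for dir_path, _dirs, files in os.walk(work):
        for name in files:
            with open(os.path.join(dir_path, name), "rb") as fh:
                total_read += len(fh.read())
    timings["read"] = time.perf_counter() - start

    # Phase 3: delete everything that was created.
    start = time.perf_counter()
    shutil.rmtree(work)
    timings["delete"] = time.perf_counter() - start

    timings["bytes"] = total_read
    return timings
```

With an assumed 8 KiB message size and the 3GB cache implied above, files_needed(3 * 2**30, 8192) gives the file count needed to push the total data set past 6GB; split that count across n_dirs and files_per_dir and point root at a directory on the OCFS2 mount.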
Any better ideas?
Ditch iSCSI and move to Fibre Channel. A QLogic 14 port 4Gb FC switch with all SFPs included is less than $2500 USD. You already have the FC ports in your CX4. You'd instantly quadruple the bandwidth of the CX4 and that of each Xen host, from 200 to 800 MB/s and 100 to 400 MB/s respectively. Four single port 4Gb FC HBAs, one for each server, will run you $2500-3000 USD. So for about $5k USD you can quadruple your bandwidth, and lower your latency.
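The arithmetic behind those figures can be written out as a short Python sketch. The per-port rates (about 100 MB/s for 1GbE, about 400 MB/s for 4Gb FC) are the round numbers used above; the count of two active ports on the CX4 is an assumption inferred from the 200 MB/s figure, and the function name upgrade_summary is made up for illustration.

```python
GBE_MB_S = 100   # approximate usable MB/s per 1GbE port (round figure from the thread)
FC4_MB_S = 400   # approximate usable MB/s per 4Gb FC port (round figure from the thread)


def upgrade_summary(cx4_ports=2, host_ports=1,
                    switch_usd=2500, hbas_usd=(2500, 3000)):
    """Compare iSCSI over GbE with 4Gb FC using the thread's round numbers.

    cx4_ports=2 is an assumption inferred from the 200 MB/s figure.
    switch_usd is the upper bound quoted for the switch; hbas_usd is the
    quoted low/high range for four single-port HBAs.
    """
    return {
        "cx4_before": cx4_ports * GBE_MB_S,
        "cx4_after": cx4_ports * FC4_MB_S,
        "host_before": host_ports * GBE_MB_S,
        "host_after": host_ports * FC4_MB_S,
        "speedup": FC4_MB_S // GBE_MB_S,
        "cost_low_usd": switch_usd + hbas_usd[0],
        "cost_high_usd": switch_usd + hbas_usd[1],
    }


if __name__ == "__main__":
    for key, value in upgrade_summary().items():
        print(f"{key}: {value}")
```

These are nominal link rates only; real throughput depends on the disks behind the array and on the workload.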
I don't recall if you ever told us what your user load is. How many concurrent Dovecot user sessions are you supporting on average?
Appreciate your help!
No problem. SANs are one of my passions. :)
-- Stan