Thank you for your responses, Stan. My replies are below.
For that many users I'm guessing you can't physically stuff enough RAM into the machines in your ESX cluster to use a ramdisk for the index files, and if you could, you probably couldn't, or wouldn't want to, afford the DIMMs required to meet the need.
Yes, I have a cluster of 4 ESX servers. I am going to do some
scripting to see how much space we are allocating to indexes.
- In my setup I have 25,000+ users and almost 7,000,000
messages in my maildirs. How much memory would I need in a ramdisk to hold the indexes for that?
- What happens if something fails? I think that if I
lose the indexes (e.g. after a kernel crash), the ramdisk will be empty the next time I boot the system, so the indexes would have to be recreated. Am I right?
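
For the scripting mentioned above, something along these lines is what I have in mind. It is only a rough sketch: it assumes the index files keep Dovecot's usual "dovecot.index*" names (dovecot.index, dovecot.index.cache, dovecot.index.log) and live somewhere under a single mail root. The function name and the example path are hypothetical.

```python
import os

# Assumption: index files follow Dovecot's usual "dovecot.index*" naming
# (dovecot.index, dovecot.index.cache, dovecot.index.log, ...).
INDEX_PREFIX = "dovecot.index"


def index_usage(root):
    """Return (total_bytes, file_count) for index files found under root."""
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(INDEX_PREFIX):
                continue
            try:
                # lstat() so that symlinks are not followed and counted twice.
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or became unreadable during the walk; skip it.
                continue
            file_count += 1
    return total_bytes, file_count
```

Calling index_usage("/var/vmail") (a hypothetical mail root) returns the total size in bytes and the number of index files. Note that st_size is the apparent file size, not the blocks allocated on disk, so it is only a first estimate of what a ramdisk would have to hold.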
Given the size of your mail user base, I'd probably avoid the ramdisk option, and go with a couple of striped (RAID 0) 100+ GB SSDs connected on the iSCSI SAN. This is an ESX cluster of more than one machine correct? You never confirmed this, but it seems a logical assumption based on what you've stated. If it's a single machine you should obviously go with locally attached SATA II SSDs as it's far cheaper with much greater real bandwidth by a factor of 100:1 vs iSCSI connection.
My SAN(s) (HP LeftHand Networks) do not support SSDs, though. But I
have several LeftHand nodes, some of them with RAID 5, others with RAID 1+0. Maildirs and indexes are now on RAID 5; maybe I can move the indexes to a RAID 1+0 iSCSI target on a different SAN.
- If I buy an SSD system and export that small, fast
storage via iSCSI, does zlib compression apply to indexes?
Timo will have to answer this regarding zlib on indexes.
That would be rather interesting.
- Any additional filesystem info? I am using ext3 on RHEL 5.5; ext4
will be supported in RHEL 5.6. Any performance hints or tuning advice (I already use noatime and a 4k block size)?
I'm shocked you're running 25K mailboxen with 7 million messages on maildir atop EXT3! On your fast iSCSI SAN array, I assume with at least 14 spindles in the RAID group LUN where the mail is stored, you should be using XFS.
I have two RAID 5 arrays (7 disks + 1 spare) and I have joined them via LVM
striping. Each disk is a 15k rpm 450 GB SAS drive, and the SANs have 512 MB of battery-backed cache. In our real workload (imapsync), each RAID 5 array gives around 1,700-1,800 IOPS, about 3,500 IOPS combined.
Formatted with the correct parameters, and mounted with the correct options, XFS will give you _at minimum_ a factor of 2 performance gain over EXT3 with 128 concurrent users. As you add more concurrent users, this ratio will grow even greater in XFS' favor.
Sadly, Red Hat Enterprise Linux 5 does not natively support XFS. I
can install it via CentOSPlus, but we need Red Hat support if something goes VERY wrong. Red Hat Enterprise Linux 6 supports XFS (and gives me dovecot 2.0), but maybe it is "too early" for a RHEL6 deployment for so many users (sigh).
I will continue investigating the indexes. Any additional hints?
Regards
Javier