Re: [Dovecot] Question about "slow" storage but fast cpus, plenty of ram and dovecot

14 Dec 2010


      ...
Javier, your number and type of spindles aren't the source of your I/O
bottleneck.  Your array controller is.  Or, more specifically, its lack
of enough fast network ports is the problem.  The P4300 BK718A model of
which you speak only has two 1GbE iSCSI ports.  That's only 200 MB/s
full duplex.  That's less link bandwidth than a single 2 Gbit fiber
channel interface.  2 Gbit FC is 2 generations and over 8 years old now.
It was superseded by 4 Gbit FC and then 8 Gbit FC, which is the current
standard.  8 Gbit FC provides 800 MB/s full duplex bandwidth per link.
 Hmm... I do not fully agree with your reasoning. An IMAP server does
a lot of random I/O and needs little bandwidth. Our LeftHand Networks
units have two 1 GbE ports in adaptive load balancing (a kind of
active-active connection), meaning 4 Gbps full duplex (2 Gbps in each
direction).
 I am attaching a screenshot of the performance of the LeftHand:
average 15 MB/s and 1,700 IOPS. The highest load (today) is ~62 MB/s,
with a whopping 9,000 IOPS, well above the theoretical IOPS of 2 RAID 5
arrays of 8 disks each (SAS 15K). The cache is working as expected, and
the queue depth is 226 (a bit overloaded, though).
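 As a rough sanity check on that theoretical figure for two 8-disk
RAID 5 arrays, here is a back-of-the-envelope sketch. The 175 IOPS per
15K SAS disk and the write penalty of 4 are common rules of thumb that
I am assuming, not values measured on the LeftHand, and the function
names are made up for illustration.

```python
# Rule-of-thumb estimate of random IOPS for RAID 5 arrays.
# Both constants are assumptions, not measurements from this system.
PER_DISK_IOPS = 175      # assumed random IOPS of one 15K SAS disk
RAID5_WRITE_PENALTY = 4  # a small RAID 5 write costs 2 reads + 2 writes


def raid5_iops(disks, read_fraction, per_disk=PER_DISK_IOPS):
    """Host-visible IOPS one array can sustain for a given read/write mix."""
    raw = disks * per_disk
    write_fraction = 1.0 - read_fraction
    # Each host write consumes RAID5_WRITE_PENALTY back-end I/Os.
    return raw / (read_fraction + write_fraction * RAID5_WRITE_PENALTY)


def estimate(arrays=2, disks_per_array=8, read_fraction=0.5):
    """Total estimate for several identical arrays, ignoring any cache."""
    return arrays * raid5_iops(disks_per_array, read_fraction)


if __name__ == "__main__":
    for rf in (1.0, 0.5, 0.0):
        print(f"read fraction {rf:.1f}: ~{estimate(read_fraction=rf):.0f} IOPS")
```

 Under these assumptions even a pure-read workload comes out at about
2,800 IOPS from the spindles alone, so a 9,000 IOPS peak can only be
reached with help from the cache.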
...
There are many other fine SAN arrays on the market.  I mention Nexsan
merely because they are very affordable, fast as hell for the price,
easy to manage, and I've used them.  Due to the low price, they lack the
more advanced features of higher priced units, such as snapshots, remote
replication, etc.  As I mentioned, I'm not a big fan of relying on SAN
controllers to perform all my disaster recovery functions, so I do that
with software, such as VCB, and a dedicated backup server with a SAN
attached robotic library.  Thus, the lack of this functionality in the
Nexsan units is not an issue here.
 We have more than 65k users, and we need active-active disaster
recovery. Half of our ESX cluster and LeftHands are in one location,
the other half is in another location, and we have tested that if a
complete site is shut down we can recover in minutes (the VMs simply
start launching at the other one). Only LeftHand (in the cheap market)
gave us real-time replication of storage, a kind of "network RAID 1"
for our storage...
 I still think that my problem is IOPS related, not bandwidth related.
My maximum bandwidth today was 60 MB/s, which fits entirely in 1 Gbps,
but the queue depth is high because of the large number of IOPS (9,000)
that "only" 16 disks cannot handle. I can buy better storage heads to
delay all those writes, or avoid a lot of them by putting the indexes
on an SSD or in a ramdisk.
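 The arithmetic behind this can be sketched as follows. It uses
decimal units, ignores iSCSI/TCP overhead, and assumes the throughput
and IOPS peaks come from the same interval, which the screenshot may
not guarantee. The function names are only illustrative.

```python
def link_utilization(throughput_mb_s, link_gbps):
    """Fraction of raw link capacity in use (protocol overhead ignored)."""
    link_mb_s = link_gbps * 1000 / 8  # 1 Gbps is 125 MB/s raw
    return throughput_mb_s / link_mb_s


def avg_io_size_kb(throughput_mb_s, iops):
    """Average size of one I/O in KB, if both figures share an interval."""
    return throughput_mb_s * 1000 / iops


if __name__ == "__main__":
    print(f"link utilization: {link_utilization(60, 1):.0%}")
    print(f"average I/O size: {avg_io_size_kb(60, 9000):.1f} KB")
```

 With 60 MB/s and 9,000 IOPS this gives roughly half of a single
1 Gbps link and an average I/O of under 7 KB, which is the pattern of
many small random operations rather than a saturated link.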
 Thanks for all the info, I did not know about Nexsan.

 Regards

 Javier

Javier de Miguel Rodríguez