On Wed, 2012-03-28 at 11:07 -0500, Stan Hoeppner wrote:
> Locally attached/internal/JBOD storage typically offers the best application performance per dollar spent, until you get to things like backup scenarios, where off-node network throughput is very low and your backup software may suffer performance deficiencies, as is the issue titling this thread. Shipping full or incremental file backups across Ethernet is extremely inefficient, especially with very large filesystems. This is where SAN arrays with snapshot capability come in really handy.
I'm a new employee at the company. I was a bit surprised they were not
using iSCSI; they claim they just can't risk the extra latency. I believe you are right: it seems to me that offloading snapshots and backups to an iSCSI SAN would improve things. The problem is that this company has been burned on storage solutions more than once, so they are skeptical that any product can scale to what they need. Some SAN vendor names are a four-letter word here. So far, their newest FC SAN is performing well. I think having more, smaller iSCSI boxes would be a good solution. One problem I've seen with smaller iSCSI products is that features like snapshotting are not well implemented: they work, but any sort of automation around them can be painful.
> The snap takes place wholly within the array and is very fast, without the problems you see with host-based snapshots such as Linux LVM, where you must first freeze the filesystem and then wait for the snapshot to complete, which can take a very long time on a 1TB FS. While this occurs, your clients must wait, or time out while trying to access mailboxes. With a SAN array snapshot system this isn't an issue: the snap is transparent to hosts, with little or no performance degradation while it runs. Two relatively inexpensive units that have such snapshot capability are:
How does this work? I've always had Linux create the snapshot. Wouldn't
the SAN taking a snapshot without any OS buy-in leave the filesystem saved in an inconsistent state? I know that ext4's journaling is pretty good, but still, wouldn't this be a problem?
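For what it's worth, the usual belt-and-braces answer on the host side is to quiesce the filesystem around the array snapshot with fsfreeze(8) from util-linux; a journaling FS like ext4 will also recover a bare, unquiesced array snapshot to a crash-consistent state, the same as after a power loss. A minimal sketch, where `san_snap` is a hypothetical stand-in for whatever CLI the vendor's array ships (the fsfreeze calls are real commands):

```shell
# Quiesce the FS so the array-side snapshot is clean, not merely
# crash-consistent. `san_snap` is a hypothetical vendor snapshot CLI;
# fsfreeze is from util-linux and needs root when run for real.
snap_consistent() {
    mnt="$1"
    run="${2:-echo}"                    # default: dry-run, just print commands
    "$run" fsfreeze --freeze "$mnt" &&
    "$run" san_snap create mailstore    # hypothetical array command
    "$run" fsfreeze --unfreeze "$mnt"   # always thaw, even if the snap failed
}

snap_consistent /srv/mail               # dry run: prints the three commands
```

The freeze window here is milliseconds, because only the snapshot *trigger* happens between freeze and thaw; the array does the actual copy-on-write work afterwards, which is exactly the advantage over host-side LVM snapshots.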
> http://www.equallogic.com/products/default.aspx?id=10613
> http://h10010.www1.hp.com/wwpc/us/en/sm/WF04a/12169-304616-241493-241493-241...
>
> The Equallogic units are 1/10 GbE iSCSI only, IIRC, whereas the HP can be had with 8Gb FC, 1/10Gb iSCSI, or 6Gb direct-attach SAS. Each offers four or more host/network connection ports when equipped with dual controllers. There are many other vendors with similar models/capabilities; I mention these simply because Dell/HP are very popular and many OPs are already familiar with their servers and other products.
I will take a look. I might have some convincing to do.
> There are 3 flavors of ZFS: native Oracle Solaris, native FreeBSD, Linux FUSE. Which were you using? If the last, that would fully explain the suck.
There is one more that I had never used before coming on board here:
ZFSonLinux, which is a native kernel-level filesystem module rather than FUSE. My understanding is that they were using it on the backup machines, with the front-end Dovecot machines using ext4. I'm told the metadata issue is a ZFS thing, and they have the same problem on Solaris/Nexenta.
I'm relatively new here, but I'll ask around about XFS and see if anyone has tested it in the development environment.
> If they'd tested it properly, and relatively recently, I would think they'd have already replaced ext4 on your Dovecot server. Unless other factors prevented such a migration. Or unless I've misunderstood the size of your maildir workload.
I don't know the entire history of things. I think they really wanted
to use ZFS for everything and then fell back to ext4 because it performed well enough in the cluster. Performance becomes an issue with backups using rsync. Rsync is faster than Dovecot's native dsync by a very large margin. I know dsync is doing more than rsync, but still, seconds compared to over five minutes? That is a significant difference. The problem is that rsync can't get a perfectly consistent backup of a live mail store.
...Jeff