[Dovecot] dsync redesign

Jeff Gustafson ncjeffgus at zimage.com
Wed Mar 28 23:54:01 EEST 2012

On Wed, 2012-03-28 at 11:07 -0500, Stan Hoeppner wrote:

> Locally attached/internal/JBOD storage typically offers the best
> application performance per dollar spent, until you get to things like
> backup scenarios, where off node network throughput is very low, and
> your backup software may suffer performance deficiencies, as is the
> issue titling this thread.  Shipping full or incremental file backups
> across ethernet is extremely inefficient, especially with very large
> filesystems.  This is where SAN arrays with snapshot capability come in
> really handy.

	I'm a new employee at the company. I was a bit surprised they were not
using iSCSI. They claim they just can't risk the extra latency. I
believe that you are right. It seems to me that offloading snapshots and
backups to an iSCSI SAN would improve things. The problem is that this
company has been burned on storage solutions more than once and they are
a little skeptical that a product can scale to what they need. There are
some SAN vendor names that are four-letter words here. So far, their
newest FC SAN is performing well.
	I think having more, smaller iSCSI boxes would be a good solution. One
problem I've seen with smaller iSCSI products is that features like
snapshotting are not implemented very well. They work, but doing any
sort of automation can be painful.

> The snap takes place wholly within the array and is very fast, without
> the problems you see with host based snapshots such as with Linux LVM,
> where you must first freeze the filesystem, wait for the snapshot to
> complete, which could be a very long time with a 1TB FS.  While this
> occurs your clients must wait or timeout while trying to access
> mailboxes.  With a SAN array snapshot system this isn't an issue as the
> snap is transparent to hosts with little or no performance degradation
> during the snap.  Two relatively inexpensive units that have such
> snapshot capability are:

	How does this work? I've always had Linux create the snapshot. Would
the SAN taking a snapshot without any OS buy-in cause the filesystem to
be saved in an inconsistent state? I know that ext4 is pretty good at
journaling, but still, wouldn't this be a problem?
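
One way to give the OS some buy-in is to quiesce the filesystem on the
host just before the array takes its snapshot. The Python sketch below
is only an illustration under stated assumptions: it shells out to
util-linux's fsfreeze, the mount point in the usage comment is made up,
and take_snapshot is a hypothetical callback standing in for whatever
tool or API the array vendor provides (nothing in this thread says what
that would be). Generally, a snapshot taken without a freeze is
crash-consistent at best, similar to the disk state after a sudden power
loss, and a journaling filesystem is expected to recover from that by
replaying its journal.

```python
import subprocess
from typing import Callable, Sequence


def _run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


def snapshot_with_freeze(
    mountpoint: str,
    take_snapshot: Callable[[], None],
    run: Callable[[Sequence[str]], None] = _run_checked,
) -> None:
    """Freeze the filesystem, trigger the snapshot, then always thaw.

    take_snapshot is a hypothetical hook for the array-side snapshot;
    run is injectable so the ordering can be checked without root.
    """
    # Flush dirty data and block new writes on the mounted filesystem.
    run(["fsfreeze", "--freeze", mountpoint])
    try:
        # Assumed to return once the array has recorded the snapshot.
        take_snapshot()
    finally:
        # Thaw even if the snapshot step failed, so clients can resume.
        run(["fsfreeze", "--unfreeze", mountpoint])


# Hypothetical usage (needs root; path and callback are placeholders):
# snapshot_with_freeze("/srv/mail", lambda: None)
```

The try/finally is the important part: if the snapshot step fails, the
filesystem is still thawed, so mail clients are not left blocked. How
long the freeze lasts depends on how quickly the array records the snap.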

> http://www.equallogic.com/products/default.aspx?id=10613
> http://h10010.www1.hp.com/wwpc/us/en/sm/WF04a/12169-304616-241493-241493-241493.html
> The Equallogic units are 1/10 GbE iSCSI only IIRC, whereas the HP can be
> had in 8Gb FC, 1/10Gb iSCSI, or 6Gb direct attach SAS.  Each offer 4 or
> more host/network connection ports when equipped with dual controllers.
>  There are many other vendors with similar models/capabilities.  I
> mention these simply because Dell/HP are very popular and many OPs are
> already familiar with their servers and other products.

	I will take a look. I might have some convincing to do. 

> There are 3 flavors of ZFS:  native Oracle Solaris, native FreeBSD,
> Linux FUSE.  Which were you using?  If the last, that would fully
> explain the suck.

	There is one more that I had never used before coming on board here:
ZFSonLinux. ZFSonLinux is a real kernel-level filesystem module. My
understanding is that they were using it on the backup machines with the
front end dovecot machines using ext4. I'm told the metadata issue is a
ZFS thing and they have the same problem on Solaris/Nexenta. 

> > 	I'm relatively new here, but I'll ask around about XFS and see if
> > anyone had tested it in the development environment.
> If they'd tested it properly, and relatively recently, I would think
> they'd have already replaced EXT4 on your Dovecot server.  Unless other
> factors prevented such a migration.  Or unless I've misunderstood the
> size of your maildir workload.

	I don't know the entire history of things. I think they really wanted
to use ZFS for everything and then fell back to ext4 because it
performed well enough in the cluster. Performance becomes an issue with
backups using rsync. Rsync is faster than Dovecot's native dsync by a
very large margin. I know that dsync is doing more than rsync, but
still, seconds compared to over five minutes? That is a significant
difference. The problem is that rsync can't get a perfect backup.

