[Dovecot] Best filesystem?

Frank Cusack frank+lists/dovecot at linetwo.net
Tue Feb 1 05:37:34 EET 2011


On 1/31/11 9:11 PM -0600 Stan Hoeppner wrote:
> Frank Cusack put forth on 1/31/2011 3:06 PM:
>> That's incorrect.  When you fsync() a file, all sane modern filesystems
>> guarantee no data loss, unless you tune that out administratively for
>> performance reasons.  If you use a log structured filesystem (like zfs
>> or WAFL) you can optimize the performance as well.  With other types
>> of filesystems (like xfs), performance suffers severely under heavy
>> sync write loads.
>
> This depends on how the dev does his syncs.  If done intelligently, XFS
> performance won't suffer.  In fact, the preferred write method to XFS for
> high performance applications is using O_DIRECT.  Using O_DIRECT,
> correctly, with XFS, actually _increases_ write performance versus going
> through the buffer cache.  So you get the best of both worlds:  higher
> performance and data guaranteed on disk.

Most applications don't work well with O_DIRECT.  O_DIRECT is meant
as a tunable for write-mostly applications and a few other specific
classes, and a mail store is decidedly not in any of those classes.
As a data point, zfs (like all log-structured filesystems) does not
support O_DIRECT because it doesn't make sense given the on-disk
layout; there is no performance benefit to be had.

> But not all applications use fsync, O_DIRECT, et al.  The point I was
> making is that on any general system, you will likely have some
> applications/daemons writing without fsync or O_DIRECT, so you will
> likely suffer some data loss when the plug is pulled or the kernel
> crashes.  If the timing of the crash is right you can even lose data when
> using fsync.  Depends on how busy the system is and how many synced
> writes are in flight when the power drops.  There truly aren't any
> guarantees that data will always be on disk.  There are always corner
> cases where you will lose data.  Thankfully, for most of us, most of the
> time, they are _extremely_ rare.

*NO*, there are no corner cases other than those caused by
administrative knobs (e.g. always buffer metadata) or simply by bugs.
POSIX semantics require that when you call fsync(), data makes it to
disk.  Many filesystems implement this correctly, but in many cases
performance is quite poor, so most applications do not fsync() data.
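
To make the fsync() contract above concrete, here is a minimal sketch
in Python of what an application that cares about its data does.  It
is not how dovecot or any particular mail store implements delivery;
the function name durable_write and the temp-file-plus-rename pattern
are my own assumptions for illustration.  The point is only that the
function does not return until fsync() has succeeded on both the file
and its directory.

```python
import os
import tempfile


def durable_write(path, data):
    """Hypothetical helper: write data to path and return only after
    fsync() has succeeded on the file and on its parent directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the final rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # drain Python's userspace buffer
            os.fsync(tmp.fileno())   # ask the kernel to flush file data
        os.replace(tmp_path, path)   # atomically rename into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is flushed as well.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that skips the two os.fsync() calls gets the faster,
buffered behavior, and with it the possibility of losing recent writes
in a crash.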

> Read Ted's article I linked.  I didn't misquote him.  The simple point he
> was making is that unless devs specifically use fsync or other calls to
> guarantee their data is on disk, they will suffer data loss with any
> modern journaling filesystem when the power goes out or the system
> crashes.  You seem to be assuming all devs use fsync.  Apparently this is
> far from reality.

No, I do not assume all applications (not devs) use fsync.  Most don't.
Most mail applications do, or as in dovecot's case, have a knob.  If
an app does not use fsync and loses data in a crash, that is not what
I am calling data loss; it is the expected behavior for that type of
application.  Mail generally doesn't fall into that category.

>> There are two ways to guarantee no data loss with zfs, one is to disable
>> the ZIL (low performance) and the 2nd is to use a slog (high
>> performance).
>
> And exactly how does an external log device guarantee no data loss?
> External journal logs enhance performance but I've never heard of them
> being a magic cure for data loss.  XFS allows external log devices as
> well, for performance.

I'm not going to spoon feed you.  [Sorry, I couldn't resist.]

