On 1/31/11 9:11 PM -0600 Stan Hoeppner wrote:
Frank Cusack put forth on 1/31/2011 3:06 PM:
That's incorrect. When you fsync() a file, all sane modern filesystems guarantee no data loss, unless you tune that out administratively for performance reasons. If you use a log structured filesystem (like zfs or WAFL) you can optimize the performance as well. With other types of filesystems (like xfs), performance suffers severely under heavy sync write loads.
This depends on how the dev does his syncs. If done intelligently, XFS performance won't suffer. In fact, the preferred write method to XFS for high performance applications is using O_DIRECT. Using O_DIRECT, correctly, with XFS, actually _increases_ write performance versus going through the buffer cache. So you get the best of both worlds: higher performance and data guaranteed on disk.
Most applications don't work well with O_DIRECT. O_DIRECT is meant as a tunable for write-mostly applications and a few other specific classes. A mail store is decidedly not in that class of application. As a data point, zfs (like all log structured filesystems) does not support O_DIRECT because it doesn't make sense given the on-disk layout; there is no performance benefit to be had.
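For readers who have not used O_DIRECT, here is a rough sketch of what "using it correctly" involves on Linux: the buffer and the I/O size have to be aligned. The function name direct_write and the 4096-byte BLOCK are assumptions made for this illustration (the real alignment requirement depends on the device and filesystem). The sketch also calls fsync() afterwards, because O_DIRECT by itself bypasses the page cache but does not give the guarantees of a synchronous write. On a filesystem that rejects O_DIRECT, it simply returns False.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def direct_write(path, payload):
    """Try to write one aligned block with O_DIRECT.

    Returns True on success, False if O_DIRECT is unavailable or rejected.
    Hypothetical helper for illustration only.
    """
    if not hasattr(os, "O_DIRECT") or len(payload) > BLOCK:
        return False
    buf = mmap.mmap(-1, BLOCK)  # anonymous mappings are page-aligned
    try:
        buf.write(payload.ljust(BLOCK, b"\0"))  # pad to one full block
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT
        try:
            fd = os.open(path, flags, 0o600)
        except OSError:
            return False  # e.g. a filesystem that does not accept O_DIRECT
        try:
            written = os.write(fd, buf)  # bypasses the page cache
            os.fsync(fd)  # still needed to flush metadata and request a cache flush
            return written == BLOCK
        except OSError:
            return False
        finally:
            os.close(fd)
    finally:
        buf.close()
```

Even this small case needs an aligned buffer, padding to a block boundary, and a fallback path, which gives a sense of why most general-purpose applications do not bother.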
But not all applications use fsync, O_DIRECT, et al. The point I was making is that on any general system, you will likely have some applications/daemons writing without fsync or O_DIRECT, so you will likely suffer some data loss when the plug is pulled or the kernel crashes. If the timing of the crash is right you can even lose data when using fsync. Depends on how busy the system is and how many synced writes are in flight when the power drops. There truly aren't any guarantees that data will always be on disk. There are always corner cases where you will lose data. Thankfully, for most of us, most of the time, they are _extremely_ rare.
*NO* there are not any corner cases that are not due to administrative knobs (e.g. always buffer metadata) or simply due to bugs. POSIX semantics require that when you call fsync(), data makes it to disk. Many filesystems implement this correctly; however, in many cases performance is quite poor. So most applications do not fsync() data.
Read Ted's article I linked. I didn't misquote him. The simple point he was making is that unless devs specifically use fsync or other calls to guarantee their data is on disk, they will suffer data loss with any modern journaling filesystem when the power goes out or the system crashes. You seem to be assuming all devs use fsync. Apparently this is far from reality.
No, I do not assume all applications (not devs) use fsync. Most don't. Most mail applications do, or as in dovecot's case, have a knob. If an app does not use fsync, that is not what I am calling data loss. Data loss is the expected behavior for those types of applications. Mail generally doesn't fall into that category.
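To make the distinction concrete, here is a minimal sketch of what "using fsync" typically means for an application that stores one message per file. The name durable_write is hypothetical and this is not dovecot's actual code. It assumes a POSIX system where a directory can be opened and fsync()ed, and a filesystem and disk that honor fsync().

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path, syncing the file and its directory entry.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's userspace buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push the file to stable storage
        os.replace(tmp_path, path)  # atomically move the finished file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory too, so the new name itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that skips the two os.fsync() calls will usually run faster, but after a crash it may find an empty or missing file. That is the "expected behavior" case described above, not a filesystem failure.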
There are two ways to guarantee no data loss with zfs: one is to disable the ZIL (low performance), and the second is to use a slog (high performance).
And exactly how does an external log device guarantee no data loss? External journal logs enhance performance but I've never heard of them being a magic cure for data loss. XFS allows external log devices as well, for performance.
I'm not going to spoon feed you. [Sorry, I couldn't resist.]