On 1/20/11 12:06 AM -0600 Stan Hoeppner wrote:
This is amusing considering XFS is hands down the best filesystem available on any platform, including ZFS. Others are simply ignorant and repeat what they've heard without looking for current information.
Not to be overly brusque, but that's a laugh. The two "best" filesystems out there today are vxfs and zfs, for almost any enterprise workload that exists. I won't argue that xfs can't stand out for specific workloads such as sequential write; it might, and I don't know quite enough about it to be sure. But for general workloads, including a mail store, zfs is leaps ahead. I'd include WAFL in the top 3, but it's only accessible via NFS. Well, there is a SAN version, but it doesn't really give you access to the best of the filesystem feature set (a tradeoff for other features of the hardware).
Your pronouncement that others are simply ignorant is telling.
Your data isn't safe until it hits the disk. There are plenty of ways to spool data to RAM rather than committing it immediately, but they are all vulnerable to data loss until the data is written to disk.
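To make the point concrete (a sketch of my own, not from the thread): a plain write() only hands the data to the OS buffer cache, and with delayed logging the metadata sits in RAM on top of that. fsync() is the call that actually forces it down to stable storage.

```python
import os
import tempfile

# Hypothetical file path for illustration only.
path = os.path.join(tempfile.mkdtemp(), "msg")

fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
os.write(fd, b"queued message\n")  # in OS buffers; lost if power fails now
os.fsync(fd)                       # block until the data reaches the disk
os.close(fd)
```

Until that fsync() returns, the data is exactly the kind of in-flight state we're arguing about.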
The delayed logging code isn't a "ram spooler", although that is a mild side effect. Apparently I didn't explain it fully, or precisely. And keep in mind, I'm not the dev who wrote the code, so I'm merely repeating my recollection of the description in the architectural document and of what was stated on the XFS list by its author, Dave Chinner of Red Hat. ... In my desire to be brief I didn't fully or correctly explain how delayed logging works; I attempted a simplified explanation that I thought most would understand. Here is the design document: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html
I guess I understand your championing of it if you consider that a design document. That brief piece of email hardly describes it at all, and the performance numbers are pretty worthless (due to the caveat that barriers are disabled).
Given the paragraph in the "design document":
The best IO behaviour comes from the delayed logging version of XFS, with the lowest bandwidth and iops to sustain the highest performance. All the IO is to the log - no metadata is written to disk at all, which is the way this test should execute. As a result, the delayed logging code was the only configuration not limited by the IO subsystem - instead it was completely CPU bound (8 CPUs worth)...
it is indeed a "ram spooler", for metadata, which is a standard (and good) approach. That's not a side effect, that's the design. AFAICT from the brief description anyway.
This is guaranteed to lose data on power loss or drive failure.
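Which is exactly why applications that care (mail stores above all) don't rely on the filesystem's own log timing and force the commit themselves. The usual write-fsync-rename pattern looks something like this (my sketch, not anything from the thread; the function name is made up):

```python
import os
import tempfile

def durable_replace(path: str, data: bytes) -> None:
    """Commit a file so a crash leaves either the old version or the
    new one, never a torn or empty file (illustrative helper)."""
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        os.write(fd, data)
        os.fsync(fd)        # data blocks on disk before the rename
    finally:
        os.close(fd)
    os.rename(tmp, path)    # atomic replacement of the old file
    dfd = os.open(d, os.O_RDONLY)
    try:
        os.fsync(dfd)       # persist the directory entry, i.e. the metadata
    finally:
        os.close(dfd)
```

Note the second fsync on the directory: that's the step that forces the metadata out, whether or not the filesystem is delaying its log writes.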