Frank Cusack put forth on 1/20/2011 2:30 PM:
> On 1/20/11 12:06 AM -0600 Stan Hoeppner wrote:
>> This is amusing considering XFS is hands down the best filesystem available on any platform, including ZFS. Others are simply ignorant and repeat what they've heard without looking for current information.
> Your pronouncement that others are simply ignorant is telling.
So is your intentionally quoting me out of context. In context:
Me: "Prior to 2007 there was a bug in XFS that caused filesystem corruption upon power loss under some circumstances--actual FS corruption, not simply zeroing of files that hadn't been fully committed to disk. Many (uneducated) folk in the Linux world still to this day tell others to NOT use XFS because "Power loss will always corrupt your file system." Some probably know better but are EXT or JFS (or god forbid, BTRFS) fans and spread fud regarding XFS. This is amusing considering XFS is hands down the best filesystem available on any platform, including ZFS. Others are simply ignorant and repeat what they've heard without looking for current information."
The "ignorant" are those who blindly accept the false words of others regarding 4+ year old "XFS corruption on power fail" as being true today. They accept but without verification. Hence the "rumor" persists in many places.
>> In my desire to be brief I didn't fully/correctly explain how delayed logging works. I attempted a simplified explanation that I thought most would understand. Here is the design document: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html
> I guess I understand your championing of it if you consider that a design document. That brief piece of email hardly describes it at all, and the performance numbers are pretty worthless (due to the caveat that barriers are disabled).
You quoted me out of context again, intentionally leaving out the fact that I pasted the same URL twice by mistake.
Me: "In my desire to be brief I didn't fully/correctly explain how delayed logging works. I attempted a simplified explanation that I thought most would understand. Here is the design document: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html
Early performance numbers: http://oss.sgi.com/archives/xfs/2010-05/msg00329.html"
Notice the double URL paste error, Frank? Why did you twist an honest mistake into something it's not? Here's the correct link:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Do...
> Given the paragraph in the "design document":
Stop being an ass. Or get off yours and Google instead of requiring me to spoon-feed you.
>> The best IO behaviour comes from the delayed logging version of XFS, with the lowest bandwidth and iops to sustain the highest performance. All the IO is to the log - no metadata is written to disk at all, which is the way this test should execute. As a result, the delayed logging code was the only configuration not limited by the IO subsystem - instead it was completely CPU bound (8 CPUs worth)...
> it is indeed a "ram spooler", for metadata, which is a standard (and good) approach. That's not a side effect, that's the design. AFAICT from the brief description anyway.
As you'll see in the design doc, that's not the intention of the patch. XFS already had a delayed metadata update design, but the implementation was terribly inefficient. Dave increased its efficiency severalfold. The reason I mentioned it on the Dovecot list is that it directly applies to large/busy maildir-style mail stores.
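To make that concrete, here's a toy sketch in C (my illustration only, nothing to do with the actual XFS code; all names are made up) of the difference between relogging a metadata item on every transaction commit and just dirtying it in memory and writing it once at checkpoint time:

/* Toy model: repeated updates to one metadata item.
 * Old scheme: every commit writes a full log record.
 * Delayed scheme: commits dirty the in-memory item; a
 * checkpoint writes only the latest state, once. */
#include <stdio.h>

#define ITEM_SIZE 256   /* pretend a logged inode record is 256 bytes */

static long log_bytes;  /* bytes "written" to the on-disk log */
static int dirty;       /* delayed scheme: item modified since checkpoint */

static void commit_immediate(int *item, int new_val)
{
    *item = new_val;
    log_bytes += ITEM_SIZE;     /* one log record per commit */
}

static void commit_delayed(int *item, int new_val)
{
    *item = new_val;
    dirty = 1;                  /* no log IO here */
}

static void checkpoint(void)
{
    if (dirty) {
        log_bytes += ITEM_SIZE; /* latest state written once */
        dirty = 0;
    }
}

int main(void)
{
    int inode = 0;

    log_bytes = 0;
    for (int i = 0; i < 1000; i++)
        commit_immediate(&inode, i);
    printf("immediate logging: %ld bytes of log IO\n", log_bytes);

    log_bytes = 0;
    for (int i = 0; i < 1000; i++)
        commit_delayed(&inode, i);
    checkpoint();
    printf("delayed logging:   %ld bytes of log IO\n", log_bytes);
    return 0;
}

As I read the design doc, the real patch tracks dirty items in a committed item list and writes aggregated checkpoints, but the log-traffic arithmetic is the same idea.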
XFS just clobbers all other filesystems in parallel workload performance, but historically its metadata performance was anemic, about half that of other FSes. Parallel creates and deletes of large numbers of small files were thus horrible. This patch fixes that: it brings XFS metadata performance up to the level of EXT3/4, Reiser, and the others for single-process/thread workloads, and pushes it far past them for large parallel process/thread workloads, as shown in the email I linked.
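Don't take my word for it. Here's a crude microbenchmark sketch of my own (arbitrary parameters; run it in a scratch directory on a delaylog and a non-delaylog mount and compare wall clock time):

/* Metadata-heavy workload: NPROCS processes each create,
 * touch, and unlink NFILES tiny files in parallel. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS 8
#define NFILES 10000

static void worker(int id)
{
    char path[64];

    for (int i = 0; i < NFILES; i++) {
        snprintf(path, sizeof(path), "w%d-f%d", id, i);
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0) { perror("open"); exit(1); }
        (void)write(fd, "x", 1);  /* tiny payload, metadata dominates */
        close(fd);
        unlink(path);
    }
}

int main(void)
{
    for (int p = 0; p < NPROCS; p++)
        if (fork() == 0) { worker(p); _exit(0); }
    for (int p = 0; p < NPROCS; p++)
        wait(NULL);
    return 0;
}

fs_mark does this kind of test properly if you want publishable numbers.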
This now makes XFS the perfect Linux FS for maildir and [s/m]dbox on moderately to heavily loaded IMAP servers. Actually, it's now the perfect filesystem for all Linux server workloads; previously it was for all workloads but metadata-heavy ones.
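For anyone wondering why maildir specifically cares: every single delivery is a create in tmp/, a write, an fsync, and a rename into new/, i.e. almost pure metadata traffic. A simplified sketch of the delivery sequence (real MDAs also build unique names from time/pid/hostname and fsync the directories):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Deliver one message into a maildir; returns 0 on success. */
int deliver(const char *msg, size_t len, const char *uniq)
{
    char tmp[256], new[256];

    snprintf(tmp, sizeof(tmp), "Maildir/tmp/%s", uniq);
    snprintf(new, sizeof(new), "Maildir/new/%s", uniq);

    int fd = open(tmp, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);
    return rename(tmp, new);  /* atomic: message appears in new/ */
}

Multiply that by thousands of messages an hour across hundreds of mailboxes and metadata throughput is the whole game.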
> This is guaranteed to lose data on power loss or drive failure.
On power loss, on a busy system, yes. Due to a single drive failure? That's totally incorrect. How did you come to that conclusion?
As with every modern Linux filesystem that uses the kernel buffer cache (which is all of them), you will lose in-flight data that's still in the buffer cache when power drops.
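That's simply how the page cache works. A successful write() only means the data reached kernel memory; until fsync() or fdatasync() returns, a power cut can eat it, on XFS, EXT4, or anything else. A minimal sketch (ordinary POSIX calls, nothing XFS-specific):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char buf[] = "important mail\n";
    int fd = open("testfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (write(fd, buf, strlen(buf)) < 0)  /* lands in the page cache */
        { perror("write"); return 1; }
    /* power loss HERE loses the data on any filesystem */

    if (fsync(fd) < 0)                    /* forces it to stable storage */
        { perror("fsync"); return 1; }
    /* only now is the data durable (assuming honest drive caches) */

    close(fd);
    return 0;
}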
Performance always has a trade-off. The key here is that the filesystem isn't corrupted by this metadata loss. Solaris with ZFS has the same issue. You can't pipeline anything in a block device queue and not have some data loss on power failure, period. If you sync every write, you have no performance, Solaris and ZFS included.
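If you doubt the cost, here's a quick sketch you can run anywhere: the same 1000 one-byte appends, buffered versus O_SYNC. On a rotating disk with its write cache disabled, expect orders of magnitude difference; exact numbers vary wildly with the hardware:

#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Time 1000 tiny appends with the given extra open() flags. */
static double bench(int flags, const char *name)
{
    struct timespec t0, t1;
    int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC | flags, 0644);

    if (fd < 0) { perror("open"); return -1.0; }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 1000; i++)
        (void)write(fd, "x", 1);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    printf("buffered: %.4f s\n", bench(0, "buf.dat"));
    printf("O_SYNC:   %.4f s\n", bench(O_SYNC, "sync.dat"));
    return 0;
}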
-- Stan