On 17/01/2011 02:20, Stan Hoeppner wrote:
> Ed W put forth on 1/16/2011 4:11 PM:
>>> Using XFS with delayed logging mount option (requires kernel 2.6.36 or later).
>>>
>>> XFS has natively used delayed allocation for quite some time, coalescing multiple pending writes before pushing them into the buffer cache. This not only decreases physical IOPS, but it also decreases filesystem fragmentation by packing more files into each extent. Decreased fragmentation means fewer disk seeks required per file read, which also decreases physical IOPS. This also greatly reduces the wasted space typical of small file storage. Works very well with maildir, but also with the other mail storage formats.
>>
>> What happens if you pull out the wrong cable in the rack, kernel lockup/oops, power failure, hot swap disk pulled, or something else which causes an unexpected loss of a few seconds of written data?
>
> Read the XFS FAQ. These questions have been answered hundreds of times since XFS was released in IRIX in 1994. I'm not your personal XFS tutor.
Why the hostile reply?
The question went deeper than your response acknowledged?
>> Surely your IOPs are hard limited by the number of fsyncs (and size of any battery backed ram)?
>
> Depends on how your applications are written and how often they call fsync. Do you mean BBWC? WRT delayed logging BBWC is mostly irrelevant. Keep in mind that for delayed logging to have a lot of metadata writes in memory someone, or many someones, must be doing something like an 'rm -rf' or equivalent on a large dir with many thousands of files. Even in this case, the processing is _very_ fast.
You have completely missed my point.
Your data isn't safe until it hits the disk. There are plenty of ways to spool data to RAM rather than committing it, but they are all vulnerable to data loss until the data is written to disk.
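To make that concrete, here is a minimal sketch (my own illustration in Python, nothing to do with XFS internals) of what "safe" means at the application level: the write() alone only hands the bytes to the kernel's page cache, and it's the fsync() that blocks until the kernel reports the data is on stable storage.

```python
import os

def durable_append(path: str, data: bytes) -> None:
    """Append data and force it to stable storage before returning.

    Until fsync() returns, the bytes may live only in the page cache
    (or a volatile drive cache) and can vanish on power loss.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)   # data is now in the kernel's cache, not yet on disk
        os.fsync(fd)         # block until the kernel reports it committed
    finally:
        os.close(fd)
```

(Whether fsync() truly reaches the platters also depends on the drive honouring cache flushes, or a battery-backed controller standing in for them - which is exactly the caveat above.)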
You wrote: "filesystem metadata write operations are pushed almost entirely into RAM", but if the application requests an fsync then you still have to write it to disk? As such you are again limited by disk IO, which itself is limited by the performance of the device (and temporarily accelerated by any persistent write cache). Hence my point that your IOPs are generally limited by the number of fsyncs and any persistent write cache?
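My point can be demonstrated with a trivial (and admittedly crude - my own sketch, not a proper benchmark) loop: if every write is followed by an fsync, the loop can go no faster than the device can commit, regardless of how much metadata the filesystem keeps in memory. The numbers will vary wildly with the device and any persistent write cache in front of it.

```python
import os
import tempfile
import time

def fsync_rate(n: int = 50) -> float:
    """Rough ops/sec for synchronous appends: each write is followed
    by an fsync, so throughput is bounded by commit latency."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.monotonic()
        for _ in range(n):
            os.write(fd, b"x" * 512)
            os.fsync(fd)     # forces each 512-byte record to stable storage
        elapsed = time.monotonic() - start
        return n / elapsed
    finally:
        os.close(fd)
        os.unlink(path)
```

On spinning rust with no write cache this lands in the low hundreds of ops/sec at best; with a battery-backed cache it can be orders of magnitude higher - which is precisely why I keep coming back to the fsync count and the persistent cache as the real limits.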
As I write this email I'm struggling to get a server running again after it was rudely powered down by a failing UPS (the mains power was fine; the UPS itself failed...). This isn't such a rare event (IMHO), and hence I think we need to assume that at some point every machine will suffer a rude and unexpected event which loses all in-progress write cache. I have no complaints about XFS in general, but I think it's important that filesystem designers give some thought to this event and to recovering from it?
Please try not to be so hostile in your emails - we aren't all idiots here, and even if we were, your writing style is not conducive to making us want to learn from your apparent wealth of experience?
Regards
Ed W