On Tue, 09 May 2006 16:49:33 -0500 Les Mikesell lesmikesell@gmail.com wrote:
On Tue, 2006-05-09 at 16:32, Wouter Van Hemel wrote:
In real life, on general-purpose servers, the gains have been quite marginal, though. A filesystem change obviously isn't a miracle cure for performance problems; if I/O is the problem, more disks to spread the transactions over make a much bigger difference.
If you put a huge number of files in the same directory, the filesystem type can make a big difference in access time. Remember that before a new file can be created, the current list must first be scanned to see whether that name already exists, and the whole operation has to happen atomically with the directory locked. Filesystems that index their directories can help compared to a linear scan, although there are some tradeoffs. Also, some never shrink a directory when files are removed, so you continue to scan all the empty slots.
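To make the cost described above concrete, here is a minimal sketch in Python. It is a toy model, not how any real filesystem is implemented: the directory is assumed to be a plain list of slots, and the names `linear_create`, `linear_remove` and `indexed_create` are hypothetical, invented for this illustration. It only shows why a never-shrinking, linearly scanned directory stays slow after files are removed, while an indexed lookup does not depend on the number of slots.

```python
def linear_create(entries, name):
    """Create `name` in a list-backed toy directory.

    Every slot is examined to prove the name is not already present.
    Returns the number of slots examined.
    """
    examined = 0
    free_slot = None
    for i, entry in enumerate(entries):
        examined += 1
        if entry is None:
            # Remember the first empty slot so it can be reused.
            if free_slot is None:
                free_slot = i
        elif entry == name:
            raise FileExistsError(name)
    if free_slot is None:
        entries.append(name)
    else:
        entries[free_slot] = name
    return examined


def linear_remove(entries, name):
    """Remove `name` but leave an empty slot behind (the directory never shrinks)."""
    entries[entries.index(name)] = None


def indexed_create(index, name):
    """Create `name` in a toy indexed directory (a set stands in for the index)."""
    if name in index:
        raise FileExistsError(name)
    index.add(name)


if __name__ == "__main__":
    entries = []
    for i in range(1000):
        linear_create(entries, f"msg{i}")
    for i in range(999):
        linear_remove(entries, f"msg{i}")
    # Only one file is left, yet all 1000 slots are still scanned.
    print(linear_create(entries, "new"))
```

The set used for the indexed case hides the tradeoffs mentioned above (extra bookkeeping on every create and remove); the sketch only models the lookup cost.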
IIRC all the typical filesystems for Linux (ext3, xfs, jfs, reiserfs) support directory indexing, usually by means of a B-tree (on ext3, a hashed tree enabled through the dir_index feature).
It's important to note that these filesystems each have their own strengths, and performance will depend on many factors such as the size and number of files, parallelism, number and type of disks, fragmentation, I/O load, possibly even CPU load. Are we talking about a relaying mailserver or end-user storage? Do the users move or delete a lot of files? Do they mostly use IMAP or POP3? What other activities run on the machine? How do you see the reliability/performance trade-off?
In real life, things aren't as clean-cut as in most of those generic benchmarks, and people tend to attach too much importance to them and then usually get into silly flamewars. :)
Long ago I used a benchmark program called 'postmark' to test the speed of the file creation/deletion operations that are typical in maildir environments. I haven't been able to find it recently, although the last time I mentioned it someone said it was in the Debian repositories and available via apt-get.
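For anyone who cannot find postmark, a rough stand-in for the kind of create/delete churn described above can be sketched in a few lines of Python. This is not postmark and does not reproduce its workload or options; the function name `churn` and its parameters (`n_files`, `size`) are assumptions made up for this sketch. It simply times the creation and then the removal of many small files in one directory.

```python
import os
import tempfile
import time


def churn(directory, n_files=1000, size=2048):
    """Create then delete `n_files` small files in `directory`.

    Returns (seconds spent creating, seconds spent deleting).
    """
    payload = b"x" * size
    names = [os.path.join(directory, f"msg{i:06d}") for i in range(n_files)]

    start = time.perf_counter()
    for name in names:
        # O_EXCL: fail rather than overwrite if the name already exists.
        fd = os.open(name, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            os.write(fd, payload)
        finally:
            os.close(fd)
    created = time.perf_counter()

    for name in names:
        os.unlink(name)
    done = time.perf_counter()

    return created - start, done - created


if __name__ == "__main__":
    # Point tempfile at the filesystem under test via its dir= argument if needed.
    with tempfile.TemporaryDirectory() as d:
        create_s, delete_s = churn(d)
        print(f"create: {create_s:.3f}s  delete: {delete_s:.3f}s")
```

Timings from a sketch like this are dominated by the page cache unless the data is forced to disk, so they say little about a loaded mail server; treat the numbers as a sanity check at most.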
I seem to remember having used it once too, also in the distant past.
Now, I'm not sure how valid the results would be if, for instance, there's a webserver serving dynamic webmail pages at the same time...
In the past, I've spent (wasted) quite some time benchmarking things like FreeBSD vs Linux, Perl vs PHP, template systems, etc. Now I believe that people should just pick what they feel comfortable with, because the differences are often not that large and it's rarely worth their time and money.
(Though, that's often not what people want to hear. :) )