Timo Sirainen wrote:
On Wed, 2009-10-14 at 23:59 +0200, Mikkel wrote:
In the case of mdbox, wouldn't you have the very same problem, since larger files may be fragmented all over the disk just like many small files in a directory might be?
I guess this depends on the filesystem. But the files would typically be about 2 MB in size. I think filesystems usually copy more data around to avoid fragmentation.
In any case if there are expunged messages, files containing them would be recreated (nightly or something). That'll unfragment the files.
It would be nice if this recreation interval (nightly, weekly, monthly?) was made tunable. Some users would have mailboxes of several hundred megabytes, and having to recreate thousands of these every night because of a single mail being expunged per day could result in a huge performance hit.
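To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It compares two cases: only the ~2 MB file containing the expunged mail gets rewritten, or (worst case) the whole mailbox gets rewritten. The mailbox count and mailbox size are made-up example values and the function names are hypothetical; only the 2 MB file size comes from the discussion above.

```python
# Rough estimate of nightly rewrite volume. All inputs except the ~2 MB
# file size are hypothetical example values, not measurements.
FILE_SIZE_MB = 2  # typical mdbox file size mentioned in this thread


def per_file_rewrite_mb(mailboxes, files_touched_per_mailbox,
                        file_size_mb=FILE_SIZE_MB):
    """Volume if only the files containing expunged mails are recreated."""
    return mailboxes * files_touched_per_mailbox * file_size_mb


def whole_mailbox_rewrite_mb(mailboxes, mailbox_size_mb):
    """Worst-case volume if every affected mailbox is rewritten in full."""
    return mailboxes * mailbox_size_mb


# Example: 5000 mailboxes, one expunge per mailbox per day,
# 300 MB per mailbox (assumed values).
print(per_file_rewrite_mb(5000, 1))         # 10000 MB, about 10 GB
print(whole_mailbox_rewrite_mb(5000, 300))  # 1500000 MB, about 1.5 TB
```

Either way the cost scales with how often the recreation runs, which is why a tunable interval would help.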
And finally one thing I've also been thinking about has been that perhaps new mails could be created into separate individual files. A nightly run would then gather them together into a larger file.
I think this could be a great idea in some setups and a pretty bad one in others. In my setup, for instance, incoming emails account for about half the disk write activity, and although activity is somewhat lower during the night, there is still a lot of activity around the clock.
In the proposed design, all incoming emails would have to be written to disk twice (if I got it right?), and the second write would happen at a time when there is still relatively high activity (because there always is).
So for this to increase performance overall, there would have to be a fairly large initial gain to justify the double writing. Also, most emails are either read shortly after arriving or not at all, so the primary access to the mails would happen while they are still located in single files.
My point is that there are many reasons why such a design might actually lead to poorer performance, so it should probably be tested extensively before being implemented.
But this could be a really nice solution if implemented in such a way that the incoming mails are stored in single files on a separate configurable device (which would optimally be a flash device) and then moved to the actual storage at night.
So I can definitely see the point in mdbox, but I'd better stay away from it since I'm using NFS... :/
What kind of index-related errors have you seen in logs? Dovecot can handle most index corruptions without losing much (if any) data. Everything related to dovecot.index.cache can at least be ignored.
The errors I get are like these two:
dovecot: Oct 01 15:13:05 Error: POP3(account@domain): Transaction log file /local/account_homedir/Maildir/dovecot.index.log: marked corrupted
dovecot: Oct 01 15:13:57 Error: IMAP(another_account@domain): Corrupted transaction log file /local/another_homedir/Maildir/dovecot.index.log seq 229: duplicate transaction log sequence (229) (sync_offset=32860)
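In case it helps to see how often this happens per account, a minimal Python sketch for tallying such lines could look like this. The regular expression is only inferred from the two lines above and is not a documented Dovecot log format, and the function name is made up for illustration, so it may need adjusting for other versions or log settings.

```python
import re
from collections import Counter

# Pattern inferred from the two sample lines in this mail (an assumption):
# "dovecot: <Mon DD HH:MM:SS> Error: <SERVICE>(<user>): <message>"
LINE_RE = re.compile(
    r"^dovecot: (?P<when>\w{3} \d{2} \d{2}:\d{2}:\d{2}) Error: "
    r"(?P<service>[A-Za-z0-9]+)\((?P<user>[^)]*)\): (?P<message>.*)$"
)


def count_log_corruption(lines):
    """Count dovecot.index.log corruption errors per (service, user)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not an error line in the assumed format
        msg = m.group("message")
        if "dovecot.index.log" in msg and "corrupted" in msg.lower():
            counts[(m.group("service"), m.group("user"))] += 1
    return counts


sample = [
    "dovecot: Oct 01 15:13:05 Error: POP3(account@domain): Transaction log "
    "file /local/account_homedir/Maildir/dovecot.index.log: marked corrupted",
    "dovecot: Oct 01 15:13:57 Error: IMAP(another_account@domain): Corrupted "
    "transaction log file /local/another_homedir/Maildir/dovecot.index.log "
    "seq 229: duplicate transaction log sequence (229) (sync_offset=32860)",
]

for (service, user), n in count_log_corruption(sample).items():
    print(service, user, n)
```
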
Regards, Mikkel