[Dovecot] Mailbox Hashing
Kyle Wheeler
kyle-dovecot at memoryhole.net
Fri Nov 14 01:43:23 EET 2008
On Thursday, November 13 at 05:20 PM, quoth Justin Krejci:
> Is there any method for hashing the inbox automatically after say
> 5,000 messages are stored? Example
>
> $Maildir/in/0/message0
> $Maildir/in/0/message1
> $Maildir/in/0/message2
Not in Maildir. The Maildir format keeps each folder's messages in
flat cur/, new/, and tmp/ directories and does not allow splitting
them into subdirectories. It may be possible with something like dbox,
since that's a Dovecot-specific format.
In general, though, that kind of hashing is usually a workaround for a
filesystem that handles large directories poorly (such as ext2, which
searches them linearly), rather than something you'd really *want* to
do.
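For illustration only, the kind of scheme being asked about could be
sketched as below. This is not something Maildir or Dovecot does; the
function name hashed_path, the bucket count, and the choice of CRC32
are all assumptions made up for this example.

```python
import zlib
from pathlib import PurePosixPath


def hashed_path(root: str, filename: str, buckets: int = 16) -> PurePosixPath:
    """Return root/<bucket>/<filename>.

    The bucket number is the CRC32 of the file name modulo `buckets`,
    so the same name always maps to the same subdirectory.
    (Hypothetical helper; not part of Dovecot or the Maildir format.)
    """
    if buckets < 1:
        raise ValueError("buckets must be >= 1")
    bucket = zlib.crc32(filename.encode("utf-8")) % buckets
    return PurePosixPath(root) / str(bucket) / filename


if __name__ == "__main__":
    for name in ("message0", "message1", "message2"):
        print(hashed_path("Maildir/in", name))
```

Each directory then holds roughly 1/buckets of the messages, which is
the whole point of the workaround: it keeps any single directory small
on filesystems that scan directories linearly.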
The one exception might be if you want to split someone's inbox over
several filesystems, but even that could be accomplished using
something like UnionFS. Of course, we're getting outside the realm of
production-tested options here, and it would probably introduce all
kinds of potential problems with locking and such.
> I am not currently using Dovecot but am interested to know if this
> is available or does running with 20,000+ messages in a single inbox
> not affect the performance much?
It all depends on the filesystem and what operations you're doing.
Dovecot does a *lot* of caching to avoid hitting the filesystem
whenever it can. However, randomly accessing messages in your mailbox
*will* cause a filesystem access, and the speed of that depends on
having a halfway decent filesystem.
> I have looked into other file system tuning techniques such as
> enabling ext3 dir_index or using ReiserFS (maybe not ReiserFS
> anymore). There will likely be 15,000 to 20,000 accounts spread out
> on one or more servers using a 6-drive RAID10 setup. Most accounts
> are not expected to have high message quantities but there will be
> lots of concurrent connections via pop and imap (and webmail imap).
You should be fine. I'd probably encourage something more stable like
ext3 with dir_index (ReiserFS is often viewed as a purely experimental
filesystem, and not reliable for production systems). The ext3
documentation suggests that 100k-1M+ files in a single directory
should not pose a significant performance problem when using
dir_index. I haven't tried it with directories that are *that* big,
but I regularly use mailboxes with over 5k messages without problems.
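If you want a rough number for your own filesystem before committing,
something like the sketch below may help. It is only an assumption of
how one might measure this: time_random_lookups and its parameters are
made up for the example, the files are empty rather than real
messages, and the kernel's caches will flatter the result after the
first pass.

```python
import os
import random
import tempfile
import time
from typing import Optional


def time_random_lookups(n_files: int, n_lookups: int,
                        base_dir: Optional[str] = None,
                        seed: int = 0) -> float:
    """Create n_files empty files in one temporary directory and return
    the seconds spent on n_lookups random os.stat() calls.

    base_dir should point at the filesystem under test; if it is None,
    the system temp directory is used (which may be tmpfs).
    """
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory(dir=base_dir) as d:
        names = [os.path.join(d, "msg%d" % i) for i in range(n_files)]
        for name in names:
            # Create an empty placeholder file.
            with open(name, "w"):
                pass
        start = time.perf_counter()
        for _ in range(n_lookups):
            os.stat(rng.choice(names))
        return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1000, 5000, 20000):
        print(n, "files:", time_random_lookups(n, 10000), "s")
```

Run it once on a filesystem with dir_index and once without, with
base_dir set to a directory on the volume in question, and compare how
the time grows with the file count.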
~Kyle
--
A woman is like a tea bag. It's only when she's in hot water that you
realize how strong she is.
-- either Eleanor Roosevelt or Carl Sandburg