On 7/19/2011 11:35 AM, Ricardo Branco wrote:
I agree with your points on TBird: moving large numbers of messages can cause it to hang with the CPU pegged at max for ages. TBird v2 was nice and nippy, v3 acceptable, v4/v5 are just awfully slow overall. TBird uses the mbox storage format, which probably stuffs it up on large deletes/moves etc.
It's strictly a UI issue in TBird. They changed the code for drag-n-drop in the v3 betas; I reported a performance regression bug, and they never really fixed it. It's just bad code in the TBird UI: the time required to drag-n-drop N messages grows much faster than O(N). So once you get past 2000-3000 messages, the time required climbs into the stratosphere.
(Fortunately, there are other ways of moving messages: the right-click "Move To" menu, or the "File" menu in the search window. None of them is as convenient as drag-n-drop would be.)
Dovecot itself has no issue with the bigger mailboxes; the problems are mostly either client-side or in running backups.
Just did a count on our server: 350G of email (the largest single mailbox is 40G, which is 350k messages), 3.6 million+ messages in total. The biggest problem is backup. I've read that the latest rsync starts transferring right away rather than waiting for the scan to finish. I'm interested in the new mdbox format to reduce how many files we have. Try backing up small files fast enough to LTO5; I think you have to tar it all up first. I'll move all our maildirs to 10k SAS soon, hopefully, to lower the load on the SATA disks.
We back up our Maildir users to another machine on the same network using rdiff-backup. Each user's folder gets processed individually, which keeps memory usage down; it goes faster on the little mailboxes and doesn't choke as hard on the big ones. Currently we keep 27 weeks of snapshots (rdiff-backup only stores deltas each week, so it's not that much space).
We randomize the order of processing so that if a run breaks halfway through, at least a different set of accounts will have been backed up each time.
It takes about 20 minutes to back up that 6GB / 800,000 message mailbox. Other mailboxes take a few minutes or only a few seconds; the total backup window is under 2 hours for about 50GB of mail.
Just make sure you allow lots of extra inodes on the destination volume for an rdiff-backup. The same holds true for the Maildir store itself.
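To get a rough feel for how many inodes a mail store needs, something like the following could be used. It's only a sketch: count_entries is a made-up helper name, the demo tree is a throwaway, and the destination will need more than this number because rdiff-backup keeps its own metadata and increment files on top of the mirrored tree.

```shell
# Count every file and directory under a tree; each one needs an inode.
# (count_entries is a hypothetical helper, not part of any tool)
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Example with a throwaway Maildir-like tree (paths are for illustration only)
demo=$(mktemp -d)
mkdir -p "$demo/cur" "$demo/new" "$demo/tmp"
touch "$demo/cur/msg1" "$demo/cur/msg2" "$demo/new/msg3"
count_entries "$demo"
rm -rf "$demo"
```

Compare the result for the real store against the free inodes that "df -i" reports for the backup volume.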
(code snippet)
# RHEL5/CentOS5 don't have the "sort -R" option to randomize,
# so shuffle with perl instead, for example:
#   echo -e "2\n1\n3\n5\n4" |
#     perl -MList::Util -e 'print List::Util::shuffle <>'
# yes, there's probably a better way to find MailDirs
# ($FIND, $GREP, $SED, $BASE, $BKPHOST and $BKPBASE are assumed to be
# set elsewhere in the script)
DIRS=$($FIND $BASE -maxdepth 3 -name subscriptions | \
    $GREP '/var/vmail' | \
    $SED 's:^/var/vmail/::' | $SED 's:subscriptions$::' | \
    perl -MList::Util -e 'print List::Util::shuffle <>')
for DIR in ${DIRS}
do
    # back up this user's Maildir to the remote host
    rdiff-backup -v3 --print-statistics \
        --create-full-path "/var/vmail/$DIR" \
        "${BKPHOST}::${BKPBASE}${DIR}"
    # then expire increments older than 27 weeks
    rdiff-backup -v3 --force --remove-older-than 27W \
        "${BKPHOST}::${BKPBASE}${DIR}"
done
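
On a newer box the perl one-liner could probably be swapped for shuf. A minimal sketch, assuming a coreutils recent enough to ship shuf; shuffle_lines is just a name made up here, and it falls back to the same perl shuffle when shuf isn't installed:

```shell
# Shuffle lines from stdin: prefer shuf (GNU coreutils), fall back to perl.
# (shuffle_lines is a hypothetical helper name)
shuffle_lines() {
    if command -v shuf >/dev/null 2>&1; then
        shuf
    else
        perl -MList::Util -e 'print List::Util::shuffle <>'
    fi
}

# Example: print five numbers in random order
printf '%s\n' 1 2 3 4 5 | shuffle_lines
```

In the script above it would replace the last stage of the DIRS pipeline.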