On 24.1.2011, at 23.17, Sven Hartge wrote:
I'll jump into this thread, since we (TH Mittelhessen, Germany) are also investigating a move to Dovecot and are in the same situation as Javier: Courier with Maildir, and Bacula as the backup solution. We even have about the same amount of mail in our system.
And I was also wondering which storage format to use: stay with Maildir (no need to worry about indexes, just restore straight to the user's $HOME/Maildir and be done with it), use sdbox, or use mdbox.
It's probably a good idea to switch to Dovecot+Maildir first, and then, once everything seems to be working fine, switch to mdbox or sdbox.
"Expunging a message only decreases the message's refcount. The space is later freed in "purge" step. This is typically done in a nightly cronjob when there's less disk I/O activity. The purging first finds all files that have refcount=0 mails. Then it goes through each file and copies the refcount>0 mails to other mdbox files (to the same files as where newly saved messages would also go), updates the map index and finally deletes the original file."
For example, say we have m.1, m.2 and m.3, and all of these files have deleted mails in them. During the purge, all undeleted mails would go to, for example, m.4 and m.5.
Typically only new messages are deleted, so typically it would be only m.3 file that had deleted mails.
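To make the quoted purge description concrete, here is a minimal Python sketch of the typical case just mentioned, where only m.3 holds deleted mails. It is a toy model, not Dovecot's implementation: the `Mail` class, the `purge` function, the dict standing in for the map index, and the target file name m.4 are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mail:
    uid: int
    refcount: int  # number of references still pointing at this mail; 0 = fully expunged


def purge(files, map_index, target):
    """Toy model of the purge step (hypothetical, not Dovecot code).

    files:     dict of file name -> list of Mail stored in that file
    map_index: dict of mail uid -> file name currently holding the mail
    target:    file that receives the surviving mails; assumed not to be
               one of the files being purged
    """
    # Step 1: find all files that contain at least one refcount=0 mail.
    dirty = [name for name, mails in files.items()
             if any(m.refcount == 0 for m in mails)]

    for name in dirty:
        for mail in files[name]:
            if mail.refcount > 0:
                # Step 2: copy still-referenced mails to the target file
                # and update the map index to point at the new location.
                files.setdefault(target, []).append(mail)
                map_index[mail.uid] = target
            else:
                # Fully expunged mail: simply drop it from the map.
                map_index.pop(mail.uid, None)
        # Step 3: delete the original file.
        del files[name]
    return files, map_index


# Only m.3 contains an expunged mail (uid 4).
files = {
    "m.1": [Mail(1, 1), Mail(2, 1)],
    "m.2": [Mail(3, 1)],
    "m.3": [Mail(4, 0), Mail(5, 1)],
}
map_index = {1: "m.1", 2: "m.1", 3: "m.2", 4: "m.3", 5: "m.3"}

purge(files, map_index, target="m.4")
print(sorted(files))   # m.1 and m.2 are untouched, m.3 is replaced by m.4
print(map_index)
```

In this model m.1 and m.2 are never rewritten, so only the one file that contained an expunged mail changes its name on disk.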
Now Bacula backs up the mail storage and has 2 new files to back up and 3 old ones to "delete/forget" (using the accurate backup option).
Wouldn't this massively increase the size of the backup, because I end up backing up many mails multiple times?
Yes, but if you use mdbox_rotate_interval=1d and run the purging before backups, I think there's a good chance that most of the backed up mails will be in new files that Bacula hasn't seen before.
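The file-level effect on an incremental backup can be sketched with plain set arithmetic. This is a deliberate simplification that compares file names only; `backup_delta` is a hypothetical helper, not part of Bacula, and a real accurate-mode backup also looks at file attributes rather than names alone.

```python
def backup_delta(previous, current):
    """Compare two snapshots of m.* file names (hypothetical helper).

    Returns (files the backup must newly store, files it can forget).
    """
    new_files = sorted(current - previous)
    gone_files = sorted(previous - current)
    return new_files, gone_files


before = {"m.1", "m.2", "m.3"}

# Worst case from the question: every file had deleted mails,
# so the purge rewrites all of them into m.4 and m.5.
worst_new, worst_gone = backup_delta(before, {"m.4", "m.5"})

# Typical case from the reply: only the newest file, m.3, had deleted mails.
typical_new, typical_gone = backup_delta(before, {"m.1", "m.2", "m.4"})

print(worst_new, worst_gone)
print(typical_new, typical_gone)
```

In the worst case every surviving mail is backed up again under a new file name, while in the typical case only the contents of one file are.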
I thought of limiting the number of mails per mdbox file to one. That of course defeats the benefit of having multiple mails in one file, but it gives each mail a stable file name that never changes over its whole lifetime, even if the mail is moved to a different folder or its state changes.
Then you'd want to use sdbox, but that won't decrease the backup time compared to maildir, since there's the same number of files.
Problem: I may end up with hundreds of thousands of m.* files inside a user's storage area (don't ask, we really have this kind of user, and no, they are uneducable about this), even if the user has neatly sorted the mails into different IMAP folders.
I don't really understand what you're trying to say with this. m.* files aren't folder-specific anyway; all of the user's mails are in the same m.* files. And users can't really affect how m.* files are created, other than by deleting messages all around the mailbox.