convert mdbox to maildir

Peter peter at pajamian.dhs.org
Sun Aug 14 11:05:00 UTC 2022


On 14/08/22 21:47, Marc wrote:
> So? That is why you have LMTP, no? AFAIK mdbox was created to solve the (performance) issues with mbox, maildir, etc. So I just wonder what the logic is behind choosing maildir nowadays.

Maildir is probably what most people use and should continue to use.
There are cases where mbox is still viable, but nowadays they are rare
edge cases.  Put simply, mbox is one file for all the mail in a
mailbox.  It served us well in the days of POP, when clients would
download all the mail and it would be deleted from the server right
away, with no folders and mail very rarely left on the server at all.
This worked because, when a client downloaded its messages, the entire
mbox file could for the most part just be streamed straight through the
POP connection and then simply deleted or truncated.

Nowadays IMAP is prevalent, so we have multiple folders stored
server-side and mail is largely left on the server.  Messages therefore
need to be accessed in a sort of random-access style instead of just
streaming the whole lot of them down at once as was done in the POP
days.  This makes Maildir (where messages are stored one per file) much
more efficient for storage and access.  Most people should probably be
using Maildir nowadays.  It's a good format and is extremely portable,
so other tools can easily recognize and work directly with the Maildir
files.
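
As a small illustration of that portability, here is a minimal sketch
using the mailbox module from Python's standard library, which reads and
writes Maildir without Dovecot being involved at all.  The temporary
path, the example.org addresses and the subjects are made up for the
demonstration.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

# Hypothetical scratch location; a real Maildir would live in the user's home.
root = tempfile.mkdtemp()
maildir_path = os.path.join(root, "Maildir")

# create=True makes the standard cur/, new/ and tmp/ subdirectories.
box = mailbox.Maildir(maildir_path, create=True)

for n in range(3):
    msg = EmailMessage()
    msg["From"] = "sender@example.org"
    msg["To"] = "rcpt@example.org"
    msg["Subject"] = f"test {n}"
    msg.set_content("hello\n")
    box.add(msg)  # each message becomes its own file under new/

# One file per message, visible to any tool that can list a directory.
new_files = os.listdir(os.path.join(maildir_path, "new"))
subjects = sorted(m["Subject"] for m in box)
print(len(new_files), subjects)
```

Because every message is an ordinary file, things like ls, grep and
rsync work on the mailbox just as well.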

There is, however, one major issue with Maildir.  Filesystems store
files in clusters on disk (and even, I believe, on SSDs), and these
cluster sizes have been growing over the years in order to accommodate
increasingly bigger files and filesystems.  The problem is that when
you have 10,000 messages, all approximately 500 bytes in size, and a 4k
cluster size, those messages don't take up 5MB on disk.  They take up
approximately 40MB, because each file (which corresponds to one
message) takes up at least a full cluster on disk.
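
To make that arithmetic concrete, here is a rough Python sketch.  It
assumes every file is simply rounded up to a whole number of 4096-byte
clusters and ignores inodes, directory entries and any small-file
tricks a particular filesystem might play, so treat the result as an
estimate only.

```python
def on_disk_bytes(file_sizes, cluster_size=4096):
    """Total space used if each file is rounded up to whole clusters."""
    total = 0
    for size in file_sizes:
        clusters = (size + cluster_size - 1) // cluster_size  # round up
        total += clusters * cluster_size
    return total

# 10,000 messages of about 500 bytes each, one file per message.
messages = [500] * 10_000

logical = sum(messages)               # bytes of actual mail
maildir_usage = on_disk_bytes(messages)  # bytes occupied on disk

print(logical)        # 5000000  (about 5MB)
print(maildir_usage)  # 40960000 (about 40MB)
```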

To solve this we now have mdbox, which stores many messages (by default
10M worth) in one file, but not so many as to make the file unwieldy for
random access of messages.  The idea is that we can store far more
messages in the same given space, because we're not wasting most of the
disk space on the filesystem having to use a full cluster per message.
At the same time, 10M is not a huge amount of memory to allocate in RAM
to manipulate one file with, so storing 10M worth of messages in each
file tends to be a good compromise between storing all of the messages
in one file and storing one message per file.
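
A rough sketch of why packing helps, again in Python.  The packing
rule here (start a new file once the next message would push the
current one past the limit) is a simplification of my own and not
necessarily how Dovecot rotates mdbox files; it also treats 10M as
10 * 1024 * 1024 bytes and ignores mdbox's per-message metadata.

```python
def pack_into_files(message_sizes, rotate_size=10 * 1024 * 1024):
    """Greedily pack messages into files of at most rotate_size bytes."""
    files = []
    current = 0
    for size in message_sizes:
        if current and current + size > rotate_size:
            files.append(current)  # close the full file, start a new one
            current = 0
        current += size
    if current:
        files.append(current)
    return files

def on_disk_bytes(file_sizes, cluster_size=4096):
    """Total space used if each file is rounded up to whole clusters."""
    return sum(
        ((size + cluster_size - 1) // cluster_size) * cluster_size
        for size in file_sizes
    )

messages = [500] * 10_000

packed = pack_into_files(messages)
packed_usage = on_disk_bytes(packed)      # a few large files
per_file_usage = on_disk_bytes(messages)  # one file per message

print(len(packed), packed_usage)  # 1 5001216
print(per_file_usage)             # 40960000
```

With these assumptions the same 10,000 small messages fit in a single
file of about 5MB, and only the last cluster of that file is partly
wasted.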

At the end of the day, though, the storage benefits of mdbox should be 
weighed against the sheer simplicity and widespread use of Maildir.  If 
you have really huge mailboxes (like ones that contain 50,000 or more 
messages) then mdbox may be the right solution for you, but most people 
will be fine with Maildir.


Peter

