I've tried to write the new mbox code in a way that it could be moved into a generic flat-file library, with the mbox-specific code implemented on top of that.
The code would be useful for all formats that require modifying message headers to store the metadata. How many such formats are there?
I suppose all such formats could be implemented by separating mails into 3 logical parts:
- header ("From ...\n")
- mail data (headers and body)
- footer ("\n")
Except that keeping a separate header and footer is kind of annoying, so my current code uses:
- mail separator ("\nFrom ...\n")
- first message header's skip counter: how many characters to remove from the mail separator for the first message (1, for the \n before the "From ...")
- last message's footer
Somewhat uglier, but I think all needed formats could be implemented with it? It's also simpler and more efficient to implement.
Besides mbox, I know of at least two variations that are in use:
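
To make the separator / skip counter / footer idea concrete, here is a minimal sketch in Python. It is only an illustration of the scheme described above, not the actual code: the names FlatFormat and join_mails are made up, and it assumes each mail comes with its own already-built separator (for mbox the "From ..." line differs per message).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FlatFormat:
    """Hypothetical description of a flat-file mailbox format."""
    first_skip: int     # bytes dropped from the separator of the first message
    last_footer: bytes  # written once, after the last message


def join_mails(fmt: FlatFormat, mails: List[Tuple[bytes, bytes]]) -> bytes:
    """Serialize (separator, mail data) pairs into one flat file."""
    out = bytearray()
    for index, (separator, data) in enumerate(mails):
        if index == 0:
            # The first message has no previous message to terminate,
            # so the leading part of its separator is skipped.
            separator = separator[fmt.first_skip:]
        out += separator
        out += data
    if mails:
        out += fmt.last_footer
    return bytes(out)


# mbox: separator is "\nFrom ...\n", skip 1 for the leading "\n", footer "\n".
MBOX = FlatFormat(first_skip=1, last_footer=b"\n")

mails = [
    (b"\nFrom alice@example.org Thu Jan  1 00:00:00 2004\n",
     b"Subject: one\n\nfirst body\n"),
    (b"\nFrom bob@example.org Thu Jan  1 00:00:01 2004\n",
     b"Subject: two\n\nsecond body\n"),
]
mbox_bytes = join_mails(MBOX, mails)
```

With these parameters the file starts directly with "From alice...", the two mails are separated by an empty line, and the last mail is followed by the "\n" footer.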
AAAA-box:
- I remember seeing this somewhere years ago, but I'm not sure if it's still used or if it has a real name.
- separator is four ^A characters (ASCII 1) and LF
- header skip counter is 5
- last message footer would be the same ^A^A^A^A\n
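
As a rough illustration of how these three values would be used when reading such a file, here is a naive Python splitter. It is a sketch under assumptions: the function name is hypothetical, the whole file is held in memory, and it assumes the separator sequence never occurs inside mail data.

```python
SEPARATOR = b"\x01\x01\x01\x01\n"  # four ^A characters and LF
FIRST_SKIP = 5                     # the whole separator is skipped for the first mail
LAST_FOOTER = SEPARATOR


def split_aaaa_box(raw: bytes) -> list:
    """Split a file into mails (hypothetical helper, naive byte search)."""
    if not raw:
        return []
    if not raw.endswith(LAST_FOOTER):
        raise ValueError("missing footer after the last message")
    body = raw[:len(raw) - len(LAST_FOOTER)]
    # What remains of the separator in front of the first mail (empty here).
    lead = SEPARATOR[FIRST_SKIP:]
    if not body.startswith(lead):
        raise ValueError("file does not start with the expected separator")
    return body[len(lead):].split(SEPARATOR)


raw = (b"Subject: a\n\none\n" + SEPARATOR +
       b"Subject: b\n\ntwo\n" + SEPARATOR)
aaaa_mails = split_aaaa_box(raw)
```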
dotbox:
- mails are stored in SMTP format, i.e. lines beginning with '.' are prefixed with another '.'. The '.' prefixes will have to be removed for IMAP, so this isn't as simple as the others to implement.
- separator is ".\n"
- header skip counter is 2
- last message footer is ".\n"
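
The extra work for dotbox is the '.' prefixing. A small Python sketch of adding and removing the prefixes, assuming LF line endings as in the separators above (the function names are made up for illustration):

```python
def dot_stuff(data: bytes) -> bytes:
    """Prefix every line that begins with '.' with another '.'."""
    lines = data.split(b"\n")
    return b"\n".join(b"." + line if line.startswith(b".") else line
                      for line in lines)


def dot_unstuff(stored: bytes) -> bytes:
    """Remove one leading '.' from every line that begins with '.'."""
    lines = stored.split(b"\n")
    return b"\n".join(line[1:] if line.startswith(b".") else line
                      for line in lines)


original = b"Subject: dots\n\n.hidden\n..two\nplain\n.\n"
stored = dot_stuff(original)
restored = dot_unstuff(stored)
```

After stuffing, a line containing only '.' can no longer appear inside the mail data, which is what lets ".\n" work as the separator. The cost is that the stored bytes differ from what an IMAP client must see, so sizes and offsets have to account for the removed prefixes.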
Any thoughts?