On 12.9.2011, at 19.10, Dave Stubbs wrote:
I'm watching how my mail system works, and I see that procmail creates a new file in the <folder>/new directory, each time an email is received. This file is some complex combination of UIDs and things, suffixed by the server name. So far, the filename has alphanumerics, a couple underscores, and a dot or two only.
But once dovecot gets its hands on the file and moves it to the <folder>/cur directory, it starts doing "terrible" things to the filename. Now, the filename starts to have "evil" things in it, like colons and commas. Is there a way to change this?
That's how Maildir works to store message flags. If you don't like it, use something else.
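For reference, here is a minimal sketch of how the flags sit in the filename. By the Maildir convention, a file in cur/ ends in ":2," followed by flag letters (P passed, R replied, S seen, T trashed, D draft, F flagged). The function names below are made up for illustration and are not Dovecot's API.

```c
#include <string.h>

/* Hypothetical helper: returns a pointer to the flag letters that
 * follow ":2," in a Maildir filename, or NULL if the name has no
 * such suffix (as is the case for files still in new/). */
static const char *maildir_flags(const char *fname)
{
    const char *info = strrchr(fname, ':');

    if (info == NULL || strncmp(info, ":2,", 3) != 0)
        return NULL;
    return info + 3;
}

/* Hypothetical helper: returns 1 if the flag letter is present,
 * 0 otherwise. Scanning stops at a further comma, so anything
 * appended after the flag letters is ignored. */
static int maildir_has_flag(const char *fname, char flag)
{
    const char *p = maildir_flags(fname);

    if (p == NULL)
        return 0;
    for (; *p != '\0' && *p != ','; p++) {
        if (*p == flag)
            return 1;
    }
    return 0;
}
```

With a name such as "1315843800.M1P2.host,S=1000:2,RS" (an invented example), maildir_has_flag(name, 'S') reports the message as seen. Any replacement naming scheme would have to carry the same information somewhere.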
I'm asking this primarily because I use dovecot as a massive long-term email archiving system. One of the things one needs to be able to do when running a long-term archive like this is keep things as simple and accessible as possible. The reason I use maildir is that I totally buy into the "one email, one file" idea - it means I don't have to store messages in big consolidated database files that are changeable with each new version of the vendor's software release (such as exchange DBs or Outlook PST files) or that are horrible performers (such as mbox).
Dovecot v2.0's sdbox format could work for you.
One of the nice things about the maildir "each email is a separate file" idea is that you are not limited to maildir or dovecot or any other piece of software to handle, read, and process the files.
Well, sdbox isn't a good fit for that, then. The Cydir backend could possibly work, although it is missing some features that dbox has and was mainly intended as example code for a super simple mailbox format.
For instance, I would like to back up my maildir by using rsync to synchronize my dovecot-managed maildir to a Windows server running NFS. From there the files are synchronized via Windows DFS (for which there is no open source solution that comes even close) to several other servers around the continent. The only problem: the evil commas and colons in the filenames are anathema to Windows. So instead I tar the maildir folders to tgz files on the Windows server, and the tgz files are synchronized to other DR sites.
You could patch Dovecot's maildir code to use something other than commas and colons. The separators are defined in maildir-storage.h:
#define MAILDIR_INFO_SEP ':'
#define MAILDIR_EXTRA_SEP ','
#define MAILDIR_FLAGS_SEP ','

#define MAILDIR_INFO_SEP_S ":"
#define MAILDIR_EXTRA_SEP_S ","
#define MAILDIR_FLAGS_SEP_S ","
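To illustrate the effect of such a patch, here is a rough standalone sketch, not Dovecot's actual code, of how string separators like these end up in a cur/ filename. The ALT_ macros, the replacement characters ';' and '-', and the function build_cur_name are all assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical replacement separators. The characters are an
 * assumption for this sketch, not something Dovecot ships. */
#define ALT_INFO_SEP_S  ";"
#define ALT_FLAGS_SEP_S "-"

/* Hypothetical helper: writes "<base><info sep>2<flags sep><flags>"
 * into buf. Returns 0 on success, -1 if buf is too small. */
static int build_cur_name(char *buf, size_t size,
                          const char *base, const char *flags)
{
    int n = snprintf(buf, size, "%s%s2%s%s",
                     base, ALT_INFO_SEP_S, ALT_FLAGS_SEP_S, flags);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With these values, a base name of "1315843800.M1P2.host" and flags "RS" would give "1315843800.M1P2.host;2-RS" instead of "1315843800.M1P2.host:2,RS". Keep in mind that other Maildir-aware software, and any files already in cur/, would presumably still expect the standard ":2," form.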
If I could do without tar (needed solely because of the colons and commas in the dovecot filename scheme), I could minimize the time to back up (synchronizing only changes), and a lot of other benefits would open up. A simple one: if I could configure dovecot to append the .eml extension to every file, I would have instant access to them using Search Server Express, which can read eml files but strongly prefers to use extensions to recognize them as such. (Technically each file in a maildir is an eml file whether or not the extension is present; eml is just a raw mail file like what you'd find in a maildir.)
The message flags would still have to be stored somewhere if not in the filename. dbox and cydir store them in Dovecot's index files.
To be clear: I'm not requesting that dovecot's file naming convention be changed to match my quirky requirements - I'm just asking if it could be made configurable, so I could change it to match my needs and others could change it to match theirs. In the interests of REALLY being able to use the elegantly simple idea of each mail being a separate file, I'm trying to get more out of that great pile of folders and files I'm amassing in my mail archive server. The more use I can make of them with software other than dovecot (e.g. data crawling, indexing, easy recovery in a catastrophe, etc.) the more valuable this format is.
Is this possible?
One last possibility is to create your own mailbox format that works exactly like you want.