At 10PM +0000 on 26/03/13 you (Andre Rodier) wrote:
The Perl script in the Dovecot distribution that converts mbox files into maildirs is old, and it crashed many times in the middle of the process. I had a look at the script, and gave up trying to fix it.
I found a Python script that was supposed to crawl this folder structure and replicate it using IMAP commands, but it crashed as well, and restarting the process would import the same messages twice. The script is here: http://costela.net/2011/06/importing-an-outlook-pst-into-imap/
I found another Python script that worked better and seemed to be well written, but it only handles one mbox to one IMAP folder. It can be found here: http://imap-upload.svn.sourceforge.net/viewvc/imap-upload/trunk/ I have modified it and added some minor fixes:
- Recursively traverse a folder structure, and replicate it using IMAP commands on the server.
- Properly manage folder names with special characters. (Dovecot can manage these characters using the listescape plugin.)
- Avoid taking all the resources of the server (a quick and dirty hack that may change).
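As a rough illustration of the first point only (this is not code from the script itself), the traversal could look something like the sketch below. The function name `iter_mbox_folders` and the `delimiter` parameter are made up for this example, and it assumes every file under the root directory is an mbox.

```python
import os


def iter_mbox_folders(root, delimiter="/"):
    """Yield (mbox_path, imap_folder_name) for every file under root.

    Hypothetical helper: assumes each file in the tree is one mbox and that
    the server's hierarchy delimiter is passed in by the caller.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Path relative to the root, e.g. "Archive/2012"
            relative = os.path.relpath(path, root)
            # Swap the OS path separator for the IMAP hierarchy delimiter
            yield path, delimiter.join(relative.split(os.sep))
```

Each pair could then be fed to imaplib's `IMAP4.create()` and `IMAP4.append()`. The sketch does nothing about special characters in folder names, which still have to be escaped or encoded separately.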
If I'm reading this right, it's reading a tree of mboxes? You should be able to convert this to any format Dovecot understands (maildir, dbox) with dsync, without having to go through IMAP. You would need to configure dsync to read the mboxes just as you would have configured Dovecot; for a sync from temporary mboxes you probably want to use INDEX=MEMORY to avoid having to mess about creating index files.
I am not an expert in Python, and the script was written quickly to fit my needs. However, I think it can easily be adapted to any configuration. In the future, maybe this script can use the libpst Python bindings to import the emails directly. The latest version of the modified script is here: https://github.com/arodier/EmailTools/tree/master/Migration. Do not hesitate to help me make the script as generic as possible, particularly if you are a Python expert.
Well, on my quick look, I don't much like this line:
ad = float(open("/proc/loadavg").readline().split(" ")[:3][0])
I would be surprised if Python didn't provide a portable way to get at that information... let's see (I don't really speak Python)... oh yes, os.getloadavg().
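Something along these lines, perhaps. This is only a sketch of a throttle built on os.getloadavg(), not what the script currently does: the function name `wait_for_low_load`, the threshold and the polling interval are all invented for the example, and os.getloadavg() is only available on Unix-like systems.

```python
import os
import time


def wait_for_low_load(max_load=2.0, poll_seconds=5.0,
                      get_load=os.getloadavg, sleep=time.sleep):
    """Block until the 1-minute load average is at or below max_load.

    Hypothetical helper: get_load and sleep are injectable so the loop can
    be exercised without touching the real system load.
    """
    while True:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages
        one_minute, _five, _fifteen = get_load()
        if one_minute <= max_load:
            return one_minute
        sleep(poll_seconds)
```

Calling wait_for_low_load() before each upload would give the same back-off behaviour without reading /proc/loadavg by hand, so it should also work on the BSDs and Mac OS X.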
I am posting this on the list because I think you may be interested if you are in the same situation as me. The license is not specified, but I will probably use GPLv3.
Without wishing to get into a licence war, there are a lot of people who object to the GPLv3, for good reasons. Do you have a good reason for changing it from the MIT licence used by the original?
Ben