[Dovecot] Converting Outlook .PST's (was: Suggested IMAP Directory Size..)

Benjamin R. Haskell dovecot at benizi.com
Mon Oct 15 19:17:59 EEST 2007


On Mon, 15 Oct 2007, Ilo Lorusso wrote:

> [...]
> I know the users also have large Outlook PST files (4.5 GB) and am wondering
> if I could also integrate those into IMAP?

It can be done, but it is a nightmare. For post-2003(?) Outlook .PST's, 
the only sensible, non-commercial path I could find was through 
Thunderbird's import. Uploading directly from Outlook to the server (even 
if you ran a local server!) was painfully slow, and rendered the Outlook 
user's computer unusable for the duration.

(If you're feeling lucky, Google libpst. Maybe your Outlook is old 
enough that libpst supports its .PST format.)

Via Thunderbird:

1. Open all the .PST's you want to convert in Outlook, and, if possible, 
make sure those are the only .PST's open.

2. Be sure to 'compact'/'compress' each one, to get rid of deleted 
messages (this doesn't purge those still in 'Deleted Items'. Uggh.).

3. Make sure Outlook is completely closed, and not accessing any .PST's.

4. Open Thunderbird.

5. Import mail from Outlook.

This gets you mbox files with the same hierarchy that you had in Outlook. 
I then wrote some Perl scripts to deal with these. In my case, I was 
combining several users' folders into a single shared hierarchy. Maybe you 
can run some mbox2maildir program and be done with it.
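
For the simple case, a minimal sketch of such a conversion using Python's 
standard mailbox module follows. It is not the Perl I used; the function 
name mbox_to_maildir is made up for illustration, and it assumes you call 
it once per mbox file that Thunderbird produced. It converts standard 
Status/X-Status flags only, so anything Thunderbird records elsewhere 
(e.g. X-Mozilla-Status) is carried over as a plain header, not as a 
Maildir flag.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from one mbox file into a Maildir; return the count."""
    # create=False: fail loudly if the mbox file does not exist.
    src = mailbox.mbox(str(mbox_path), create=False)
    # create=True: make the cur/new/tmp directories if they are missing.
    dest = mailbox.Maildir(str(maildir_path), create=True)
    copied = 0
    try:
        for message in src:
            # MaildirMessage() translates standard mbox flags to Maildir flags.
            dest.add(mailbox.MaildirMessage(message))
            copied += 1
        dest.flush()
    finally:
        src.close()
        dest.close()
    return copied
```

You would still need to walk the folder hierarchy yourself and decide how 
each mbox file maps onto a Maildir folder name on the server.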

Caveats:

- If possible, change the location of Thunderbird's profile directory to a 
short path name (e.g. C:\convert). The default path to local folders,
C:\Documents and Settings\%USER%\Application Data\Thunderbird\Profiles\(random string)\Mail\Local Folders
chews up approximately 100 characters of your 255-character limit for 
path names.

- Thunderbird will mangle folder names that contain 'odd' characters. I 
never figured out what characters caused trouble, but the following were 
definitely OK: [A-Za-z0-9. ]
(I found the odd folder names by running:
find (dirname) -type d | perl -lnwe 'print if /[\da-f]{8}/'
The mangled names always ended in a string of hexadecimal digits.)

- Thunderbird doesn't seem to like non-Latin-1 headers. (I didn't find 
this out until someone noticed it a while after the conversion.) In 
practice this means QP-encoded headers in other character sets. (In my 
case, ISO-2022-JP.)
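
To check a converted tree for the last two caveats before uploading 
anything, something like the following Python sketch might help. It is 
not what I ran; the function names are made up for illustration. The 
folder check applies the same eight-hex-digit test as the find/perl 
one-liner, but to each directory name, not the full path. The header 
check assumes that the affected messages are the ones whose headers 
contain RFC 2047 encoded words (=?charset?B?...?= or =?charset?Q?...?=) 
or raw non-ASCII characters.

```python
import mailbox
import os
import re

# Same idea as the Perl one-liner: a run of eight hex digits in the name.
HEX_RUN = re.compile(r"[0-9a-f]{8}")
# Start of an RFC 2047 encoded word, e.g. =?ISO-2022-JP?B?...?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?")


def suspicious_folders(root):
    """Return directories under root whose name contains eight hex digits."""
    hits = []
    for dirpath, dirnames, _files in os.walk(root):
        for name in dirnames:
            if HEX_RUN.search(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def messages_with_encoded_headers(mbox_path):
    """Return (message key, header name) pairs for headers that use
    encoded words or contain raw non-ASCII characters."""
    hits = []
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for key, message in box.iteritems():
            for name, value in message.items():
                text = str(value)
                if ENCODED_WORD.search(text) or not text.isascii():
                    hits.append((key, name))
    finally:
        box.close()
    return hits
```

Anything either function reports is only a candidate for a manual look; 
a folder can legitimately have hex digits in its name.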

Best of luck. I don't envy your task. :-)

-- Ben

