[Dovecot] Converting Outlook .PST's (was: Suggested IMAP Directory Size..)
Benjamin R. Haskell
dovecot at benizi.com
Mon Oct 15 19:17:59 EEST 2007
On Mon, 15 Oct 2007, Ilo Lorusso wrote:
> [...]
> I know the users also have large Outlook PST files (4.5 GB) and am wondering
> if I could also integrate those into IMAP?
It can be done, but it is a nightmare. For post-2003(?) Outlook .PST's,
the only sensible, non-commercial path I could find was through
Thunderbird's import. Uploading directly from Outlook to the IMAP server
(even a server running on the local machine!) was horrendously, painfully
slow, and rendered the Outlook user's computer unusable for that time.
(If you're feeling lucky, Google libpst. Maybe your Outlook is old
enough that libpst supports its .PST format.)
Via Thunderbird:
1. Open all the .PST's you want to convert in Outlook, and, if possible,
make sure those are the only .PST's open.
2. Be sure to 'compact'/'compress' each one, to get rid of deleted
messages (excluding those in 'Deleted Items'. Uggh.).
3. Make sure Outlook is completely closed, and not accessing any .PST's.
4. Open Thunderbird.
5. Import mail from Outlook.
This gets you mbox files with the same hierarchy that you had in Outlook.
I then wrote some Perl scripts to deal with these. In my case, I was
combining several users' folders into a single shared hierarchy. Maybe you
can run some mbox2maildir program and be done with it.
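For the "mbox2maildir" route, here is a minimal sketch using Python's standard mailbox module. It is not the Perl scripts mentioned above, and the function name mbox_to_maildir is made up for illustration. It converts a single mbox file. Walking Thunderbird's folder tree and mapping it onto your own Maildir layout is left out, and message flags are not carried over.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from one mbox file into a Maildir; return the count."""
    # create=False: fail loudly if the mbox file is missing instead of making one
    src = mailbox.mbox(mbox_path, create=False)
    # create=True: build the Maildir (tmp/, new/, cur/) if it does not exist yet
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            dest.add(message)  # delivered into new/ with no flags set
            copied += 1
    finally:
        src.close()
        dest.close()
    return copied
```

Try it on a copy of one small folder first, e.g. mbox_to_maildir("Inbox", "Maildir-test"), and compare message counts before trusting it with the 4.5 GB ones.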
Caveats:
- If possible, change the location of Thunderbird's profile directory to a
short path name (e.g. C:\convert). The default path to local folders,
C:\Documents and Settings\%USER%\Application Data\Thunderbird\Profiles\(random string)\Mail\Local Folders
chews up approximately 100 characters of your 255-character limit for
filenames.
- Thunderbird will mangle folder names that contain 'odd' characters. I
never figured out what characters caused trouble, but the following were
definitely OK: [A-Za-z0-9. ]
(I found the odd folder names by running:
find (dirname) -type d | perl -lnwe 'print if /[\da-f]{8}/'
They always ended in a string of hexadecimal digits.)
- Thunderbird doesn't seem to like non-Latin-1 headers. (I didn't find
this out until someone noticed it a while after the conversion.) By that
I mean QP-encoded headers in other character sets. (In my case, ISO-2022-JP.)
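The caveats above can be turned into a rough pre-flight and post-import check. This is only a sketch in Python, under assumptions: the 255-character figure is taken as given, the "safe" charset set is my guess at what counts as Latin-1-friendly, and all function names are made up for illustration. mangled_dirs applies the same 8-hex-digit test as the Perl one-liner, but only to each directory's own name, so a random-looking profile directory higher up the path does not trigger it.

```python
import mailbox
import os
import re
from email.header import decode_header

PATH_LIMIT = 255  # the limit quoted in the first caveat; adjust for your system
HEX_RUN = re.compile(r"[0-9a-f]{8}")  # same pattern as the Perl one-liner
SAFE_CHARSETS = {"us-ascii", "iso-8859-1"}  # assumption: charsets treated as safe


def chars_left(prefix, limit=PATH_LIMIT):
    """Characters left for folder names below prefix (one goes to the separator)."""
    return limit - (len(prefix) + 1)


def mangled_dirs(root):
    """Directories below root whose own name contains a run of 8 hex digits."""
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if HEX_RUN.search(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def header_charsets(value):
    """Lowercased charsets named by the encoded words in one header value."""
    return {charset.lower() for _text, charset in decode_header(value) if charset}


def risky_headers(mbox_path, names=("Subject", "From", "To", "Cc")):
    """List (message key, header name, charsets) for headers outside SAFE_CHARSETS."""
    found = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key, message in box.iteritems():
            for name in names:
                for value in message.get_all(name, []):
                    risky = header_charsets(value) - SAFE_CHARSETS
                    if risky:
                        found.append((key, name, sorted(risky)))
    finally:
        box.close()
    return found
```

With the 255-character assumption, chars_left(r"C:\convert") leaves 244 characters for folder names. Anything risky_headers reports is worth comparing by eye against the original message in Outlook.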
Best of luck. I don't envy your task. :-)
-- Ben