Adam McDougall wrote:
Timo Sirainen wrote:
On Aug 28, 2009, at 8:38 PM, Adam McDougall wrote:
Early next week I need to upload over 100,000 emails to an IMAP server as quickly as possible from an Outlook client. I am looking for any methods I can use to (temporarily?) speed up the rate at which Dovecot can accept and store IMAP uploads, whether by storing on local disk, a RAM disk, etc. For example, I can set up a temporary server on a laptop and, once the upload has finished, use standard file copying methods to transfer the mail to stable, permanent storage. I haven't been able to get more than about 7 msgs/sec upload speed from a local folder in any mail client to Dovecot (only NFS and ZFS backends tested so far, with Maildir). Is there something horribly wrong with the speed I am seeing, or are there just tricks I can try? Any tips? I'll be working on it all weekend until I find something satisfactory. It seems like I can upload mail to an Exchange server more quickly. I'll set up just about anything that my experience allows me to; I can be very resourceful with ad hoc hardware and software.
From Dovecot's side the only thing you can do is set fsync_disable=yes. The main problem is probably network latency, because Outlook doesn't support the MULTIAPPEND extension (and perhaps not even the LITERAL+ extension?). Did you already try running Dovecot on the same computer as Outlook (some virtual thingy, or maybe it works in Cygwin)?
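The latency argument above can be put into rough numbers. The following is a minimal sketch, not a measurement: it assumes a client that does not pipeline, so a plain APPEND costs two round trips per message (one waiting for the server's continuation response to the literal, one for the tagged OK), LITERAL+ removes the first wait, and MULTIAPPEND combined with LITERAL+ needs only one round trip per batch. The function names, the mode labels, and every number passed in are hypothetical.

```python
def round_trips_per_message(mode: str, batch_size: int = 1) -> float:
    """Round trips each message costs under a simplified, non-pipelined model."""
    if mode == "plain":
        # Wait for the "+" continuation, then wait for the tagged OK.
        return 2.0
    if mode == "literal_plus":
        # Non-synchronizing literal: only the tagged OK is waited for.
        return 1.0
    if mode == "multiappend_literal_plus":
        # One tagged OK for a whole batch of messages.
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        return 1.0 / batch_size
    raise ValueError(f"unknown mode: {mode!r}")


def estimated_msgs_per_sec(rtt_ms: float, server_ms_per_msg: float,
                           mode: str, batch_size: int = 1) -> float:
    """Estimate upload rate from round-trip time and per-message server time."""
    per_msg_ms = round_trips_per_message(mode, batch_size) * rtt_ms + server_ms_per_msg
    if per_msg_ms <= 0:
        raise ValueError("total time per message must be positive")
    return 1000.0 / per_msg_ms


if __name__ == "__main__":
    # Hypothetical inputs: 1 ms LAN round trip, 5 ms of server work per message.
    for mode in ("plain", "literal_plus", "multiappend_literal_plus"):
        rate = estimated_msgs_per_sec(1.0, 5.0, mode, batch_size=100)
        print(f"{mode}: about {rate:.0f} msgs/sec")
```

If the per-message server or client time dominates the round-trip time, the model predicts little gain from either extension, which would point to a bottleneck other than network latency.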
I just tried fsync_disable=yes on NFS (I had to turn off mail_nfs_index = yes as well), but the speed was the same. Do you think it would be different with a UFS or ZFS backend with fsync_disable? I have not tried running Dovecot on the same computer. When you mention Dovecot + Cygwin I think of the issues reported in the past on the mailing list, and I don't know if they were resolved. I could try Dovecot in VirtualBox, I suppose (I put it on my list to try).
I did a lot of testing today and found some things.
The two biggest real bottlenecks:
- Thunderbird is just slow at uploading to IMAP. With a bunch of small msgs it only does a few per second, and you can tell the server is sitting idle waiting for something to do. Outlook is considerably faster. Other clients not tested.
- Perdition (IMAP proxy), at least in my current setup, reduces the mail upload speed by around 50%.
Non-bottlenecks:
- fsync (I can't measure the difference at the client, but on the server I can see the behavior change)
- filesystem (nfs/ufs/zfs all performed about the same)
- server cpu
- network (an IMAP server across the local network performed about the same as one running inside VirtualBox on the same PC)
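Comparisons like the ones above can be repeated with a small timing harness. This is a minimal sketch using Python's standard imaplib, not the method used for the tests in this thread (those were done with real mail clients); the function name measure_append_rate, the host name, and the credentials in the usage comment are hypothetical.

```python
import imaplib
import time
from typing import Iterable


def measure_append_rate(conn, mailbox: str, messages: Iterable[bytes]) -> float:
    """APPEND each message to `mailbox` and return the rate in msgs/sec.

    `conn` is expected to behave like a logged-in imaplib.IMAP4 object.
    """
    count = 0
    start = time.perf_counter()
    for raw in messages:
        # No flags and no explicit internal date.
        typ, data = conn.append(mailbox, None, None, raw)
        if typ != "OK":
            raise RuntimeError(f"APPEND failed: {typ} {data!r}")
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Usage sketch (hypothetical host and credentials):
#   conn = imaplib.IMAP4("imap.example.org")
#   conn.login("user", "password")
#   sample = [b"Subject: test\r\n\r\nbody\r\n"] * 500
#   print(measure_append_rate(conn, "INBOX", sample))
#   conn.logout()
```

Running the same sample once directly against the server and once through the proxy, or once per filesystem backend, gives numbers that can be compared without the mail client's own overhead in the picture.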
I think I am satisfied with the speeds I am seeing now for the needs I have next week. Depending on the resulting speed across campus, I may run Dovecot on a portable laptop for the upload; I'll just go around the Perdition proxy (which I plan to retire in a few weeks).
During my testing I did notice an issue with Outlook 2003 on Dovecot 1.2 that I don't have with 1.1: I cannot delete an IMAP folder (maybe only after clicking on it first). I get an error about 'folder is open in another session'. It happens with a Maildir store on either a local filesystem or NFS, and I only have one client accessing it. I might have time to look into it properly tomorrow, but if not, probably not for a few days at least.
Unrelated: Outlook 2003 running on Windows 7 seems to abort the upload after just a few hundred messages with an error message. Works on XP.
Alternatively I'll take a fast way of converting Exchange email to a tree of local mbox files which I can then run mb2md on.
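For a single mbox file, the mbox-to-Maildir step can also be sketched with Python's standard mailbox module. This is a stand-in for illustration, not mb2md itself: it handles one file rather than a tree, does not create any subfolder layout, and the function name mbox_to_maildir and the paths in the usage comment are hypothetical.

```python
import mailbox


def mbox_to_maildir(mbox_path: str, maildir_path: str) -> int:
    """Copy every message from one mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            # MaildirMessage() carries over mbox status flags where it can.
            dest.add(mailbox.MaildirMessage(message))
            copied += 1
        dest.flush()
    finally:
        src.close()
        dest.close()
    return copied


# Usage sketch (hypothetical paths):
#   print(mbox_to_maildir("/tmp/export/Inbox.mbox", "/tmp/export/Maildir"))
```

A tree of mbox files would need a loop over the files plus a decision on how folder names map to the target Maildir layout, which this sketch leaves out.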
If the mails are in Exchange, can't you connect to it using IMAP?
In theory yes, but I don't have access to the actual Exchange server until Monday at the earliest, and the user is using "cached exchange mode", which in past experience leaves the possibility of local mail that is not actually on the server due to a desync. Unless I am sure it is perfectly in sync, connecting is risky: I've seen a second Outlook connect to Exchange using the native protocols and initiate a massive deletion of mail, which we had to toil to recover from obscure cache files on the original client. I don't know if an IMAP connection might trigger the same issue. For performance testing's sake, I'll see if I can upload some mail to our own Exchange server and see how fast an mbox-capable mail client can download it. I can do some limited testing in the real environment on Monday, but I'm expected to do the real migration on Tuesday unless I have to cancel. Thanks for the ideas.