[Dovecot] Quick and dirty server optimized for IMAP upload speed?
Early next week I need to upload over 100,000 emails to an IMAP server as quickly as possible from an Outlook client. I am looking for any methods I can use to (temporarily?) speed up the rate at which Dovecot can accept and store IMAP uploads, whether by storing on local disk, RAM disk, etc. For example, I can set up a temporary server on a laptop, and once the upload has finished I can use standard file copying methods to transfer the mail to stable, permanent storage. I haven't been able to get above about 7 msgs/sec upload speed from a local folder in any mail client to Dovecot (only NFS and ZFS backends tested so far, with Maildir). Is there something horribly wrong with the speed I am seeing, or are there just tricks I can try? Any tips? I'll be working on it all weekend until I find something satisfactory. It seems like I can upload mails to an Exchange server quicker. I'll set up just about anything that my experience allows me to; I can be very resourceful with ad hoc hardware and software.
Alternatively, I'll take a fast way of converting Exchange email to a tree of local mbox files, which I can then run mb2md on. I tried using Thunderbird to import the mails from Outlook, and while it was fast, it messed up the formatting of some of the mails, so I don't think I can use that. I briefly tried readpst from libpst, but it took a long time to run, used a lot of CPU, and was spewing lots of errors, so I canceled it.
Thanks for any input!
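For the mbox-to-Maildir step mentioned above, here is a minimal sketch that uses Python's standard mailbox module rather than mb2md. It is not mb2md and does not reproduce its options. It assumes one mbox file maps to one Maildir, and it does not build the dot-prefixed subfolder layout that a full folder tree would need. The function name mbox_to_maildir and both paths are hypothetical.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from one mbox file into a Maildir; return the count."""
    # create=False: fail loudly if the source mbox does not exist.
    src = mailbox.mbox(str(mbox_path), create=False)
    # create=True: make the Maildir (cur/new/tmp) if it is not there yet.
    dest = mailbox.Maildir(str(maildir_path), create=True)
    copied = 0
    try:
        for message in src:
            # add() converts the mbox message to Maildir format on the way in.
            dest.add(message)
            copied += 1
        dest.flush()
    finally:
        src.close()
        dest.close()
    return copied
```

Calling mbox_to_maildir("Inbox.mbox", "/tmp/Maildir") would write one file per message under /tmp/Maildir. It is worth spot-checking a few converted messages before trusting it with a real mailbox.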
On Aug 28, 2009, at 8:38 PM, Adam McDougall wrote:
Early next week I need to upload over 100,000 emails to an IMAP
server as quickly as possible from an Outlook client. I am looking
for any methods I can use to (temporarily?) speed up the rate at
which dovecot can accept and store IMAP uploads, whether it be
storing on local disk, ram disk, etc. I can setup a temporary
server on a laptop for example and once the upload has finished I
can use standard file copying methods to transfer the mail to
stable, permanent storage. I haven't been able to see over about 7
msgs/sec upload speed from a local folder in any mail client to
dovecot (only NFS or ZFS backend tested so far with Maildir). Is
there something horribly wrong with the speed I am seeing or are
there just tricks I can try? Any tips? I'll be working on it all
weekend until I find something satisfactory. It seems like I can
upload mails to an Exchange server quicker. I'll setup just about
anything that my experience allows me to, I can be very resourceful
with adhoc hardware and software.
From Dovecot's side the only thing you can do is fsync_disable=yes.
The main problem is probably network latency, because Outlook doesn't
support MULTIAPPEND extension (and perhaps not even LITERAL+
extension?) Did you already try running Dovecot on the same computer
as Outlook (some virtual thingy or maybe it works in cygwin)?
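To make the latency argument concrete, here is a rough back-of-the-envelope model. It is only a sketch. It assumes the client sends one message at a time and waits for each reply. A plain APPEND with a synchronizing literal then costs about two round trips (one for the "+" continuation, one for the tagged OK), LITERAL+ cuts that to about one, and MULTIAPPEND spreads one round trip over a whole batch. The function names and the example round-trip time are hypothetical. Bandwidth and client-side overhead are ignored.

```python
def upload_seconds(messages, rtt_s, round_trips_per_message,
                   server_s_per_message=0.0):
    """Estimate total upload time when each message waits on the network.

    rtt_s is the network round-trip time in seconds (an assumed value),
    server_s_per_message is whatever the server spends storing one message.
    """
    per_message = round_trips_per_message * rtt_s + server_s_per_message
    return messages * per_message


def hours_at_rate(messages, msgs_per_sec):
    """Convert an observed upload rate into total hours for a mailbox."""
    return messages / msgs_per_sec / 3600.0


if __name__ == "__main__":
    total = 100_000
    # The observed rate from the original post: about 7 msgs/sec.
    print("at 7 msgs/sec: %.1f hours" % hours_at_rate(total, 7))
    # Hypothetical 1 ms LAN round trip, no server cost, for comparison only.
    for label, trips in (("plain APPEND", 2.0), ("LITERAL+", 1.0)):
        secs = upload_seconds(total, rtt_s=0.001, round_trips_per_message=trips)
        print("%s: %.0f s of pure round-trip wait" % (label, secs))
```

At 7 msgs/sec, 100,000 messages take roughly four hours. If the round-trip wait from this model comes out far smaller than that, latency alone does not explain the observed rate.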
Alternatively I'll take a fast way of converting Exchange email to a
tree of local mbox files which I can then run mb2md on.
If the mails are in Exchange, can't you connect to it using IMAP?
Timo Sirainen wrote:
On Aug 28, 2009, at 8:38 PM, Adam McDougall wrote:
Early next week I need to upload over 100,000 emails to an IMAP server as quickly as possible from an Outlook client. I am looking for any methods I can use to (temporarily?) speed up the rate at which dovecot can accept and store IMAP uploads, whether it be storing on local disk, ram disk, etc. I can setup a temporary server on a laptop for example and once the upload has finished I can use standard file copying methods to transfer the mail to stable, permanent storage. I haven't been able to see over about 7 msgs/sec upload speed from a local folder in any mail client to dovecot (only NFS or ZFS backend tested so far with Maildir). Is there something horribly wrong with the speed I am seeing or are there just tricks I can try? Any tips? I'll be working on it all weekend until I find something satisfactory. It seems like I can upload mails to an Exchange server quicker. I'll setup just about anything that my experience allows me to, I can be very resourceful with adhoc hardware and software.
From Dovecot's side the only thing you can do is fsync_disable=yes. The main problem is probably network latency, because Outlook doesn't support MULTIAPPEND extension (and perhaps not even LITERAL+ extension?) Did you already try running Dovecot on the same computer as Outlook (some virtual thingy or maybe it works in cygwin)?
I just tried fsync_disable=yes but with NFS and had to turn off mail_nfs_index = yes as well but the speed was the same. Do you think it would be different with a UFS or ZFS backend with fsync_disable? I have not tried running dovecot on the same computer. When you mention dovecot+cygwin I think of the reported issues in the past on the mailing list and don't know if they were resolved. I could try dovecot in virtualbox I suppose (I put it on my list to try).
Alternatively I'll take a fast way of converting Exchange email to a tree of local mbox files which I can then run mb2md on.
If the mails are in Exchange, can't you connect to it using IMAP?
In theory yes, but I don't have access to the actual Exchange server until Monday at the earliest, and the user is using "cached exchange mode" which in past experience leaves the possibility of local mail which is not actually on the server due to a desync. Unless I am sure it is perfectly in sync, I've seen a second Outlook connect to Exchange using the native protocols and it initiated a massive deletion of mail which we had to toil to recover from obscure cache files on the original client. I don't know if an IMAP connection might trigger the same issue. For performance testing's sake, I'll see if I can upload some mail to our own Exchange server and see how fast an mbox capable mail client can download it. I can do some limited testing in the real environment on Monday but I'm expected to do the real migration on Tuesday unless I have to cancel. Thanks for the ideas.
On Aug 28, 2009, at 9:10 PM, Adam McDougall wrote:
From Dovecot's side the only thing you can do is fsync_disable=yes.
The main problem is probably network latency, because Outlook
doesn't support MULTIAPPEND extension (and perhaps not even LITERAL+
extension?) Did you already try running Dovecot on the same
computer as Outlook (some virtual thingy or maybe it works in
cygwin)?
I just tried fsync_disable=yes but with NFS and had to turn off
mail_nfs_index = yes as well but the speed was the same. Do you
think it would be different with a UFS or ZFS backend with
fsync_disable?
Could be. NFS clients typically wait for the NFS server to reply to writes
before continuing. I think the "async" mount option might disable that,
but I'm not sure.
Adam McDougall wrote:
Timo Sirainen wrote:
On Aug 28, 2009, at 8:38 PM, Adam McDougall wrote:
Early next week I need to upload over 100,000 emails to an IMAP server as quickly as possible from an Outlook client. I am looking for any methods I can use to (temporarily?) speed up the rate at which dovecot can accept and store IMAP uploads, whether it be storing on local disk, ram disk, etc. I can setup a temporary server on a laptop for example and once the upload has finished I can use standard file copying methods to transfer the mail to stable, permanent storage. I haven't been able to see over about 7 msgs/sec upload speed from a local folder in any mail client to dovecot (only NFS or ZFS backend tested so far with Maildir). Is there something horribly wrong with the speed I am seeing or are there just tricks I can try? Any tips? I'll be working on it all weekend until I find something satisfactory. It seems like I can upload mails to an Exchange server quicker. I'll setup just about anything that my experience allows me to, I can be very resourceful with adhoc hardware and software.
From Dovecot's side the only thing you can do is fsync_disable=yes. The main problem is probably network latency, because Outlook doesn't support MULTIAPPEND extension (and perhaps not even LITERAL+ extension?) Did you already try running Dovecot on the same computer as Outlook (some virtual thingy or maybe it works in cygwin)?
I just tried fsync_disable=yes but with NFS and had to turn off mail_nfs_index = yes as well but the speed was the same. Do you think it would be different with a UFS or ZFS backend with fsync_disable? I have not tried running dovecot on the same computer. When you mention dovecot+cygwin I think of the reported issues in the past on the mailing list and don't know if they were resolved. I could try dovecot in virtualbox I suppose (I put it on my list to try).
I did a lot of testing today and found some things.
The two biggest real bottlenecks:
- Thunderbird is just slow at uploading to IMAP. With a bunch of small msgs it only does a few per second and you can tell the server is waiting for something to do. Outlook is considerably faster. Other clients not tested.
- Perdition (IMAP proxy), at least in my current setup, slows down the mail upload speed around 50%.
Non-bottlenecks:
- fsync (I can't measure the difference at the client, but on the server I can see the behavior change)
- filesystem (nfs/ufs/zfs all performed about the same)
- server cpu
- imap server being over the local network as opposed to running inside virtualbox on the same pc
I think I am satisfied with the speeds I am seeing now for the needs I have next week. Depending on the resulting speed across campus, I may run dovecot on a portable laptop for the upload; I'll just go around the perdition proxy (plan to retire that in a few weeks).
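For anyone wanting to repeat this kind of comparison without a GUI client in the loop, a small timing harness along these lines could help. It is a sketch using Python's standard imaplib, not the method used for the tests above. The host, user, password and mailbox name are placeholders, and imaplib uploads one message per APPEND, so it measures the server and network path rather than any particular client's behavior.

```python
import imaplib
import time


def measure_append_rate(conn, mailbox_name, raw_messages,
                        clock=time.perf_counter):
    """Append each message over an open IMAP connection; return msgs/sec."""
    start = clock()
    stored = 0
    for raw in raw_messages:
        # imaplib's append(mailbox, flags, date_time, message); message is bytes.
        typ, _data = conn.append(mailbox_name, None, None, raw)
        if typ != "OK":
            raise RuntimeError("APPEND failed with status %r" % (typ,))
        stored += 1
    elapsed = clock() - start
    return stored / elapsed if elapsed > 0 else float("inf")


def run_against_server(host, user, password, raw_messages, mailbox_name="INBOX"):
    """Connect to a test server (placeholder credentials) and time the upload."""
    conn = imaplib.IMAP4(host)
    try:
        conn.login(user, password)
        return measure_append_rate(conn, mailbox_name, raw_messages)
    finally:
        conn.logout()
```

Running the same message set once directly against the server and once through the proxy would give a rough number for the proxy's overhead.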
During my testing I did notice an issue with Outlook 2003 on Dovecot 1.2 that I don't have with 1.1: I cannot delete an IMAP folder (maybe after clicking on it first). I get an error about 'folder is open in another session'. It happens with a Maildir store on a local filesystem or on NFS, and I only have one client accessing it. I might have time to look into it properly tomorrow, but if not, probably not for a few days at least.
Unrelated: Outlook 2003 running on Windows 7 seems to abort the upload after just a few hundred messages with an error message. Works on XP.
On 8/30/2009, Adam McDougall (mcdouga9@egr.msu.edu) wrote:
The two biggest real bottlenecks:
- Thunderbird is just slow at uploading to IMAP. With a bunch of small msgs it only does a few per second and you can tell the server is waiting for something to do.
Did you try TBird 3.0b3? It has many, many IMAP improvements...
--
Best regards,
Charles
On Sun, Aug 30, 2009 at 12:33:43PM -0400, Charles Marcus wrote:
On 8/30/2009, Adam McDougall (mcdouga9@egr.msu.edu) wrote:
The two biggest real bottlenecks:
- Thunderbird is just slow at uploading to IMAP. With a bunch of small msgs it only does a few per second and you can tell the server is waiting for something to do.
Did you try TBird 3.0b3? It has many, many IMAP improvements...
Wow it sure is faster at uploading, thanks for mentioning it!
On 8/30/2009 11:34 PM, Adam McDougall wrote:
The two biggest real bottlenecks:
- Thunderbird is just slow at uploading to IMAP. With a bunch of small msgs it only does a few per second and you can tell the server is waiting for something to do.
Did you try TBird 3.0b3? It has many, many IMAP improvements...
Wow it sure is faster at uploading, thanks for mentioning it!
Any chance you can follow-up with some comparisons, even if they are rough guesstimates on your part?
Some of the other fixes include not downloading attachments every time you click on a message, allowing you to store messages offline 'on demand' (as you click on them), as opposed to forcing you to download an entire folder, controls for people with limited bandwidth and/or storage, etc...
There are some things I really don't like about the UI, and none of my extensions I need work, so I don't use it yet for anything other than occasional testing, but I'm really looking forward to the release and some maturity...
It combined with dovecot will provide a really most excellent imap experience. :)
--
Best regards,
Charles
participants (3)
- Adam McDougall
- Charles Marcus
- Timo Sirainen