[Dovecot] Outlook (2010) -> Dovecot (IMAP) >10x slower with high network load and many folders
Hi,
I am seeing more than 10x slower performance when completing a "send/receive" from an Outlook 2010 client to Dovecot via IMAP, but only when the LAN is fully loaded with other traffic, e.g. file copying. The problem seems to occur while Outlook is trying to identify folders that have changed since the last "send/receive", and is thus traversing the folder hierarchy.
Observations:
- Apple's Mail.app does not have problems when exposed to the same environment.
- Eliminating Outlook is not an option.
- The tests were performed during the Easter vacation, so almost no other clients were using the network/mail server. Under normal load, 200-300 users use the network.
Description of the environment:
Server sw: Mac OS X Server 10.6.8 running dovecot: 1.1.20apple0.5 (OS = fully updated 10.6.x)
Server hw: Xserve Quadcore Intel Xeon 2.26 GHz, 12 GB RAM
- CPU load never exceeds 20%; mail is stored on a Promise VTrak RAID connected via Fibre Channel
Client: Windows 7, Outlook 2010
- The client has maybe 50 folders (each with about a handful of subfolders); the size of the mailbox is around 3-5 GB.
Test results: CLIENT-1 is having the problems when CLIENT-2 is using all the (100Mbps) bandwidth eg. copying files to MAIL-SRV. If I move CLIENT-1 to CLIENT-3 then almost all the delay is gone. NB.: I have not (yet) tested if the problem also exists when CLIENT-2 generates traffic to MAIL-SRV as opposed to OTHER-SRV (but I am expecting the same problems).
When dumping the traffic on CLIENT-1 (with Wireshark), one thing catches my eye in the 'bad' case: there is a very long delay, ca. 0.3 seconds, after each "Request: IDLE" until the next "Request: DONE". In the 'good' setup, the pause at the same place in the communication is less than 1 ms!
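To get a feel for how such a per-folder pause could add up over one "send/receive", here is a rough back-of-the-envelope sketch. It assumes (this is not measured in the thread) that "a handful" means 5 subfolders per folder and that Outlook pauses once per folder it visits; the 0.3 s and 1 ms figures are the ones quoted above.

```python
# Rough estimate of the accumulated IDLE -> DONE pause per send/receive.
# Assumptions (not from measurements): 5 subfolders per top-level folder,
# and exactly one pause per folder visited.
TOP_LEVEL_FOLDERS = 50          # "maybe 50 folders"
SUBFOLDERS_PER_FOLDER = 5       # assumed value for "a handful"


def total_pause_seconds(pause_per_folder: float) -> float:
    """Return the summed pause if every folder costs one pause."""
    folders = TOP_LEVEL_FOLDERS * (1 + SUBFOLDERS_PER_FOLDER)
    return folders * pause_per_folder


bad = total_pause_seconds(0.3)      # 'bad' case, ca. 0.3 s per pause
good = total_pause_seconds(0.001)   # 'good' case, under 1 ms per pause
print(f"bad: about {bad:.0f} s, good: about {good:.1f} s")
```

Under these assumptions the pause alone would be enough to turn a sub-second folder scan into one that takes over a minute.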
So why this delay? Where/how shall I continue my debugging?
- Run dtruss/dtrace scripts on the server?
- Get dovecot to output more debug info (I guess it's doing its best, so no problems will be seen there…)?
Are there any dovecot settings that can be altered to work around [what I think is a limitation in Outlook's IMAP implementation]? The current dovecot configuration is Apple's defaults (+ POP3 disabled).
I am seeing one warning from dovecotd -n, though:
Warning: fd limit 256 is lower than what Dovecot can use under full load (more than 456). Either grow the limit or change login_max_processes_count and max_mail_processes settings
- But I see the same warning on a cleanly installed, not-yet-configured OS X Server, so I guess it's not 'that' bad.
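As a quick sanity check of the numbers in that warning, the following minimal sketch compares a file descriptor limit against the 456 quoted above. Note the assumption: it reads the limit of the Python process itself, which is not necessarily the limit the Dovecot daemon is started with.

```python
import resource

REQUIRED_FDS = 456  # figure quoted in the Dovecot warning


def limit_is_sufficient(soft_limit: int, required: int = REQUIRED_FDS) -> bool:
    """True if the soft fd limit is above what the warning asks for."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > required  # the warning says "more than 456"


# Limit of the current process (an assumption: the daemon's may differ).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard} sufficient={limit_is_sufficient(soft)}")
```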
Physical setup:

+---------------+ +------------------------------+ +-----------------+
| CLIENT-1      | | CLIENT-2                     | | CLIENT-3        |
| Outlook "bad" | | Traffic generator, eg. Samba | | Outlook "good"  |
+---------------+ +------------------------------+ +-----------------+
        |                         |                         |
+-------------------------------------------------+         |
| 100Mbps switch (clients)                        |         |
+-------------------------------------------------+         |
        |                                                   |
+------------------------------------------------------------------------+
| 1Gbps switch (for servers)                                             |
+------------------------------------------------------------------------+
               |                      |
+-----------------------------+ +-----------+
| MAIL-SRV: Dovecot, AFP, SMB | | OTHER-SRV |
+-----------------------------+ +-----------+
Wireshark dump - IMAP communication
(left: 10.211.55.3 port 49433, right: 10.0.0.10 port 143)

Time    Direction  IMAP
9.851   <------    Response: gatq OK Logged in.
9.852   ------>    Request: o47u SELECT "1_GROUPS"
9.853   <------    Response: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
9.854   ------>    Request: 3y4b NOOP
9.854   <------    Response: 3y4b OK NOOP completed.
9.855   ------>    Request: 4vlj IDLE
9.856   <------    Response: + idling
10.108  ------>    Request: DONE
10.108  <------    Response: 4vlj OK Idle completed.
10.108  ------>    Request: wh89 SELECT "1_GROUPS.Adm"
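The pause in the dump above can be measured mechanically. The sketch below is only an illustration: the event list is typed in by hand from the dump ("C" marks lines sent from port 49433, "S" lines sent from port 143), and the helper name idle_gaps is made up for this example.

```python
# Events transcribed by hand from the Wireshark dump above.
# "C" = sent from port 49433 (client), "S" = sent from port 143 (server).
EVENTS = [
    (9.851, "S", "gatq OK Logged in."),
    (9.852, "C", 'o47u SELECT "1_GROUPS"'),
    (9.853, "S", "* FLAGS (\\Answered \\Flagged \\Deleted \\Seen \\Draft)"),
    (9.854, "C", "3y4b NOOP"),
    (9.854, "S", "3y4b OK NOOP completed."),
    (9.855, "C", "4vlj IDLE"),
    (9.856, "S", "+ idling"),
    (10.108, "C", "DONE"),
    (10.108, "S", "4vlj OK Idle completed."),
]


def idle_gaps(events):
    """Return the time from each '+ idling' response to the next DONE."""
    gaps = []
    idling_since = None
    for timestamp, sender, text in events:
        if sender == "S" and text.startswith("+ idling"):
            idling_since = timestamp
        elif sender == "C" and text == "DONE" and idling_since is not None:
            gaps.append(timestamp - idling_since)
            idling_since = None
    return gaps


for gap in idle_gaps(EVENTS):
    print(f"IDLE pause: {gap * 1000:.0f} ms")
```

Since the capture was taken on CLIENT-1, this gap is time that passed on the client side between seeing "+ idling" and sending DONE.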
BR Thomas von Eyben
It seems that you have packet loss in the network. Mac OS X and Windows have different network stacks, so this may account for the different behavior.
-- Best regards, Adrian Minta
On 4/6/2012 3:52 AM, Thomas von Eyben wrote:
Test results: CLIENT-1 is having the problems when CLIENT-2 is using all the (100Mbps) bandwidth eg. copying files to MAIL-SRV. If I move CLIENT-1 to CLIENT-3 then almost all the delay is gone. NB.: I have not (yet) tested if the problem also exists when CLIENT-2 generates traffic to MAIL-SRV as opposed to OTHER-SRV (but I am expecting the same problems).
So the link between your 100 Mbps switch and the 1 Gbps switch is saturated by CLIENT-2, so CLIENT-1 is just getting the leftovers?
Since CLIENT-3 doesn't go through that 100 Mbps switch, it obviously doesn't see that issue.
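To put the saturation idea into numbers, here is a small sketch of how much traffic would have to be queued ahead of an IMAP packet on the 100 Mbps link to explain a pause of roughly 0.25 s by queueing alone. The 1500-byte frame size is an assumption, and Ethernet framing overhead is ignored.

```python
LINK_BPS = 100e6      # 100 Mbps client switch / uplink
FRAME_BYTES = 1500    # assumed frame size, framing overhead ignored


def serialization_delay_s(frame_bytes: int, link_bps: float) -> float:
    """Time needed to put one frame on the wire."""
    return frame_bytes * 8 / link_bps


def frames_for_delay(delay_s: float, frame_bytes: int, link_bps: float) -> float:
    """Number of queued frames that would add up to the given delay."""
    return delay_s / serialization_delay_s(frame_bytes, link_bps)


per_frame = serialization_delay_s(FRAME_BYTES, LINK_BPS)
queued = frames_for_delay(0.25, FRAME_BYTES, LINK_BPS)
print(f"{per_frame * 1e6:.0f} us per frame, "
      f"{queued:.0f} frames (~{queued * FRAME_BYTES / 1e6:.1f} MB) for 0.25 s")
```

Whether the switch in question can actually hold that much queued traffic is not something the thread tells us.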
On Sat, Apr 7, 2012 at 3:16 AM, Willie Gillespie <wgillespie+dovecot@es2eng.com> wrote:
On 4/6/2012 3:52 AM, Thomas von Eyben wrote:
Test results: CLIENT-1 is having the problems when CLIENT-2 is using all the (100Mbps) bandwidth eg. copying files to MAIL-SRV. If I move CLIENT-1 to CLIENT-3 then almost all the delay is gone. NB.: I have not (yet) tested if the problem also exists when CLIENT-2 generates traffic to MAIL-SRV as opposed to OTHER-SRV (but I am expecting the same problems).
So the link between your 100 Mbps switch and the 1 Gbps switch is saturated by CLIENT-2, so CLIENT-1 is just getting the leftovers?
Since CLIENT-3 doesn't go through that 100 Mbps switch, it obviously doesn't see that issue.
Yes - that's my current "workaround" (perhaps also solution), I'm wondering if the performance is really expected to be _so_ bad when other users are utilizing the LAN. (You seem to indicate that what I am observing is expected and is "just" caused by [un-intended] semi-bad behavior from other users…)
BR TvE
On 2012-04-06, Thomas von Eyben <thomasvoneyben@gmail.com> wrote:
I am seeing more than 10x slower performance when completing a "send/receive" from an Outlook 2010 client to Dovecot via IMAP, but only when the LAN is fully loaded with other traffic, e.g. file copying. The problem seems to occur while Outlook is trying to identify folders that have changed since the last "send/receive", and is thus traversing the folder hierarchy.
Not sure why it would only affect Outlook clients, but if your switches are managed, you might like to check if flow control is enabled and, if so, try disabling it.
participants (4)
- Adrian Minta
- Stuart Henderson
- Thomas von Eyben
- Willie Gillespie