Ralf Hildebrandt put forth on 11/8/2010 12:44 PM:
- Stan Hoeppner stan@hardwarefreak.com:
Does this machine have more than 4GB of RAM? You do realize that merely utilizing PAE will cause an increase in context switching, whether on bare metal or in a VM guest. It will probably be even higher in a VM guest running a PAE kernel. Also, please tell me the ESX kernel you're running is native 64-bit, not 32-bit. If the VMware kernel itself is doing PAE, as well as the guest Linux kernel, this may fully explain the performance disaster you have on your hands, if it is indeed due to context switching.
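For reference, a quick way to check from inside the guest whether the kernel is PAE and whether it's 32- or 64-bit (a sketch, assuming a Linux guest; the "-pae" suffix in the release string is a Debian/Ubuntu packaging convention):

```shell
# A Debian/Ubuntu PAE kernel typically shows "-pae" in its release string:
uname -r                 # e.g. 2.6.32-23-generic-pae

# The CPU advertises the pae flag if PAE is supported at all:
grep -m1 -o 'pae' /proc/cpuinfo || echo "no pae flag"

# x86_64 here means a 64-bit kernel; i686 means 32-bit (possibly PAE):
uname -m
```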
It sure works with 1.2.x now, so that's not really the problem.
I'm not so sure we can make that assumption. I'm leaning toward something other than context switches, as they are obviously very high with VMWare, always.
The bigger question is, why does this problem surface so readily while running Dovecot 2.0.x and not while running Dovecot 1.2.x?
EXACTLY
Is 1.2.x merely tickling the dragon's chin, whereas 2.0.x is sticking its head into the dragon's mouth?
I'd say the difference between 1.2 and 2.0 is so dramatic that it's probably something else.
Given what we know, that the increase in CPU time is in guest kernel space, or at least appears to be, I'm guessing that Dovecot 2.x is making a system or library call that your kernel races on for extended periods before finally returning. Your best bet, I'm thinking, is to put a trace on each Dovecot process and find which one(s) wait the longest for system call returns. Once you know which process is triggering the problem you can start to narrow down the code segment, obviously with Timo's help. I'm starting to get out of my element at this point.
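Something like the following would be a starting point (a sketch; on the real box you'd attach strace to an actual Dovecot worker PID, e.g. `strace -c -f -p "$(pgrep -o imap)"` — here a throwaway `sleep` stands in for the worker):

```shell
# Summarize time spent per syscall (-c), following forks (-f):
strace -c -f -o /tmp/summary.txt sleep 1
cat /tmp/summary.txt          # per-syscall counts and cumulative time

# Or log every call with timestamps (-tt) and per-call wall-clock
# duration (-T) to spot the individual calls that block longest:
strace -f -tt -T -o /tmp/trace.log sleep 1
sort -t'<' -k2 -rn /tmp/trace.log | head   # longest calls first
```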
This very well may be the case. You also need to look at the CONFIG_HZ= value of the guest's Linux kernel. If it's a tickless kernel you should be fine; IIRC, tickless shows up as CONFIG_NO_HZ=y.
# fgrep HZ config-2.6.32-23-generic-pae
CONFIG_NO_HZ=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
I can't tell from that which is being used, as both tickless and 250 are configured. If it's 250 that should still be fine. That will generate in the neighborhood of 2000 interrupts/sec with 8 vCPUs, the same as a "workstation" kernel (CONFIG_HZ=1000) running on two vCPUs.
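The back-of-the-envelope arithmetic behind that figure, for a purely periodic (non-tickless) kernel:

```shell
# Each vCPU takes CONFIG_HZ timer interrupts per second when busy,
# so the guest-wide rate is roughly HZ * number_of_vCPUs:
HZ=250       # from the config output above
VCPUS=8
echo $((HZ * VCPUS))    # ~2000 timer interrupts/sec across the guest

# The "workstation" comparison: 1000 Hz on 2 vCPUs gives the same load:
echo $((1000 * 2))      # 2000
```

With CONFIG_NO_HZ=y the tick is suppressed on idle vCPUs, so this is a worst-case (all-vCPUs-busy) estimate.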
-- Stan