[Dovecot] load increase after upgrade to 2.0.8
I upgraded most of our servers from 1.2.x to 2.0.8 and am noticing a really big increase in server load. This is across the board, not any specific server. Check this screenshot of a load graph: http://grab.by/7N8V
Is there anything I should be looking at that could cause such a drastic load increase? The conf file is available here: http://wa.ter.net/download/doveconf.txt
Regards,
Cor
- Cor Bosman cor@xs4all.nl:
I upgraded most of our servers from 1.2.x to 2.0.8 and am noticing a really big increase in server load. This is across the board, not any specific server. Check this screenshot of a load graph: http://grab.by/7N8V
Yes, this looks like my graphs. Same increase. Factor 5.
Is there anything I should be looking at that could cause such a drastic load increase? The conf file is available here: http://wa.ter.net/download/doveconf.txt
We need to find out WHAT is common in our two config files!
Could you post "doveconf -n" output (I think it shows only non-default values)? The current output is a bit long.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
- Ralf Hildebrandt Ralf.Hildebrandt@charite.de:
We need to find out WHAT is common in our two config files!
My config is attached
It would be useful to know if the problem is with the imap process startup or after that. If you apply the attached patch, you can set:
service imap { service_count = 0 }
This causes the imap processes to be reused for future connections. With the patch enabled the processes keep the ability to switch back to running as root and then switch to another user, so this works with multiple UIDs.
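For reference, the setting Timo describes would look like this in dovecot.conf (a minimal sketch of just the directive he quotes; it only works safely across multiple UIDs with his patch applied):

```
# Reuse each imap process for future connections instead of
# exiting after one (0 = unlimited connections per process).
# Without the patch a process cannot switch back to root to
# change to another user's UID.
service imap {
  service_count = 0
}
```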
Hope I'm doing what you want :) It's getting kind of confusing.
Right now I have running:
3 servers with just a new kernel, no other changes
2 servers with service_count = 0, no other changes
1 server with service_count = 0 and the src/imap/main.c patch
Is that ok?
I won't be seeing much impact until tomorrow morning; not many users are on at night.
Cor
On 8.12.2010, at 22.52, Cor Bosman wrote:
Right now I have running:
3 servers with just a new kernel, no other changes
2 servers with service_count = 0, no other changes
1 server with service_count = 0 and the src/imap/main.c patch
Is that ok?
Looks good!
I won't be seeing much impact until tomorrow morning; not many users are on at night.
I'm hoping that at least the last server will see the load shrink back.
1 server with service_count = 0, and src/imap/main.c patch
Is that ok?
Looks good!
I had to revert this patch because it causes permission errors on our filesystem. Directories are being created for user X with the euid of user Y (which fails, so at least I didn't get corruption; the service just failed on that server).
Cor
On Wed, Dec 8, 2010 at 3:03 PM, Timo Sirainen tss@iki.fi wrote:
On 8.12.2010, at 22.52, Cor Bosman wrote:
1 server with service_count = 0, and src/imap/main.c patch
By this you mean service_count=0 for both service imap-login and service imap blocks, right?
Speaking from my own experience, the system loads on our dovecot boxes went up *substantially* when we upgraded kernels from 2.6.32.x and 2.6.33.x to newer ones (late 2.6.35.x and 2.6.36 -- haven't tried 2.6.36.1 yet). But I also saw loads on all sorts of other types of boxes grow when moving to 2.6.35.x and 2.6.36, so it's not necessarily dovecot-related. Though you've got plenty to choose from between 2.6.27.x and up.
Getting 'imap-login' and 'pop3-login' set to service_count=0 and 'pop3' and 'imap' set to service_count=1000 (as per Timo's suggestion) helped keep the boxes from spinning into oblivion. To reduce the enormous number of context switches, I've got 'pop3's client_limit set to 4. I played around with 'imap's client_limit between 1 and 5 but haven't quite found the sweet spot yet. pop3 with client_limit 4 seems to work pretty well. That brought context switches down from 10,000-30,000 to sub-10,000. These are somewhat crummy Dell 850s (2 cores, 2 gigs) but lots of them, dedicated to just POP/IMAP, storage on NFS (indexes on local disk though), running 2.0.7 across the board (60 boxes or so).
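Spelled out as configuration, the tuning Mark describes would look roughly like this (a sketch with the values as stated in the thread; the comments are our summary, not a definitive recommendation):

```
service imap-login {
  service_count = 0      # login processes stay around and serve many connections
}
service pop3-login {
  service_count = 0
}
service imap {
  service_count = 1000   # recycle each process after 1000 connections
  client_limit = 5       # still experimenting between 1 and 5
}
service pop3 {
  service_count = 1000
  client_limit = 4       # cut context switches from 10-30k/s to sub-10k
}
```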
- Mark Moseley moseleymark@gmail.com:
On Wed, Dec 8, 2010 at 3:03 PM, Timo Sirainen tss@iki.fi wrote:
On 8.12.2010, at 22.52, Cor Bosman wrote:
1 server with service_count = 0, and src/imap/main.c patch
By this you mean service_count=0 for both service imap-login and service imap blocks, right?
Speaking from my own experience, the system loads on our dovecot boxes went up *substantially* when we upgraded kernels from 2.6.32.x and 2.6.33.x to newer ones (late 2.6.35.x and 2.6.36 -- haven't tried 2.6.36.1 yet). But I also saw loads on all sorts of other types of boxes grow when moving to 2.6.35.x and 2.6.36, so it's not necessarily dovecot-related. Though you've got plenty to choose from between 2.6.27.x and up.
We're on 2.6.32 and the load only goes up when I change dovecot (not when I change the kernel, which I didn't do so far)
Getting 'imap-login' and 'pop3-login' set to service_count=0 and 'pop3' and 'imap' set to service_count=1000 (as per Timo's suggestion) helped keep the boxes from spinning into oblivion. To reduce the enormous number of context switches, I've got 'pop3's client_limit set to 4. I played around with 'imap's client_limit between 1 and 5 but haven't quite found the sweet spot yet. pop3 with client_limit 4 seems to work pretty well. That brought context switches down from 10,000-30,000 to sub-10,000.
Interesting. Would that spawn a lot of pop3 processes? On the other hand, almost nobody is using pop3 here
On Wed, Dec 8, 2010 at 11:54 PM, Ralf Hildebrandt Ralf.Hildebrandt@charite.de wrote:
- Mark Moseley moseleymark@gmail.com:
On Wed, Dec 8, 2010 at 3:03 PM, Timo Sirainen tss@iki.fi wrote:
On 8.12.2010, at 22.52, Cor Bosman wrote:
1 server with service_count = 0, and src/imap/main.c patch
By this you mean service_count=0 for both service imap-login and service imap blocks, right?
Speaking from my own experience, the system loads on our dovecot boxes went up *substantially* when we upgraded kernels from 2.6.32.x and 2.6.33.x to newer ones (late 2.6.35.x and 2.6.36 -- haven't tried 2.6.36.1 yet). But I also saw loads on all sorts of other types of boxes grow when moving to 2.6.35.x and 2.6.36, so it's not necessarily dovecot-related. Though you've got plenty to choose from between 2.6.27.x and up.
We're on 2.6.32 and the load only goes up when I change dovecot (not when I change the kernel, which I didn't do so far)
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you. I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35. Even with similar CPU and I/O utilization, the load average on a >2.6.35 box is much higher than pre-2.6.35, and it also seems to react more quickly; more jitter, I guess. That's based on nothing scientific though.
Getting 'imap-login' and 'pop3-login' set to service_count=0 and 'pop3' and 'imap' set to service_count=1000 (as per Timo's suggestion) helped keep the boxes from spinning into oblivion. To reduce the enormous number of context switches, I've got 'pop3's client_limit set to 4. I played around with 'imap's client_limit between 1 and 5 but haven't quite found the sweet spot yet. pop3 with client_limit 4 seems to work pretty well. That brought context switches down from 10,000-30,000 to sub-10,000.
Interesting. Would that spawn a lot of pop3 processes? On the other hand, almost nobody is using pop3 here
Upping the client_limit actually results in fewer processes, since a single process can service up to #client_limit connections. When I bumped up the client_limit for imap, my context switches plummeted. Though as Timo pointed out on another thread the other day when I was asking about this, when that proc blocks on I/O, it's blocking all the connections that the process is servicing. Timo, correct me if I'm wildly off here -- I didn't even know this existed before a week or two ago. So you can then end up creating a bottleneck, which is why I've been playing with finding a sweet spot for imap. I figure that enough of a process's imap connections must be sitting in IDLE at any given moment, so setting client_limit to something like 4 or 5 isn't too bad. Though it's not impossible that by putting multiple connections on a single process, I'm actually throttling the system, resulting in fewer context switches (though I'd imagine bottlenecked procs would be blocked on I/O and rack up a lot of voluntary context switches).
- Mark Moseley moseleymark@gmail.com:
We're on 2.6.32 and the load only goes up when I change dovecot (not when I change the kernel, which I didn't do so far)
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you.
You mean even more? I'm still hoping it would decrease at some point :) I updated to 2.6.32-27-generic-pae today. I wonder what happens.
I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35. Even with similar CPU and I/O utilization, the load average on a >2.6.35 box is much higher than pre-2.6.35, and it also seems to react more quickly; more jitter, I guess. That's based on nothing scientific though.
Interesting.
Upping the client_limit actually results in fewer processes, since a single process can service up to #client_limit connections. When I bumped up the client_limit for imap, my context switches plummeted.
Which setting are you using now?
Though as Timo pointed out on another thread the other day when I was asking about this, when that proc blocks on I/O, it's blocking all the connections that the process is servicing. Timo, correct me if I'm wildly off here -- I didn't even know this existed before a week or two ago. So you can then end up creating a bottleneck, which is why I've been playing with finding a sweet spot for imap.
Blocking on /proc? Never heard that before.
I figure that enough of a process's imap connections must be sitting in IDLE at any given moment, so setting client_limit to something like 4 or 5 isn't too bad. Though it's not impossible that by putting multiple connections on a single process, I'm actually throttling the system, resulting in fewer context switches (though I'd imagine bottlenecked procs would be blocked on I/O and rack up a lot of voluntary context switches).
On Thu, Dec 9, 2010 at 11:13 AM, Ralf Hildebrandt Ralf.Hildebrandt@charite.de wrote:
- Mark Moseley moseleymark@gmail.com:
We're on 2.6.32 and the load only goes up when I change dovecot (not when I change the kernel, which I didn't do so far)
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you.
You mean even more? I'm still hoping it would decrease at some point :) I updated to 2.6.32-27-generic-pae today. I wonder what happens.
I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35. Even with similar CPU and I/O utilization, the load average on a >2.6.35 box is much higher than pre-2.6.35, and it also seems to react more quickly; more jitter, I guess. That's based on nothing scientific though.
Interesting.
Interesting, and a major PITA. If I had more time, I'd go back and iterate through each kernel leading up to 2.6.35 to see where things start to go downhill. You'd like to think each kernel version would make things just a little bit faster (or at least no worse than before). Though if it really is just more jittery and showing higher loads without actually working any harder than an older kernel, then it's just my perception.
Upping the client_limit actually results in fewer processes, since a single process can service up to #client_limit connections. When I bumped up the client_limit for imap, my context switches plummeted.
Which setting are you using now?
At the moment, I'm using client_limit=5 for imap, but I keep playing with it. I have a feeling that's too high though. If I had faster cpus and more memory on these boxes, it wouldn't be so painful to put it back to 1.
Though as Timo pointed out on another thread the other day when I was asking about this, when that proc blocks on I/O, it's blocking all the connections that the process is servicing. Timo, correct me if I'm wildly off here -- I didn't even know this existed before a week or two ago. So you can then end up creating a bottleneck, which is why I've been playing with finding a sweet spot for imap.
Blocking on /proc? Never heard that before.
I was just being lazy. I meant 'process' :) So, if I'm understanding it correctly, assume you've got client_limit=2 and you've got connection A and connection B serviced by a single process. If A does a file operation that blocks, then B is effectively blocked too. So I imagine if you get enough I/O backlog, you can create a downward spiral where you can't service the connections faster than they're coming in and you top out at the #process_limit. Which, btw, I set to 300 for imap with client_limit=5.
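Spelled out as configuration, the limits Mark mentions work out like this (a sketch; the arithmetic in the comments is ours):

```
service imap {
  client_limit  = 5     # connections handled per process
  process_limit = 300   # max concurrent imap processes
  # ceiling: 300 * 5 = 1500 concurrent IMAP connections; if one
  # connection in a process blocks on I/O, up to 4 others stall
}
```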
I figure that enough of a process's imap connections must be sitting in IDLE at any given moment, so setting client_limit to something like 4 or 5 isn't too bad. Though it's not impossible that by putting multiple connections on a single process, I'm actually throttling the system, resulting in fewer context switches (though I'd imagine bottlenecked procs would be blocked on I/O and rack up a lot of voluntary context switches).
On Thu, 2010-12-09 at 10:18 -0800, Mark Moseley wrote:
Upping the client_limit actually results in fewer processes, since a single process can service up to #client_limit connections. When I bumped up the client_limit for imap, my context switches plummeted. Though as Timo pointed out on another thread the other day when I was asking about this, when that proc blocks on I/O, it's blocking all the connections that the process is servicing.
Yeah. And it's even worse if it's blocking on waiting for a lock.
BTW. Do you have these kind of error messages in your log:
net_connect_unix(pop3) failed: Resource temporarily unavailable
net_connect_unix(imap) failed: Resource temporarily unavailable
I think those are sometimes happening when not using client_limit=1, because all the processes are busy at that time and can't accept a new connection (while with client_limit=1 a new process would be created to handle it).
On Thu, Dec 9, 2010 at 12:58 PM, Timo Sirainen tss@iki.fi wrote:
On Thu, 2010-12-09 at 10:18 -0800, Mark Moseley wrote:
Upping the client_limit actually results in fewer processes, since a single process can service up to #client_limit connections. When I bumped up the client_limit for imap, my context switches plummeted. Though as Timo pointed out on another thread the other day when I was asking about this, when that proc blocks on I/O, it's blocking all the connections that the process is servicing.
Yeah. And it's even worse if it's blocking on waiting for a lock.
BTW. Do you have these kind of error messages in your log:
net_connect_unix(pop3) failed: Resource temporarily unavailable
net_connect_unix(imap) failed: Resource temporarily unavailable
I think those are sometimes happening when not using client_limit=1, because all the processes are busy at that time and can't accept a new connection (while with client_limit=1 a new process would be created to handle it).
Yeah, I do get small bursts of those, but not enough to get too worried about. I was assuming it was hitting process_limit for imap. On one box, for all of today, I see two clusters of those errors, each lasting about 15-20 seconds.
The problem is that the smaller I set client_limit (including =1) on imap, the more likely the mass of imap processes will push these boxes into swap (and these are old 2-gig boxes; hoping to replace them soon with newer Dells, like PE 450's).
Mark Moseley put forth on 12/9/2010 12:18 PM:
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you. I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35.
This thread may be of value in relation to this issue:
http://groups.google.com/group/linux.kernel/browse_thread/thread/eb5cb488b74...
It seems there are some load issues regarding recent Linux kernels, from 2.6.34 (maybe earlier?) on up. The commit of the patch to fix it was Dec 8th--yesterday. So it'll be a while before distros get this patch out.
However, this still doesn't seem to explain Ralf's issue, where the kernel stays the same, but the Dovecot version changes, with 2.0.x causing the high load and 1.2.x being normal. Maybe 2.0.x simply causes this bug to manifest itself more loudly?
This Linux kernel bug doesn't explain the high load reported with 2.0.x on FreeBSD either. But it is obviously somewhat at play for people running these kernel versions and 2.0.x. To what degree it adds to the load I cannot say.
-- Stan
On 10.12.2010, at 0.54, Stan Hoeppner wrote:
However, this still doesn't seem to explain Ralf's issue, where the kernel stays the same, but the Dovecot version changes, with 2.0.x causing the high load and 1.2.x being normal. Maybe 2.0.x simply causes this bug to manifest itself more loudly?
Cor's debugging has so far shown that a single epoll_wait() call can sometimes generate a few thousand voluntary context switches. I can't really understand how that's possible. Those epoll_wait() calls account for about half of the total voluntary context switches generated by imap processes. We'll see tomorrow if poll() works better or if a small patch I made makes it better.
This Linux kernel bug doesn't explain the high load reported with 2.0.x on FreeBSD either.
Who has reported high load on FreeBSD? So far I know of only Ralf and Cor and both are using Linux.
- Timo Sirainen tss@iki.fi:
Cor's debugging has so far shown that a single epoll_wait() call can sometimes generate a few thousand voluntary context switches. I can't really understand how that's possible. Those epoll_wait() calls account for about half of the total voluntary context switches generated by imap processes. We'll see tomorrow if poll() works better or if a small patch I made makes it better.
That sounds promising :)
Ralf Hildebrandt put forth on 12/10/2010 1:39 AM:
- Timo Sirainen tss@iki.fi:
Cor's debugging has so far shown that a single epoll_wait() call can sometimes generate a few thousand voluntary context switches. I can't really understand how that's possible. Those epoll_wait() calls account for about half of the total voluntary context switches generated by imap processes. We'll see tomorrow if poll() works better or if a small patch I made makes it better.
That sounds promising :)
Has anyone considered a compiler issue? A gcc optimization flags issue? A gcc version issue? Something along these lines?
To generate the number of context switches/sec that has been mentioned would seem to indicate a fast loop running more times than it should without actually doing much: not using its full time slice, getting sidelined by the scheduler, yet ready to run again on every kernel tick. Could something like "-funroll-all-loops" be having problems with the 2.0.x source?
I'm talking a bit out of my element here. I know just enough about compiler technology and machine code to make me really dangerous, but not very useful. :) I'm just trying to look outside the box, I guess, or leave no stone unturned. Pick your favorite analogy.
Cor and Ralf, are you two using the same gcc version? Same flags? Are they different from the switches/flags most other 2.0.x OPs are using when they build the source?
-- Stan
- Stan Hoeppner stan@hardwarefreak.com:
Has anyone considered a compiler issue? A gcc optimization flags issue? A gcc version issue? Something along these lines?
Using gcc: gcc version 4.4.5 (Debian 4.4.5-8)
Cor and Ralf, are you two using the same gcc version? Same flags? Are they different than the switches/flags most other 2.0.x OPs are using when they make the source?
I'm using: ./configure --prefix=/usr/dovecot-2 --enable-maintainer-mode && make
- Ralf Hildebrandt Ralf.Hildebrandt@charite.de:
I'm using: ./configure --prefix=/usr/dovecot-2 --enable-maintainer-mode && make
and:

export LDFLAGS="-Wl,--as-needed"
export CPPFLAGS='-Wl,--as-needed'

but omitting those makes no difference
Attached patch should work around this. The main problem with it is that Dovecot doesn't die very easily after this; you have to kill all the processes manually. I'll probably have to add yet another pipe just for this.
But I wonder if this could be considered a kernel bug too? I think I'll write a small test program and see what lkml thinks about it. Oh, and seeing if other operating systems handle it better would be interesting too.
BTW. I actually considered this already last week but then I thought it can't be it since v1.2 did the same thing. But no, v1.2 doesn't really do the same thing.
Attached patch should work around this. The main problem with it is that Dovecot doesn't die very easily after this; you have to kill all the processes manually. I'll probably have to add yet another pipe just for this.
Yes. Plenty of killall's are now needed!
So what does this patch actually do?
All client processes used to check on the master's status FD to check if it is still alive. And you disabled that?
But I wonder if this could be considered a kernel bug too? I think I'll write a small test program and see what lkml thinks about it. Oh, and seeing if other operating systems handle it better would be interesting too.
BTW. I actually considered this already last week but then I thought it can't be it since v1.2 did the same thing. But no, v1.2 doesn't really do the same thing.
:)
On Wed, 2010-12-15 at 17:43 +0100, Ralf Hildebrandt wrote:
Attached patch should work around this. The main problem with it is that Dovecot doesn't die very easily after this; you have to kill all the processes manually. I'll probably have to add yet another pipe just for this.
Yes. Plenty of killall's are now needed!
Since it's now pretty obvious that this is the problem and this has to be fixed for Linux users, I added a proper patch to hg: http://hg.dovecot.org/dovecot-2.0/rev/b0ec48006d57
So what does this patch actually do?
All client processes used to check on the master's status FD to check if it is still alive. And you disabled that?
Yes. And the new patch uses a new pipe fd.
BTW. Postfix's behavior is similar (unsurprisingly, since I basically copied its design for v2.0). I wonder how much of a problem this is with Postfix. Is a similar patch or a kernel fix going to drop context switches 10x there too?
Am 15.12.2010 20:16, schrieb Timo Sirainen:
BTW. Postfix's behavior is similar (unsurprisingly, since I basically copied its design for v2.0). I wonder how much of a problem this is with Postfix. Is a similar patch or a kernel fix going to drop context switches 10x there too?
However, your patch should be tested thoroughly by users who have the problem, to make sure it doesn't break something elsewhere. You might contact Wietse Venema for design questions.
-- Best Regards
MfG Robert Schetterer
Germany/Munich/Bavaria
On Wed, 15 Dec 2010 19:16:45 +0000 Timo Sirainen tss@iki.fi articulated:
BTW. Postfix's behavior is similar (unsurprisingly, since I basically copied its design for v2.0). I wonder how much of a problem this is with Postfix. Is a similar patch or a kernel fix going to drop context switches 10x there too?
Interesting. On a FreeBSD-8.2/amd64 system with Postfix (2.8-20101108) installed I have not noticed any difference. I am assuming that this problem does not affect FreeBSD systems.
-- Jerry ✌ Dovecot.user@seibercom.net
Disclaimer: off-list followups get on-list replies or get ignored. Please do not ignore the Reply-To header.
On 15.12.2010, at 19.42, Jerry wrote:
BTW. Postfix's behavior is similar (unsurprisingly, since I basically copied its design for v2.0). I wonder how much of a problem this is with Postfix. Is a similar patch or a kernel fix going to drop context switches 10x there too?
Interesting. On a FreeBSD-8.2/amd64 system with Postfix (2.8-20101108) installed I have not noticed any difference. I am assuming that this problem does not affect FreeBSD systems.
Correct. Only Linux is affected.
Anyway, Postfix's design has been like this forever, so no one would have ever noticed context switch counts increasing. But changing this might make them notice that context switches are dropping.
Correct. Only Linux is affected.
Anyway, Postfix's design has been like this forever, so no one would have ever noticed context switch counts increasing. But changing this might make them notice that context switches are dropping.
I'm a little surprised we haven't seen more reports on Dovecot as well. The impact was noticeable for us.
Thanks for looking into the issue!
Cor
On Qua, 2010-12-15 at 21:06 +0100, Cor Bosman wrote:
Correct. Only Linux is affected.
Anyway, Postfix's design has been like this forever, so no one would have ever noticed context switch counts increasing. But changing this might make them notice that context switches are dropping.
I'm a little surprised we haven't seen more reports on Dovecot as well. The impact was noticeable for us.
We had very little change in load, no impact on client performance, and nothing to alarm us; after all, it was a major upgrade.
There definitely wasn't a 20x increase in load like the one in context switches.
I guess that would depend on the CPUs, caches, etc.
-- Jose Celestino | http://japc.uncovering.org/files/japc-pgpkey.asc
"Assumption is the Mother of Screw-Up" -- Mr. John Elwood Hale
- Jose Celestino japc@co.sapo.pt:
I guess that would depend on the CPUs, caches, etc.
No it doesn't :)
It looks like Timo's patch fixes the problem. Context switches are now back to normal, and the load graph is smooth and lower. I saw someone also posted a patch to the LKML.
Cor
- Cor Bosman cor@xs4all.nl:
It looks like Timo's patch fixes the problem. Context switches are now back to normal, and the load graph is smooth and lower.
Same here.
I saw someone also posted a patch to the LKML.
I guess I missed that one
On Qui, 2010-12-16 at 12:56 +0100, Ralf Hildebrandt wrote:
- Cor Bosman cor@xs4all.nl:
I saw someone also posted a patch to the LKML.
I guess I missed that one
http://lkml.org/lkml/2010/12/15/470
2010/12/16 Jose Celestino japc@co.sapo.pt:
On Qui, 2010-12-16 at 12:56 +0100, Ralf Hildebrandt wrote:
- Cor Bosman cor@xs4all.nl:
I saw someone also posted a patch to the LKML.
I guess I missed that one
Timo, if we apply the above kernel patch, do we still need to patch dovecot or is it just an either-or thing?
I'm guessing the reason I saw so many fewer context switches when using a client_limit of more than one is that more connections per process == fewer children to wake up.
For those interested, graph showing the difference before and after patch:
context switches: http://grab.by/7W9u load: http://grab.by/7W9x
Cor
You mean these past two months of grief - concern over Dovecot 2.0's underlying design, exploring various configurations and settings, and general hysteria - came down to... a two-line patch?!
I know it was more than that - and that in fact this exposed a Linux kernel flaw - but I'm still laughing over this (since my tiny installations never had a significant load that would have been affected, I can afford to laugh).
As always - sincere thanks for everything Timo!
Daniel
On 16.12.2010, at 22.09, Daniel L. Miller wrote:
You mean these past two months of grief - concern over Dovecot 2.0's underlying design, exploring various configurations and settings, and general hysteria - came down to... a two-line patch?!
At least now I know much more about context switches than I ever did before :)
- Timo Sirainen tss@iki.fi:
On 16.12.2010, at 22.09, Daniel L. Miller wrote:
You mean these past two months of grief - concern over Dovecot 2.0's underlying design, exploring various configurations and settings, and general hysteria - came down to... a two-line patch?!
At least now I know much more about context switches than I ever did before :)
That's the spirit!
- Daniel L. Miller dmiller@amfes.com:
You mean these past two months of grief - concern over Dovecot 2.0's underlying design, exploring various configurations and settings, and general hysteria - came down to... a two-line patch?!
Yes. It's like that. With Windows, nobody would have ever fixed that.
hi cor,
For those interested, graph showing the difference before and after patch:
context switches: http://grab.by/7W9u load: http://grab.by/7W9x
Which patch: Timo's patch to Dovecot, or the LKML patch that was also posted?
christoph
On 16.12.2010, at 18.33, Mark Moseley wrote:
Timo, if we apply the above kernel patch, do we still need to patch dovecot or is it just an either-or thing?
Either-or.
I'm guessing the reason I saw so many fewer context switches when using a client_limit of more than one is that more connections per process == fewer children to wake up.
Yes.
Oops, now I finally understand why Mail.app kept asking for my password for each mail I sent: it helpfully decided to start signing mails with the only client cert I had, without asking me.. Forget about those signatures in the last two mails :)
Oops, now I finally understand why Mail.app kept asking for my password for each mail I sent: it helpfully decided to start signing mails with the only client cert I had, without asking me.. Forget about those signatures in the last two mails :)
Heh, is that the key you used to get into our test server for this problem? :)
Cor
On 2010-12-16 6:53 AM, Cor Bosman wrote:
It looks like Timo's patch fixes the problem. Context switches are now back to normal, and the load graph is smooth and lower. I saw someone also posted a patch to the LKML.
Out of curiosity, when you say lower - how does the load actually compare now to the load you had with 1.2 (same? little higher but now steady? lower?)?
--
Best regards,
Charles
- Charles Marcus CMarcus@Media-Brokers.com:
On 2010-12-16 6:53 AM, Cor Bosman wrote:
It looks like Timo's patch fixes the problem. Context switches are now back to normal, and the load graph is smooth and lower. I saw someone also posted a patch to the LKML.
Out of curiosity, when you say lower - how does the load actually compare now to the load you had with 1.2 (same? little higher but now steady? lower?)?
Identical
Timo Sirainen put forth on 12/9/2010 7:10 PM:
On 10.12.2010, at 0.54, Stan Hoeppner wrote:
However, this still doesn't seem to explain Ralf's issue, where the kernel stays the same, but the Dovecot version changes, with 2.0.x causing the high load and 1.2.x being normal. Maybe 2.0.x simply causes this bug to manifest itself more loudly?
Cor's debugging has so far shown that a single epoll_wait() call can sometimes generate a few thousand voluntary context switches. I can't really understand how that's possible. Those epoll_wait() calls account for about half of the total voluntary context switches generated by imap processes. We'll see tomorrow if poll() works better or if a small patch I made makes it better.
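As a side note on the counter being discussed: a quick way to see what a single blocking wait costs in voluntary context switches is to read getrusage()'s ru_nvcsw before and after. This is just an illustrative sketch (not from the thread); it uses select() as a stand-in for the poll()/epoll_wait() loop in the imap process:

```python
import resource
import select

def voluntary_ctx_switches():
    # ru_nvcsw counts voluntary context switches (process gave up the
    # CPU while waiting), the same counter discussed in this thread
    return resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw

before = voluntary_ctx_switches()
# Block in select() with a short timeout, the way a poll()/epoll_wait()
# loop blocks waiting for client input
select.select([], [], [], 0.05)
after = voluntary_ctx_switches()

# A single blocking wait should normally cost on the order of one
# voluntary context switch; thousands per wait would be anomalous
print(after - before)
```

On a healthy system the printed delta is tiny, which is what makes the few-thousand-per-epoll_wait() figure above so surprising.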
This Linux kernel bug doesn't explain the high load reported with 2.0.x on FreeBSD either.
Who has reported high load on FreeBSD? So far I know of only Ralf and Cor and both are using Linux.
Oh, my bad Timo. I thought someone said they were running Dovecot 2.0.x atop FreeBSD on bare metal and having the high load issue. I must have got two different posts crossed up in the gray matter.
-- Stan
- Stan Hoeppner stan@hardwarefreak.com:
Mark Moseley put forth on 12/9/2010 12:18 PM:
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you. I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35.
This thread may be of value in relation to this issue:
http://groups.google.com/group/linux.kernel/browse_thread/thread/eb5cb488b74...
It seems there are some load issues regarding recent Linux kernels, from 2.6.34 (maybe earlier?) on up. The commit of the patch to fix it was Dec 8th--yesterday. So it'll be a while before distros get this patch out.
I'm using 2.6.32
However, this still doesn't seem to explain Ralf's issue, where the kernel stays the same, but the Dovecot version changes, with 2.0.x causing the high load and 1.2.x being normal. Maybe 2.0.x simply causes this bug to manifest itself more loudly?
That could be!
On Thu, Dec 9, 2010 at 4:54 PM, Stan Hoeppner stan@hardwarefreak.com wrote:
Mark Moseley put forth on 12/9/2010 12:18 PM:
If you at some point upgrade to >2.6.35, I'd be interested to hear if the load skyrockets on you. I also get the impression that the load average calculation in these recent kernels is 'touchier' than in pre-2.6.35.
This thread may be of value in relation to this issue:
http://groups.google.com/group/linux.kernel/browse_thread/thread/eb5cb488b74...
It seems there are some load issues regarding recent Linux kernels, from 2.6.34 (maybe earlier?) on up. The commit of the patch to fix it was Dec 8th--yesterday. So it'll be a while before distros get this patch out.
Glad you brought up that thread. I've seen it before but not any time lately (and certainly not since 2 days ago!). Hopefully that'll be in 2.6.37, instead of not till 2.6.38 (or that I can patch it into 2.6.36.2). I roll my own kernels, so no need to wait on distros.
However, this still doesn't seem to explain Ralf's issue, where the kernel stays the same, but the Dovecot version changes, with 2.0.x causing the high load and 1.2.x being normal. Maybe 2.0.x simply causes this bug to manifest itself more loudly?
This Linux kernel bug doesn't explain the high load reported with 2.0.x on FreeBSD either. But it is obviously somewhat at play for people running these kernels versions and 2.0.x. To what degree it adds to the load I cannot say.
Yeah, my comment on the kernel thing was just in reply to one of Cor's 3 debugging tracks, 1 of which was to try upgrading the kernel. I figured I should mention the load issue if he might be upgrading to the latest/greatest, since it could make things worse (or make things *look* worse).
This is probably a really dumb question, but here goes...
Is it possible to run the imap-login process from 1.2 against a 2.0 system?
Daniel
Mark Moseley put forth on 12/10/2010 2:25 PM:
Yeah, my comment on the kernel thing was just in reply to one of Cor's 3 debugging tracks, 1 of which was to try upgrading the kernel. I figured I should mention the load issue if he might be upgrading to the latest/greatest, since it could make things worse (or make things *look* worse).
Maybe I need to re-read that thread again (twice). At this point I'm still not sure if the load is actually higher, or if the bug is that it's simply being *reported* as higher when it really isn't.
-- Stan
- Stan Hoeppner stan@hardwarefreak.com:
Mark Moseley put forth on 12/10/2010 2:25 PM:
Yeah, my comment on the kernel thing was just in reply to one of Cor's 3 debugging tracks, 1 of which was to try upgrading the kernel. I figured I should mention the load issue if he might be upgrading to the latest/greatest, since it could make things worse (or make things *look* worse).
Maybe I need to re-read that thread again (twice). At this point I'm still not sure if the load is actually higher, or if the bug is that it's simply being *reported* as higher when it really isn't.
The load is definitely higher. Yesterday I kept 2.0 running until noon. Half an hour before noon users were reporting their logins were timing out. At the same time the load skyrocketed (system was at almost 100 percent)
Here's the doveconf -n output: http://wa.ter.net/download/doveconf-n.txt
Cor
On 12/8/2010 9:55 AM, Cor Bosman wrote:
Here's the doveconf -n output: http://wa.ter.net/download/doveconf-n.txt
Cor
It could be that you both are running a different kernel from the standard Lenny kernel 2.6.26... (this could be a clue..)
M.A.
It could be that you both are running a different kernel from the standard Lenny kernel 2.6.26... (this could be a clue..)
It would be interesting to hear from people that aren't seeing a big load increase. My initial guess was some kind of NFS problem, but since Ralf isn't doing NFS, that's probably not it.
Our CPUs are actually pretty much idle (these are 8 and 16 core machines). And memory is plentiful.
Cor
- Cor Bosman cor@xs4all.nl:
It would be interesting to hear from people that aren't seeing a big load increase.
Indeed. Timo was mentioning some big company; what kind of setup were they using, Timo?
My initial guess was some kind of NFS problem, but since Ralf isn't doing NFS, that's probably not it.
Yep.
Our CPUs are actually pretty much idle (these are 8 and 16 core machines). And memory is plentiful.
Cor Bosman put forth on 12/8/2010 9:45 AM:
It could be that you both are running a different Kernel from the Standard Lenny Kernel 2.6.26..... (this could be a clue ..)
It would be interesting to hear from people that aren't seeing a big load increase. My initial guess was some kind of NFS problem, but since Ralf isn't doing NFS, that's probably not it.
Our CPUs are actually pretty much idle (these are 8 and 16 core machines). And memory is plentiful.
Cor, Ralf is running his Dovecot servers as VMware ESX guests. Are you running your servers in a VM environment, or on bare metal?
I'm guessing the common thread WRT this load problem is going to be relatively busy Dovecot servers running in virtual machine environments.
-- Stan
On Wed, 2010-12-08 at 23:37 +0100, Cor Bosman wrote:
We're running on bare metal, no VM involved.
Cor
Missed most of this list due to some noise levels and my lack of time to sit through hundreds of messages on every list I'm on, so apologies if this was already suggested: due to errors in recent kernel time computations, append nohz=off to your lilo config (or whatever boot loader you use).
- Cor Bosman cor@xs4all.nl:
Here's the doveconf -n output: http://wa.ter.net/download/doveconf-n.txt
Except for your storage being on NFS, it looks fairly identical! masteruser, passdb pam, userdb passwd.
- David Ford david@blue-labs.org:
Ralf, did you do the profiling yet?
With gprof or what exactly is on your mind?
On 12/08/10 09:50, Ralf Hildebrandt wrote:
[...] Yes, this looks like my graphs. Same increase. Factor 5.
gprof for detail, or even simple strace timing, i.e. strace -c. If load is going up significantly, there should be one or more functions significantly fatter than the rest. You can pick either to run it on the whole group, or just attach to certain processes (master, imap, lda, etc).
when you run 'ps aux|grep dovecot', what seems to collect the most cpu time?
-d
On 12/08/10 10:54, Ralf Hildebrandt wrote:
- David Ford david@blue-labs.org:
Ralf, did you do the profiling yet? With gprof or what exactly is on your mind?
- David Ford david@blue-labs.org:
gprof for detail, or even simple strace timing, i.e. strace -c. If load is going up significantly, there should be one or more functions significantly fatter than the rest. You can pick either to run it on the whole group, or just attach to certain processes (master, imap, lda, etc).
Timo and I found excessive numbers of context switches, factor 20-30. But it's unclear why the IMAP process would do/cause this.
when you run 'ps aux|grep dovecot', what seems to collect the most cpu time?
-d
what's your task switch HZ compiled at? CONFIG_HZ_1000? you would probably be better at 300 or 250. have you tried tickless? is your kernel compiled for precisely your cpu type and smp/hyper options set correctly? what about CONFIG_PREEMPT? definitely don't use realtime, server is appropriate.
-d
On 12/08/10 11:05, Ralf Hildebrandt wrote:
Timo and I found excessive numbers of context switches, factor 20-30. But it's unclear why the IMAP process would do/cause this.
On 08-12-10 17:11, David Ford wrote:
what's your task switch HZ compiled at? CONFIG_HZ_1000? you would probably be better at 300 or 250. have you tried tickless? is your kernel compiled for precisely your cpu type and smp/hyper options set correctly? what about CONFIG_PREEMPT? definitely don't use realtime, server is appropriate.
Timo and I found excessive numbers of context switches, factor 20-30. But it's unclear why the IMAP process would do/cause this.
Hi,
If I'm not mistaken, XS4ALL uses FreeBSD on their imap servers. Cor?
Thnx,
Diederick
On Dec 8, 2010, at 5:11 PM, David Ford wrote:
what's your task switch HZ compiled at? CONFIG_HZ_1000? you would probably be better at 300 or 250. have you tried tickless? is your kernel compiled for precisely your cpu type and smp/hyper options set correctly? what about CONFIG_PREEMPT? definitely don't use realtime, server is appropriate.
You're probably asking Ralf, but I might as well answer as well..
CONFIG_HZ_250=y
CONFIG_PREEMPT_NONE=y
Not using realtime, all cpu/smp options look fine.
CPU is usually pretty idle. Hovering around 90-99% idle. But the load graphs clearly show a radical increase, where before things were pretty smooth, now they're erratic and much higher.
Cor
ps: we use linux, not freebsd.
- David Ford david@blue-labs.org:
what's your task switch HZ compiled at? CONFIG_HZ_1000? you would probably be better at 300 or 250. have you tried tickless?
# fgrep HZ /boot/config-2.6.32-25-generic-pae
CONFIG_NO_HZ=y
CONFIG_HZ_250=y
CONFIG_HZ=250
is your kernel compiled for precisely your cpu type and smp/hyper options set correctly?
It's an Ubuntu kernel (config-2.6.32-25-generic-pae)
what about CONFIG_PREEMPT? definitely don't use realtime, server is appropriate.
# fgrep PREEMPT /boot/config-2.6.32-25-generic-pae
# CONFIG_TREE_PREEMPT_RCU is not set
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
It's the same kernel 1.2.x is running on, without problems.
Timo and I found excessive numbers of context switches, factor 20-30. But it's unclear why the IMAP process would do/cause this.
I'm seeing this as well... http://grab.by/7Nni
Cor
On 8.12.2010, at 17.56, Cor Bosman wrote:
Timo and I found excessive numbers of context switches, factor 20-30. But it's unclear why the IMAP process would do/cause this.
I'm seeing this as well... http://grab.by/7Nni
So same problem as with Ralf. What kernel version are you using? Maybe try a newer one to see if that happens to fix it?
I see you're using userdb passwd. Do your users have unique UIDs? If they have, maybe it has to do with that..
Yes, we have about 1 million unique UIDs in the passwd file (actually NIS). I did upgrade 4 machines to the latest kernel, but it's hard to tell if that changed much, as our usage peak is during the day, so I'll know more tomorrow. But so far it looks as though the context switches are still high.
Cor
service imap-login { service_count = 1 }
So you have tons of imap-login processes. Did you have that in v1.2 too? Try setting service_count=0 (and same for pop3-login)
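What Timo is suggesting, written out as a config fragment in the same style as the snippets elsewhere in this thread (a sketch, not a drop-in; adjust to your own dovecot.conf):

```
service imap-login {
  # 0 = reuse each login process for many connections instead of
  # forking a fresh process per connection (Dovecot's
  # "high-performance" login mode)
  service_count = 0
}
service pop3-login {
  service_count = 0
}
```

With service_count = 1 (the default discussed above), every new connection pays the cost of forking a login process, which is one candidate source of the extra context switches.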
So you have tons of imap-login processes. Did you have that in v1.2 too? Try setting service_count=0 (and same for pop3-login)
Yes, we have tons of imap-login processes. I'll set service_count=0 for imap (we don't do pop with dovecot) on a few servers.
Cor
Oh, and I don't know if we did in 1.2. I think so, but can't be positive. I tried making the config the same. I have the config still around if you want to see it.
Cor
- Cor Bosman cor@xs4all.nl:
login_process_per_connection = yes
So it seems like we did have that set in 1.2.
We didn't use that (in both 1.2 and 2.0)
- Cor Bosman cor@xs4all.nl:
I see you're using userdb passwd. Do your users have unique UIDs? If they have, maybe it has to do with that..
Yes, we have about 1 million unique UIDs in the passwd file (actually NIS).
And we have about 15,000 unique UIDs in the passwd file (no NIS!)
I did upgrade 4 machines to the latest kernel, but it's hard to tell if that changed much as our usage peak is during the day, So I'll know more tomorrow. But so far it looks as though the context switches are still high.
Let me know!
On Qua, 2010-12-08 at 15:39 +0100, Cor Bosman wrote:
I upgraded most of our servers from 1.2.x to 2.0.8 and am noticing a really big increase in server load. This is across the board, not any specific server. Check this screenshot of a load graph: http://grab.by/7N8V
Is there anything i should be looking at that could cause such a drastic load increase? Conf file is available here: http://wa.ter.net/download/doveconf.txt
We also had a load increase (about 10x) when going from 1.1.10 to 2.0.7 a couple of weeks ago.
It ended up being due to a slight increase in memory usage by the imap processes, which made the servers start swapping and the load spike.
A memory upgrade brought it all to normal loads.
Wonder if this is your case (vmstat, check for si and so).
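The swap check Jose describes can also be done by watching the cumulative swap counters in /proc/vmstat (Linux-specific). A small sketch, with the parsing split out so it works on any captured text:

```python
def swap_counters(vmstat_text):
    # Extract the cumulative swap-in/swap-out page counts from
    # /proc/vmstat style "key value" lines. If these numbers grow
    # between two samples, the box is actively swapping, which is
    # the situation that caused the load spike described above.
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters

# On a Linux box one would sample the live file, e.g.:
# with open("/proc/vmstat") as f:
#     print(swap_counters(f.read()))
```

This is equivalent in spirit to watching the si/so columns of vmstat; nonzero deltas over time are the warning sign.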
-- Jose Celestino | http://japc.uncovering.org/files/japc-pgpkey.asc
"Assumption is the Mother of Screw-Up" -- Mr. John Elwood Hale
- Jose Celestino japc@co.sapo.pt:
It ended up being due to a slight increase in memory usage by the imap processes that made the servers start using swap and the load to spike.
My machine is not swapping; it doesn't even HAVE swap, it still has lots of free memory
trace looks like an interesting new tool: http://lwn.net/Articles/415728/
Wonder if that would tell something about this problem.
On 9.12.2010, at 1.52, Timo Sirainen wrote:
trace looks like an interesting new tool: http://lwn.net/Articles/415728/
Wonder if that would tell something about this problem.
Like the answer to question: Is the number of voluntary context switches by imap processes close to the number of called syscalls? And if so, which system calls does v2.0 call more often than v1.2?
On 9.12.2010, at 2.24, Timo Sirainen wrote:
Like the answer to question: Is the number of voluntary context switches by imap processes close to the number of called syscalls?
Oh, right, no. Syscalls don't increase context switch counts. So the voluntary context switch basically seems to mean that the process is doing a read(), poll() or something else (what?) that waits for input from elsewhere..
So with Ralf's previous statistics for one hour:
v2.0: 6887 imap logouts (processes)
      26672928 voluntary context switches (3872 / process)
      1313631 involuntary context switches (190 / process)

v1.2: 6832 imap logouts (processes)
      1003200 voluntary context switches (146 / process)
      155566 involuntary context switches (22 / process)
The v1.2 values look pretty good. v2.0's involuntary context switch count isn't too bad either. So where do all the 3700 new voluntary context switches come from? The new method of initializing user logins can't add more than a few more of those. Running imaptest locally I get similar volcs values for v1.2 and v2.0. Wonder if you get different values?
If you don't mind some huge logs, you could also try the attached patch that logs the voluntary context switch count for every executed IMAP command. Maybe some command shows up that generates them much more than others.
[...] similar volcs values for v1.2 and v2.0. Wonder if you get different values?
If you don't mind some huge logs, you could also try the attached patch that logs the voluntary context switch count for every executed IMAP command. Maybe some command shows up that generates them much more than others. <diff>
I have applied this patch to 1 server, and it's logging away. Let me know what kind of output you'd like to see. Or I can send you the raw logfile. As I said previously, I'm no longer running the imap service_count=0 patch because it caused these errors:
Dec 9 06:54:32 userimap30 dovecot: imap(cor): Error: user cor: Initialization failed: Namespace '': mkdir(/var/spool/mail/.4a/L/0/14/corbosch) failed: Permission denied (euid=1000(cor) egid=50(staff) missing +w perm: /var/spool/mail/.4a/L/0/14, euid is not dir owner).
After I removed the patch these errors disappeared.
Cor
Just for thoroughness ive started 2 servers with the logging patch. One with service_count=0 and one with service_count=1
On 9.12.2010, at 6.50, Cor Bosman wrote:
[...] similar volcs values for v1.2 and v2.0. Wonder if you get different values?
If you don't mind some huge logs, you could also try the attached patch that logs the voluntary context switch count for every executed IMAP command. Maybe some command shows up that generates them much more than others. <diff>
I have applied this patch to 1 server, and it's logging away. Let me know what kind of output you'd like to see. Or I can send you the raw logfile.
Are the process pids also logged in the messages, so it's clear which messages belong to which imap process? If not, add %p to mail_log_prefix.
As I said previously, I'm no longer running the imap service_count=0 patch because it caused these errors:
Dec 9 06:54:32 userimap30 dovecot: imap(cor): Error: user cor: Initialization failed: Namespace '': mkdir(/var/spool/mail/.4a/L/0/14/corbosch) failed: Permission denied (euid=1000(cor) egid=50(staff) missing +w perm: /var/spool/mail/.4a/L/0/14, euid is not dir owner).
After I removed the patch these errors disappeared.
The patch was supposed to fix this.. Is uid 1000 wrong for this user? Didn't you already exist? Why is it complaining about mkdir()?
Just for thoroughness ive started 2 servers with the logging patch. One with service_count=0 and one with service_count=1
What, are you saying that without the patch service imap { service_count=0 } works? Or are you talking about imap-login?
Are the process pids also logged in the messages, so it's clear which messages belong to which imap process? If not, add %p to mail_log_prefix.
Done. It wasn't logging this, now it is.
As I said previously, im not longer running the imap server=0 patch because it caused these errors:
Dec 9 06:54:32 userimap30 dovecot: imap(cor): Error: user cor: Initialization failed: Namespace '': mkdir(/var/spool/mail/.4a/L/0/14/corbosch) failed: Permission denied (euid=1000(cor) egid=50(staff) missing +w perm: /var/spool/mail/.4a/L/0/14, euid is not dir owner).
After I removed the patch these errors disappeared.
The patch was supposed to fix this.. Is uid 1000 wrong for this user? Didn't you already exist? Why is it complaining about mkdir()?
It's very strange. I'm user 'cor'. Maybe it's logging my name because I started the process (su-d to root obviously)? My uid is 1000. But I'm not the one doing this login, these are customer logins. The mkdir fails because the target dir is only writable by root. Normally dovecot creates a directory as root and chowns it to the owner it belongs to. At least, that's what I assume happens, as normally this all works fine.
It's like dovecot was unable to change the process to euid root but instead is doing everything as euid cor.
Just for thoroughness ive started 2 servers with the logging patch. One with service_count=0 and one with service_count=1
What, are you saying that without the patch service imap { service_count=0 } works? Or are you talking about imap-login?
I'm talking about imap-login. Currently I'm not running any servers with imap { service_count=0 } due to the permission errors. I'm more than happy to give it another go if you feel I did something wrong.
Cor
On 9.12.2010, at 8.24, Cor Bosman wrote:
Are the process pids also logged in the messages, so it's clear which messages belong to which imap process? If not, add %p to mail_log_prefix.
Done. It wasn't logging this, now it is.
Great. If the logs aren't too huge I could look at the raw ones, or you could try to write a script yourself to parse them. I'm basically interested in things like:
How large are the first volcs entries for processes? (= Is the initial process setup cost high?)
Find the last volcs entry for each process. Sum them up for some time period. Is the total about the same as the system's entire volcs during that time?
What kind of a volcs distribution do the processes have? A graph might be nice :)
Most importantly: Are there some commands that trigger a high increase in volcs numbers?
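The questions above boil down to per-process and per-command volcs accounting, which a short script can do. A sketch only: the log line format here is hypothetical (Timo's patch output isn't shown in the thread), so the regex would need adjusting to the real lines:

```python
import re
from collections import defaultdict

# Assumed log line shape (hypothetical; adapt to the actual output of
# the volcs-logging patch, with %p added to mail_log_prefix):
#   dovecot: imap(joe)<1234>: Debug: volcs=3872 command=SELECT
LINE = re.compile(
    r'imap\([^)]*\)<(?P<pid>\d+)>.*volcs=(?P<volcs>\d+)\s+command=(?P<cmd>\S+)')

def summarize(lines):
    last = {}                       # pid -> last cumulative volcs seen
    per_command = defaultdict(int)  # command -> summed volcs deltas
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        pid = m.group('pid')
        volcs = int(m.group('volcs'))
        # Counter is cumulative per process, so charge each command
        # with the delta since that process's previous command
        per_command[m.group('cmd')] += volcs - last.get(pid, 0)
        last[pid] = volcs
    total = sum(last.values())      # sum of final counters per process
    return total, dict(per_command)
```

The per-command totals answer the "which commands trigger high volcs" question directly, and the overall total can be compared against the system-wide voluntary context switch count for the same period.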
As I said previously, im not longer running the imap server=0 patch because it caused these errors:
Dec 9 06:54:32 userimap30 dovecot: imap(cor): Error: user cor: Initialization failed: Namespace '': mkdir(/var/spool/mail/.4a/L/0/14/corbosch) failed: Permission denied (euid=1000(cor) egid=50(staff) missing +w perm: /var/spool/mail/.4a/L/0/14, euid is not dir owner).
After I removed the patch these errors disappeared.
The patch was supposed to fix this.. Is uid 1000 wrong for this user? Didn't you already exist? Why is it complaining about mkdir()?
It's very strange. I'm user 'cor'. Maybe it's logging my name because I started the process (su-d to root obviously)? My uid is 1000. But I'm not the one doing this login, these are customer logins.
Well, that clearly says it's trying to initialize user "cor".. Is the "corbosch" directory you or someone else?
The mkdir fails because the target dir is only writable by root.
I'd have guessed that the above corbosch already existed.
Normally dovecot creates a directory as root, and chowns it to the owner it belongs to. At least, thats what I assume happens as normally this all works fine.
Uh. No.. Privileges are always dropped before Dovecot creates any directories. I don't know what creates the directories for you. I'd guess the authentication code?
Just for thoroughness ive started 2 servers with the logging patch. One with service_count=0 and one with service_count=1
What, are you saying that without the patch service imap { service_count=0 } works? Or are you talking about imap-login?
I'm talking about imap-login. Currently I'm not running any servers with imap { service_count=0 } due to the permission errors. I'm more than happy to give it another go if you feel I did something wrong.
OK. I hope the volcs logging gets us somewhere.
Great. If the logs aren't too huge I could look at the raw ones, or you could try to write a script yourself to parse them. I'm basically interested in things like:
How large are the first volcs entries for processes? (= Is the initial process setup cost high?)
Find the last volcs entry for each process. Sum them up for some time period. Is the total about the same as the system's entire volcs during that time?
What kind of a volcs distribution do the processes have? A graph might be nice :)
Most importantly: Are there some commands that trigger a high increase in volcs numbers?
If you want to have a quick look already, I'm mailing you the locations of 2 files, one with service_count=0 and one with service_count=1.
Well, that clearly says it's trying to initialize user "cor".. Is the "corbosch" directory you or someone else?
corbosch is actually a non-existing user, and I'm suspecting a local bug. I mailed you about that in private.
I'd have guessed that the above corbosch already existed.
Yeah, it probably should be targeting 'cor', but for some reason chars get appended to the dir location. Not sure if it's our local modification or something in dovecot.
Normally dovecot creates a directory as root, and chowns it to the owner it belongs to. At least, thats what I assume happens as normally this all works fine.
Uh. No.. Privileges are always dropped before Dovecot creates any directories. I don't know what creates the directories for you. I'd guess the authentication code?
Something in dovecot does :) If a user without a maildir logs in for the first time, the dir is created.
Cor
On 9.12.2010, at 9.13, Cor Bosman wrote:
If you want to have a quick look already, I'm mailing you the locations of 2 files, one with service_count=0 and one with service_count=1.
I see that about half the commands that do hundreds or thousands of volcses are IDLE. Wonder if that is the problem. But there are others .. I see for example that one "LOGOUT" command takes 2000 volcses, which I doubt is true. More likely it's counting something that happened before the LOGOUT, like probably doing a mailbox synchronization. Attached a new version of the patch that also logs volcs for syncs.
Also STATUS and SELECT seem to have high volcses, which do a mailbox sync.. Hmm. So if it is about mailbox sync, where in the sync then is it slow? Added more debugging for this in the attached patch.
Normally dovecot creates a directory as root, and chowns it to the owner it belongs to. At least, thats what I assume happens as normally this all works fine.
Uh. No.. Privileges are always dropped before Dovecot creates any directories. I don't know what creates the directories for you. I'd guess the authentication code?
Something in dovecot does :) If a user without a maildir logs in for the first time, the dir is created.
Probably would be good to find out what exactly, in case that's caused by some security hole :)
On Dec 9, 2010, at 10:41 AM, Timo Sirainen wrote:
On 9.12.2010, at 9.13, Cor Bosman wrote:
If you want to have a quick look already, I'm mailing you the locations of 2 files, one with service_count=0 and one with service_count=1.
I see that about half the commands that do hundreds or thousands of volcses are IDLE. Wonder if that is the problem. But there are others .. I see for example that one "LOGOUT" command takes 2000 volcses, which I doubt is true. More likely it's counting something that happened before the LOGOUT, like probably doing a mailbox synchronization. Attached a new version of the patch that also logs volcs for syncs.
Also STATUS and SELECT seem to have high volcses, which do a mailbox sync.. Hmm. So if it is about mailbox sync, where in the sync then is it slow? Added more debugging for this in the attached patch.
Ok, I'll apply this patch to both running servers doing logging.
Something in dovecot does :) If a user without a maildir logs in for the first time, the dir is created.
Probably would be good to find out what exactly, in case that's caused by some security hole :)
Sorry, my mistake. It's not dovecot, it's our LDA that's making the dir (which means we now get an error if a user logs in and they have never had email.. but that never happens since we mail them on signup.. interesting little problem :)
Cor
Some preliminary findings..
Changing the kernel seems to have a positive effect on the load. I changed from 2.6.27.46 to 2.6.27.54 (sorry, I'm bound to locally available kernels because of a kernel patch we created to fix some NFS problems in the Linux kernel; the patch should be in the stock kernel as of 2.6.36, but we don't have that kernel locally available yet). At similar user levels the load is now back to what it was. It doesn't, however, influence the context switches, which are still very high. Here's a graph of a server where I changed the kernel and nothing else:
load: http://grab.by/7Odt (I rebooted the server with the new kernel at the point where the high load drops to 0)
cs: http://grab.by/7Odw
Here's a server with the old kernel, but with imap-login { service_count=0 }. It seems this doesn't have much impact. Context switches are lower (but still higher than on 1.2), probably because user levels haven't come back up to normal yet due to load balancer distribution issues after a reboot.
load: http://grab.by/7OdL cs: http://grab.by/7OdN
I've now also set up a server with the new kernel and service_count = 0, just to cover all the bases, but there's not enough data on that server yet. It takes a while for user levels to creep up after a server is rebooted, since users are sticky to a server once they're linked to it.
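The per-server cs graphs above are system-wide, but on Linux the same counters can be read per process from /proc/&lt;pid&gt;/status, which is handy for checking whether the extra switches really come from the imap processes. A small sketch (Linux-only, shown here against the current process):

```python
import os

def ctxt_switches(pid: int) -> dict:
    """Read per-process context-switch counters from /proc (Linux only)."""
    counts = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            # The two relevant fields in /proc/<pid>/status:
            #   voluntary_ctxt_switches, nonvoluntary_ctxt_switches
            if line.startswith(("voluntary_ctxt_switches",
                                "nonvoluntary_ctxt_switches")):
                key, value = line.split(":")
                counts[key] = int(value)
    return counts

print(ctxt_switches(os.getpid()))
```

Summing these over all imap PIDs before and after an interval gives per-process numbers directly comparable to the per-hour totals discussed later in the thread.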
Cor
- Timo Sirainen tss@iki.fi:
So with Ralf's previous statistics for one hour:
v2.0: 6887 imap logouts (processes) 26672928 voluntary context switches (3872 / process) 1313631 involuntary context switches (190 / process)
v1.2: 6832 imap logouts (processes) 1003200 voluntary context switches (146 / process) 155566 involuntary context switches (22 / process)
The v1.2 values look pretty good. v2.0's involuntary context switches isn't too bad either. So where do all the 3700 new voluntary context switches come from? The new method of initializing user logins can't add more than a few more of those. Running imaptest locally I get similar volcs values for v1.2 and v2.0. Wonder if you get different values?
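The per-process figures quoted above follow directly from the hourly totals; for example, a quick check:

```python
# Ralf's one-hour totals: (processes, voluntary cs, involuntary cs)
stats = {
    "v2.0": (6887, 26672928, 1313631),
    "v1.2": (6832, 1003200, 155566),
}

for version, (procs, volcs, ivolcs) in stats.items():
    # Integer division matches the truncated per-process values in the thread.
    print(f"{version}: {volcs // procs} volcs/process, "
          f"{ivolcs // procs} involuntary cs/process")
```

This reproduces the 3872 vs. 146 voluntary switches per process, roughly a 26x jump from v1.2 to v2.0.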
How exactly would I run imaptest...?
If you don't mind some huge logs, you could also try the attached patch that logs the voluntary context switch count for every executed IMAP command. Maybe some command shows up that generates them much more than others.
I can try that, but today and friday I'm busy, and then it's weekend :(
On 9.12.2010, at 7.57, Ralf Hildebrandt wrote:
The v1.2 values look pretty good. v2.0's involuntary context switches isn't too bad either. So where do all the 3700 new voluntary context switches come from? The new method of initializing user logins can't add more than a few more of those. Running imaptest locally I get similar volcs values for v1.2 and v2.0. Wonder if you get different values?
How exactly would I run imaptest...?
Create a test account for it (or you'll more or less accidentally destroy your INBOX) and then run e.g.:
imaptest user=testuser pass=testpass logout=0 secs=10 clients=1
Then compare the voluntary context switch counts logged by the imap process.
http://imapwiki.org/ImapTest has more info about installing and running. Especially note http://imapwiki.org/ImapTest/Running#Append_mbox
participants (15)
- Charles Marcus
- ckubu
- Cor Bosman
- Daniel L. Miller
- David Ford
- Diederick van Dijk
- Jerry
- Jose Celestino
- Mario Antonio
- Mark Moseley
- Noel Butler
- Ralf Hildebrandt
- Robert Schetterer
- Stan Hoeppner
- Timo Sirainen