Hi Javier,
thanks for your help.
On 20.05.2012 13:58, Javier Miguel Rodríguez wrote:
I know that you are NOT running RHEL / CentOS, but this problem with
1000 child processes bit us hard. Read this Red Hat kernel Bugzilla entry (Timo has comments in it):
https://bugzilla.redhat.com/show_bug.cgi?id=681578
Maybe you are hitting the same limit?
Yes, maybe. The only strange thing is that I don't see any errors in my Dovecot logs, such as "Panic: epoll_ctl" or anything similar.
I checked my kernel, and the patch mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=681578
(comment 31) is not applied. It is included from versions 3.0.30 and 3.2.17 onwards.
I will see what happens tomorrow under more load. If the problem occurs again, I will give 3.2.17 a chance.
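For reference, here is a minimal Python sketch of that version check. It assumes the fix is present from 3.0.30 in the 3.0 series and from 3.2.17 in the 3.2 series, as stated above. The names FIXED_IN, parse_kernel_release and has_fix are hypothetical, and any other kernel series is reported as unknown rather than guessed.

```python
import platform
import re

# Stable releases named in this thread as carrying the fix, keyed by series.
# Assumption: other series are not covered here, so they yield None (unknown).
FIXED_IN = {(3, 0): 30, (3, 2): 17}


def parse_kernel_release(release):
    """Return (major, minor, patch) from e.g. '3.0.28-vs2.3.2.3-rol-em64t'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def has_fix(release):
    """True/False for the 3.0 and 3.2 series, None when the series is unknown."""
    major, minor, patch = parse_kernel_release(release)
    needed = FIXED_IN.get((major, minor))
    if needed is None:
        return None
    return patch >= needed


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    release = platform.release()
    print(release, has_fix(release))
```

A result of None only means the series is not one of the two named here; it says nothing about whether a distribution kernel has the patch backported.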
thanks Urban
Regards
Javier
On 20/05/2012 11:59, Urban Loesch wrote:
On 19.05.2012 21:05, Timo Sirainen wrote:
On Wed, 2012-05-16 at 08:59 +0200, Urban Loesch wrote:
The server was running for about 1 year without any problems. The 15-minute load was between 0.5 and at most 8. No high iowait. CPU idle time was about 98%.
..
# iostat -k
Linux 3.0.28-vs2.3.2.3-rol-em64t (mailstore4) 16.05.2012 _x86_64_ (24 CPU)
Did you change the kernel just before it broke? I'd try another version.
The first time it broke was with kernel 2.6.38.8-vs2.3.0.37-rc17. Then I tried 3.0.28 and it broke again. On Friday evening I disabled the cgroup feature completely, and so far it seems to work normally. But this could be because it is the weekend and there are not many active connections. So I have to wait until Monday. If it happens again, I will try version 3.2.17.
On the other hand, it could be that the server is overloaded, because this problem happens only when more than 1000 tasks are active. That sounds strange to me, because it has been
working without problems for 1 year and we made no changes. Also, there were almost always more than 1000 tasks
active over the last year and we had no problems.
thanks Urban