[Dovecot] FreeBSD 6.1-RELEASE/kqueue high CPU load
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
--Paul Lathrop Systems Administrator SquareTrade, Inc.
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
I caught imap while it was eating a lot of CPU (~80%) and ran ktrace -t + -p <bad process pid>. Here is the kdump -E output; there are many gettimeofday() calls, is that normal?
mail2# kdump -E -f ktrace.out | less 26208 imap 0.000000 RET kevent 0 26208 imap 0.000042 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000046 RET gettimeofday 0 26208 imap 0.000061 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000064 RET gettimeofday 0 26208 imap 0.000078 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.000082 RET kevent 0 26208 imap 0.000100 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000103 RET gettimeofday 0 26208 imap 0.000116 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000119 RET gettimeofday 0 26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.474998 RET kevent 0 26208 imap 0.475010 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.475014 RET gettimeofday 0 26208 imap 0.475027 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.475030 RET gettimeofday 0 26208 imap 0.475043 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.475046 RET kevent 0 26208 imap 0.475058 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.475062 RET gettimeofday 0 26208 imap 0.475076 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.475079 RET gettimeofday 0 26208 imap 0.475098 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.550005 RET kevent 0 26208 imap 0.550017 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.550020 RET gettimeofday 0 26208 imap 0.550033 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.550037 RET gettimeofday 0 26208 imap 0.550050 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.550054 RET kevent 0 26208 imap 0.550066 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.550070 RET gettimeofday 0 26208 imap 0.550083 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.550087 RET gettimeofday 0 26208 imap 0.550104 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 1.001030 RET kevent 0 26208 imap 1.001047 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 1.001051 RET gettimeofday 0 26208 imap 1.001065 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 1.001068 RET gettimeofday 0 26208 imap 1.001080 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 1.001083 RET kevent 0 26208 imap 1.001096 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 1.001100 RET gettimeofday 0 26208 imap 1.001113 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 1.001117 RET gettimeofday 0 26208 imap 1.001133 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 1.551061 RET kevent 0 26208 imap 1.551076 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 1.551079 RET gettimeofday 0 26208 imap 1.551093 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 1.551096 RET gettimeofday 0 26208 imap 1.551108 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 1.551112 RET kevent 0 26208 imap 1.551125 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 1.551129 RET gettimeofday 0 26208 imap 1.551142 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 1.551146 RET gettimeofday 0 26208 imap 1.551162 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 2.002105 RET kevent 0 26208 imap 2.002128 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 2.002132 RET gettimeofday 0 26208 imap 2.002148 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 2.002151 RET gettimeofday 0 26208 imap 2.002165 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 2.002168 RET kevent 0 26208 imap 2.002182 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 2.002186 RET gettimeofday 0 26208 imap 2.002204 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 2.002207 RET gettimeofday 0 26208 imap 2.002237 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 2.551119 RET kevent 0 26208 imap 2.551142 CALL 
gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 2.551146 RET gettimeofday 0 26208 imap 2.551160 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 2.551164 RET gettimeofday 0 26208 imap 2.551177 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 2.551180 RET kevent 0 26208 imap 2.551194 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 2.551198 RET gettimeofday 0 26208 imap 2.551216 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 2.551219 RET gettimeofday 0 26208 imap 2.551231 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 3.003152 RET kevent 0 26208 imap 3.003171 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 3.003175 RET gettimeofday 0 26208 imap 3.003189 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 3.003192 RET gettimeofday 0 26208 imap 3.003204 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 3.003208 RET kevent 0 26208 imap 3.003221 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 3.003225 RET gettimeofday 0 26208 imap 3.003240 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 3.003243 RET gettimeofday 0 26208 imap 3.003260 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 3.552168 RET kevent 0 26208 imap 3.552190 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 3.552193 RET gettimeofday 0 26208 imap 3.552207 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 3.552211 RET gettimeofday 0 26208 imap 3.552224 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 3.552227 RET kevent 0 26208 imap 3.552241 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 3.552244 RET gettimeofday 0 26208 imap 3.552282 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 3.552285 RET gettimeofday 0 26208 imap 3.552298 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 4.004222 RET kevent 0 26208 imap 4.004248 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 4.004252 RET gettimeofday 0 26208 imap 4.004267 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 4.004271 RET gettimeofday 0 26208 imap 4.004283 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 4.004287 RET kevent 0 26208 imap 4.004306 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 4.004310 RET gettimeofday 0 26208 imap 4.004322 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 4.004325 RET gettimeofday 0 26208 imap 4.004337 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 4.553216 RET kevent 0 26208 imap 4.553239 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 4.553243 RET gettimeofday 0 26208 imap 4.553258 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 4.553261 RET gettimeofday 0 26208 imap 4.553274 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 4.553278 RET kevent 0 26208 imap 4.553291 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 4.553295 RET gettimeofday 0 26208 imap 4.553315 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 4.553318 RET gettimeofday 0 26208 imap 4.553330 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 5.005249 RET kevent 0 26208 imap 5.005275 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 5.005278 RET gettimeofday 0 26208 imap 5.005293 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 5.005296 RET gettimeofday 0 26208 imap 5.005309 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 5.005313 RET kevent 0 26208 imap 5.005327 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 5.005330 RET gettimeofday 0 26208 imap 5.005371 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 5.005375 RET gettimeofday 0 26208 imap 5.005387 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 5.554266 RET kevent 0 26208 imap 5.554288 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 5.554292 RET gettimeofday 0 26208 imap 5.554307 CALL 
gettimeofday(0xbfbfe9a0,0) 26208 imap 5.554311 RET gettimeofday 0 26208 imap 5.554324 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 5.554327 RET kevent 0 26208 imap 5.554341 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 5.554345 RET gettimeofday 0 26208 imap 5.554363 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 5.554366 RET gettimeofday 0 26208 imap 5.554378 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 6.006322 RET kevent 0 26208 imap 6.006347 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 6.006350 RET gettimeofday 0 26208 imap 6.006366 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 6.006369 RET gettimeofday 0 26208 imap 6.006383 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 6.006387 RET kevent 0 26208 imap 6.006400 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 6.006404 RET gettimeofday 0 26208 imap 6.006423 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 6.006426 RET gettimeofday 0 26208 imap 6.006439 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 6.555345 RET kevent 0 26208 imap 6.555369 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 6.555372 RET gettimeofday 0 26208 imap 6.555387 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 6.555390 RET gettimeofday 0 26208 imap 6.555403 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 6.555407 RET kevent 0 26208 imap 6.555445 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 6.555448 RET gettimeofday 0 26208 imap 6.555462 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 6.555466 RET gettimeofday 0 26208 imap 6.555478 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 7.007380 RET kevent 0 26208 imap 7.007401 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 7.007405 RET gettimeofday 0
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
Taras Savchuk wrote:
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
I caught imap while it was eating a lot of CPU (~80%) and ran ktrace -t + -p <bad process pid>. Here is the kdump -E output; there are many gettimeofday() calls, is that normal?
mail2# kdump -E -f ktrace.out | less 26208 imap 0.000000 RET kevent 0 26208 imap 0.000042 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000046 RET gettimeofday 0 26208 imap 0.000061 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000064 RET gettimeofday 0 26208 imap 0.000078 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.000082 RET kevent 0 26208 imap 0.000100 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000103 RET gettimeofday 0 26208 imap 0.000116 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000119 RET gettimeofday 0 26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.474998 RET kevent 0 [snip]
This looks quite normal... I am not really sure about what is happening. Is the CPU consumed by in-kernel operation? (systat -vm or top's output would be helpful)
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Taras Savchuk wrote:
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
I caught imap while it was eating a lot of CPU (~80%) and ran ktrace -t + -p <bad process pid>. Here is the kdump -E output; there are many gettimeofday() calls, is that normal?
mail2# kdump -E -f ktrace.out | less 26208 imap 0.000000 RET kevent 0 26208 imap 0.000042 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000046 RET gettimeofday 0 26208 imap 0.000061 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000064 RET gettimeofday 0 26208 imap 0.000078 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.000082 RET kevent 0 26208 imap 0.000100 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000103 RET gettimeofday 0 26208 imap 0.000116 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000119 RET gettimeofday 0 26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.474998 RET kevent 0
[snip]
This looks quite normal... I am not really sure about what is happening. Is the CPU consumed by in-kernel operation? (systat -vm or top's output would be helpful)
Cheers,
Is 0.5 seconds for a kevent call normal?
26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998)
26208 imap 0.474998 RET kevent 0
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
Taras Savchuk wrote:
LI Xin wrote:
Taras Savchuk wrote:
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
I caught imap while it was eating a lot of CPU (~80%) and ran ktrace -t + -p <bad process pid>. Here is the kdump -E output; there are many gettimeofday() calls, is that normal?
mail2# kdump -E -f ktrace.out | less 26208 imap 0.000000 RET kevent 0 26208 imap 0.000042 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000046 RET gettimeofday 0 26208 imap 0.000061 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000064 RET gettimeofday 0 26208 imap 0.000078 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.000082 RET kevent 0 26208 imap 0.000100 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000103 RET gettimeofday 0 26208 imap 0.000116 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000119 RET gettimeofday 0 26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.474998 RET kevent 0
[snip]
This looks quite normal... I am not really sure about what is happening. Is the CPU consumed by in-kernel operation? (systat -vm or top's output would be helpful)
Cheers,
Is 0.5 seconds for a kevent call normal?
26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998)
26208 imap 0.474998 RET kevent 0
It seems that the call is from lib/ioloop-kqueue.c's io_loop_handler_run(), but I did not really follow the logic there. Personally, I think io_loop_get_wait_time() could be improved in some (tricky?) way so that we get better performance on FreeBSD without paying the penalty of its expensive gettimeofday() call.
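To illustrate the idea, here is a sketch (assumed names and structure, not Dovecot's actual io_loop_handler_run()/io_loop_get_wait_time() code) of how a kqueue loop normally turns "time until the next pending timer" into the timespec passed to kevent(), so that it blocks in the kernel instead of spinning:

/* Sketch only: deriving the kevent() timeout from the next pending timer. */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <time.h>
#include <stdio.h>

static void run_one_iteration(int kq, const struct timeval *next_timer)
{
    struct kevent events[16];
    struct timespec ts, *tsp = NULL;        /* NULL => block until an event */
    int n;

    if (next_timer != NULL) {
        struct timeval now, diff;

        gettimeofday(&now, NULL);
        timersub(next_timer, &now, &diff);
        if (diff.tv_sec < 0)
            timerclear(&diff);              /* timer already due: poll once */
        ts.tv_sec = diff.tv_sec;
        ts.tv_nsec = diff.tv_usec * 1000;
        tsp = &ts;
    }

    n = kevent(kq, NULL, 0, events, 16, tsp);
    if (n < 0)
        perror("kevent");
    /* ...dispatch the n returned events, then run any expired timers... */
}

If the computed wait time keeps coming out as zero, every iteration pays for two gettimeofday() calls plus a kevent() call that returns immediately, which would produce a pattern very much like the trace above.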
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Taras Savchuk wrote:
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
I caught imap while it was eating a lot of CPU (~80%) and ran ktrace -t + -p <bad process pid>. Here is the kdump -E output; there are many gettimeofday() calls, is that normal?
mail2# kdump -E -f ktrace.out | less 26208 imap 0.000000 RET kevent 0 26208 imap 0.000042 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000046 RET gettimeofday 0 26208 imap 0.000061 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000064 RET gettimeofday 0 26208 imap 0.000078 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.000082 RET kevent 0 26208 imap 0.000100 CALL gettimeofday(0x80c2c60,0x80c2c68) 26208 imap 0.000103 RET gettimeofday 0 26208 imap 0.000116 CALL gettimeofday(0xbfbfe9a0,0) 26208 imap 0.000119 RET gettimeofday 0 26208 imap 0.000131 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 26208 imap 0.474998 RET kevent 0
[snip]
This looks quite normal... I am not really sure about what is happening. Is the CPU consumed by in-kernel operation? (systat -vm or top's output would be helpful)
Cheers,
Here is the output of top and systat -vm:
mail2# top
last pid: 68406;  load averages: 1.09, 0.55, 0.38    up 1+09:40:06  03:18:34
139 processes: 2 running, 137 sleeping
CPU states:  7.1% user,  0.0% nice, 43.0% system,  0.0% interrupt, 49.9% idle
Mem: 942M Active, 777M Inact, 175M Wired, 85M Cache, 112M Buf, 24M Free
Swap: 2022M Total, 2022M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 61291 email 1 114 0 3528K 2120K RUN 0 3:09 85.50% imap 62547 email 1 4 0 6168K 2644K kqread 0 0:32 3.81% imap 60522 email 1 4 0 3416K 1860K kqread 0 0:23 0.24% imap 2083 10005 1 4 0 198M 95412K accept 0 27:40 0.00% httpd 2085 10005 1 4 0 198M 93952K accept 0 16:09 0.00% httpd 2084 10005 1 4 0 198M 94912K accept 0 16:01 0.00% httpd 2231 mysql 1 4 14 524M 409M sbwait 0 9:37 0.00% mysqld 1569 10003 1 4 0 153M 31936K accept 0 7:47 0.00% httpd 1570 10003 1 4 0 159M 38560K accept 0 7:46 0.00% httpd 1573 mysql 1 4 14 524M 409M sbwait 0 6:09 0.00% mysqld 2232 mysql 1 4 14 524M 409M sbwait 0 6:08 0.00% mysqld 2229 mysql 1 4 14 524M 409M sbwait 0 3:13 0.00% mysqld 1565 mysql 1 4 14 524M 409M sbwait 0 3:11 0.00% mysqld 1213 10001 1 4 0 171M 61252K accept 0 1:54 0.00% httpd 1212 10001 1 4 0 153M 43660K accept 0 1:51 0.00% httpd 1211 10001 1 4 0 153M 42052K accept 0 1:51 0.00% httpd 1010 elantech 1 20 0 155M 77364K lockf 0 1:38 0.00% httpd 1009 elantech 1 4 0 168M 93660K kqread 0 1:37 0.00% httpd 1011 elantech 1 20 0 161M 86424K lockf 0 1:33 0.00% httpd 2234 mysql 1 4 14 524M 409M sbwait 0 1:01 0.00% mysqld 2184 mysql 1 4 14 524M 409M sbwait 0 0:57 0.00% mysqld 2227 mysql 1 4 14 524M 409M sbwait 0 0:56 0.00% mysqld 1366 10002 1 20 0 161M 48900K lockf 0 0:39 0.00% httpd 1367 10002 1 4 0 153M 41364K kqread 0 0:38 0.00% httpd 662 mysql 1 20 14 524M 409M pause 0 0:13 0.00% mysqld 658 mysql 1 110 14 524M 409M select 0 0:09 0.00% mysqld 2230 mysql 1 4 14 524M 409M sbwait 0 0:09 0.00% mysqld 2233 mysql 1 4 14 524M 409M sbwait 0 0:08 0.00% mysqld 657 mysql 1 110 14 524M 409M select 0 0:08 0.00% mysqld 611 mysql 1 20 14 524M 409M pause 0 0:06 0.00% mysqld 405 bind 1 4 0 5672K 4420K select 0 0:04 0.00% named 613 root 1 4 0 1396K 964K kqread 0 0:03 0.00% dovecot 65753 root 1 96 0 2632K 1680K CPU0 0 0:03 0.00% top 632 root 1 4 0 2412K 1624K kqread 0 0:02 0.00% dovecot-auth 610 mysql 1 20 14 524M 409M pause 0 0:02 0.00% mysqld mail2# systat -vm 2 users Load 1.08 0.56 0.39 19 сен 03:18
Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  967396   18100  1858368    36796  111372  count
All 2021620   26420 83531252    51592          pages
                                                                   Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow     4078 total
     2       136      266  177 4433   89       133    179244 wire       16: bge
                                                       964268 act        20: ata
 50.0%Sys   0.0%Intr  0.0%User  0.0%Nice 50.0%Idl      795616 inact   2039 cpu0: time
|    |    |    |    |    |    |    |    |    |          86660 cache   2039 cpu1: time
=========================                               24712 free
                                                               daefr
Namei         Name-cache    Dir-cache                          prcfr
    Calls     hits    %     hits   %                           react
        4        4  100                                        pdwake
                                                           133 zfod         pdpgs
Disks   ad4                                                    ozfod        intrn
KB/t   0.00                                                    %slo-z    114464 buf
tps       0                                                    tfree         97 dirtybuf
MB/s   0.00                                                             100000 desiredvnodes
% busy    0                                                              87066 numvnodes
                                                                          24995 freevnodes
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
Taras Savchuk wrote: [snip]
Here is the output of top and systat -vm:
mail2# top last pid: 68406; load averages: 1.09, 0.55, 0.38 up 1+09:40:06 03:18:34 139 processes: 2 running, 137 sleeping CPU states: 7.1% user, 0.0% nice, 43.0% system, 0.0% interrupt, 49.9% idle Mem: 942M Active, 777M Inact, 175M Wired, 85M Cache, 112M Buf, 24M Free Swap: 2022M Total, 2022M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 61291 email 1 114 0 3528K 2120K RUN 0 3:09 85.50% imap 62547 email 1 4 0 6168K 2644K kqread 0 0:32 3.81% imap 60522 email 1 4 0 3416K 1860K kqread 0 0:23 0.24% imap 2083 10005 1 4 0 198M 95412K accept 0 27:40 0.00% httpd 2085 10005 1 4 0 198M 93952K accept 0 16:09 0.00% httpd 2084 10005 1 4 0 198M 94912K accept 0 16:01 0.00% httpd 2231 mysql 1 4 14 524M 409M sbwait 0 9:37 0.00% mysqld 1569 10003 1 4 0 153M 31936K accept 0 7:47 0.00% httpd 1570 10003 1 4 0 159M 38560K accept 0 7:46 0.00% httpd 1573 mysql 1 4 14 524M 409M sbwait 0 6:09 0.00% mysqld 2232 mysql 1 4 14 524M 409M sbwait 0 6:08 0.00% mysqld 2229 mysql 1 4 14 524M 409M sbwait 0 3:13 0.00% mysqld 1565 mysql 1 4 14 524M 409M sbwait 0 3:11 0.00% mysqld 1213 10001 1 4 0 171M 61252K accept 0 1:54 0.00% httpd 1212 10001 1 4 0 153M 43660K accept 0 1:51 0.00% httpd 1211 10001 1 4 0 153M 42052K accept 0 1:51 0.00% httpd 1010 elantech 1 20 0 155M 77364K lockf 0 1:38 0.00% httpd 1009 elantech 1 4 0 168M 93660K kqread 0 1:37 0.00% httpd 1011 elantech 1 20 0 161M 86424K lockf 0 1:33 0.00% httpd 2234 mysql 1 4 14 524M 409M sbwait 0 1:01 0.00% mysqld 2184 mysql 1 4 14 524M 409M sbwait 0 0:57 0.00% mysqld 2227 mysql 1 4 14 524M 409M sbwait 0 0:56 0.00% mysqld 1366 10002 1 20 0 161M 48900K lockf 0 0:39 0.00% httpd 1367 10002 1 4 0 153M 41364K kqread 0 0:38 0.00% httpd 662 mysql 1 20 14 524M 409M pause 0 0:13 0.00% mysqld 658 mysql 1 110 14 524M 409M select 0 0:09 0.00% mysqld 2230 mysql 1 4 14 524M 409M sbwait 0 0:09 0.00% mysqld 2233 mysql 1 4 14 524M 409M sbwait 0 0:08 0.00% mysqld 657 mysql 1 110 14 524M 409M select 0 0:08 0.00% mysqld 611 mysql 1 20 14 524M 409M pause 0 0:06 0.00% mysqld 405 bind 1 4 0 5672K 4420K select 0 0:04 0.00% named 613 root 1 4 0 1396K 964K kqread 0 0:03 0.00% dovecot 65753 root 1 96 0 2632K 1680K CPU0 0 0:03 0.00% top 632 root 1 4 0 2412K 1624K kqread 0 0:02 0.00% dovecot-auth 610 mysql 1 20 14 524M 409M pause 0 0:02 0.00% mysqld mail2# systat -vm 2 users Load 1.08 0.56 0.39 19 сен 03:18
Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER Tot Share Tot Share Free in out in out Act 967396 18100 1858368 36796 111372 count All 2021620 26420 83531252 51592 pages Interrupts Proc:r p d s w Csw Trp Sys Int Sof Flt cow 4078 total 2 136 266 177 4433 89 133 179244 wire 16: bge 964268 act 20: ata 50.0%Sys 0.0%Intr 0.0%User 0.0%Nice 50.0%Idl 795616 inact 2039 cpu0: time | | | | | | | | | | 86660 cache 2039 cpu1: time ========================= 24712 free daefr Namei Name-cache Dir-cache prcfr Calls hits % hits % react 4 4 100 pdwake 133 zfod pdpgs Disks ad4 ozfod intrn KB/t 0.00 %slo-z 114464 buf tps 0 tfree 97 dirtybuf MB/s 0.00 100000 desiredvnodes % busy 0 87066 numvnodes 24995 freevnodes
Do you actually have 4 CPUs, or are they hyperthreads of two CPUs (or cores)? Trying to disable hyperthreading in the BIOS may help reduce overall interrupts, and this looks like something bad is happening in the kernel.
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Taras Savchuk wrote: [snip]
Here is the output of top and systat -vm:
mail2# top last pid: 68406; load averages: 1.09, 0.55, 0.38 up 1+09:40:06 03:18:34 139 processes: 2 running, 137 sleeping CPU states: 7.1% user, 0.0% nice, 43.0% system, 0.0% interrupt, 49.9% idle Mem: 942M Active, 777M Inact, 175M Wired, 85M Cache, 112M Buf, 24M Free Swap: 2022M Total, 2022M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 61291 email 1 114 0 3528K 2120K RUN 0 3:09 85.50% imap 62547 email 1 4 0 6168K 2644K kqread 0 0:32 3.81% imap 60522 email 1 4 0 3416K 1860K kqread 0 0:23 0.24% imap 2083 10005 1 4 0 198M 95412K accept 0 27:40 0.00% httpd 2085 10005 1 4 0 198M 93952K accept 0 16:09 0.00% httpd 2084 10005 1 4 0 198M 94912K accept 0 16:01 0.00% httpd 2231 mysql 1 4 14 524M 409M sbwait 0 9:37 0.00% mysqld 1569 10003 1 4 0 153M 31936K accept 0 7:47 0.00% httpd 1570 10003 1 4 0 159M 38560K accept 0 7:46 0.00% httpd 1573 mysql 1 4 14 524M 409M sbwait 0 6:09 0.00% mysqld 2232 mysql 1 4 14 524M 409M sbwait 0 6:08 0.00% mysqld 2229 mysql 1 4 14 524M 409M sbwait 0 3:13 0.00% mysqld 1565 mysql 1 4 14 524M 409M sbwait 0 3:11 0.00% mysqld 1213 10001 1 4 0 171M 61252K accept 0 1:54 0.00% httpd 1212 10001 1 4 0 153M 43660K accept 0 1:51 0.00% httpd 1211 10001 1 4 0 153M 42052K accept 0 1:51 0.00% httpd 1010 elantech 1 20 0 155M 77364K lockf 0 1:38 0.00% httpd 1009 elantech 1 4 0 168M 93660K kqread 0 1:37 0.00% httpd 1011 elantech 1 20 0 161M 86424K lockf 0 1:33 0.00% httpd 2234 mysql 1 4 14 524M 409M sbwait 0 1:01 0.00% mysqld 2184 mysql 1 4 14 524M 409M sbwait 0 0:57 0.00% mysqld 2227 mysql 1 4 14 524M 409M sbwait 0 0:56 0.00% mysqld 1366 10002 1 20 0 161M 48900K lockf 0 0:39 0.00% httpd 1367 10002 1 4 0 153M 41364K kqread 0 0:38 0.00% httpd 662 mysql 1 20 14 524M 409M pause 0 0:13 0.00% mysqld 658 mysql 1 110 14 524M 409M select 0 0:09 0.00% mysqld 2230 mysql 1 4 14 524M 409M sbwait 0 0:09 0.00% mysqld 2233 mysql 1 4 14 524M 409M sbwait 0 0:08 0.00% mysqld 657 mysql 1 110 14 524M 409M select 0 0:08 0.00% mysqld 611 mysql 1 20 14 524M 409M pause 0 0:06 0.00% mysqld 405 bind 1 4 0 5672K 4420K select 0 0:04 0.00% named 613 root 1 4 0 1396K 964K kqread 0 0:03 0.00% dovecot 65753 root 1 96 0 2632K 1680K CPU0 0 0:03 0.00% top 632 root 1 4 0 2412K 1624K kqread 0 0:02 0.00% dovecot-auth 610 mysql 1 20 14 524M 409M pause 0 0:02 0.00% mysqld mail2# systat -vm 2 users Load 1.08 0.56 0.39 19 сен 03:18
Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER Tot Share Tot Share Free in out in out Act 967396 18100 1858368 36796 111372 count All 2021620 26420 83531252 51592 pages Interrupts Proc:r p d s w Csw Trp Sys Int Sof Flt cow 4078 total 2 136 266 177 4433 89 133 179244 wire 16: bge 964268 act 20: ata 50.0%Sys 0.0%Intr 0.0%User 0.0%Nice 50.0%Idl 795616 inact 2039 cpu0: time | | | | | | | | | | 86660 cache 2039 cpu1: time ========================= 24712 free daefr Namei Name-cache Dir-cache prcfr Calls hits % hits % react 4 4 100 pdwake 133 zfod pdpgs Disks ad4 ozfod intrn KB/t 0.00 %slo-z 114464 buf tps 0 tfree 97 dirtybuf MB/s 0.00 100000 desiredvnodes % busy 0 87066 numvnodes 24995 freevnodes
Do you actually have 4 CPUs, or are they hyperthreads of two CPUs (or cores)? Trying to disable hyperthreading in the BIOS may help reduce overall interrupts, and this looks like something bad is happening in the kernel.
Cheers,
It's a P4 3.06 GHz with HT.
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
LI Xin wrote:
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
My ktrace/kdump output is huge. Here is a typical excerpt:
44146 imap 0.005657 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005661 RET gettimeofday 0 44146 imap Events dropped. 44146 imap 0.005667 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005671 RET kevent 1 44146 imap 0.005674 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005677 RET gettimeofday 0 44146 imap 0.005680 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005683 RET kevent 1 44146 imap 0.005686 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005689 RET gettimeofday 0 44146 imap 0.005692 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005695 RET gettimeofday 0 44146 imap 0.005697 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005701 RET kevent 1 44146 imap 0.005704 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005708 RET gettimeofday 0 44146 imap 0.005710 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005713 RET kevent 1 44146 imap 0.005716 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005720 RET gettimeofday 0 44146 imap 0.005722 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005725 RET gettimeofday 0 44146 imap 0.005728 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap Events dropped. 44146 imap 0.005763 RET kevent 1 44146 imap Events dropped. 44146 imap 0.005768 RET gettimeofday 0 44146 imap 0.005771 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005774 RET kevent 1 44146 imap 0.005777 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005780 RET gettimeofday 0 44146 imap 0.005782 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005786 RET gettimeofday 0 44146 imap 0.005789 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005793 RET kevent 1 44146 imap 0.005795 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005798 RET gettimeofday 0 44146 imap 0.005800 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap Events dropped.
I haven't been able to get a .core file yet. More info as I catch it...
--Paul
Paul Lathrop wrote:
LI Xin wrote:
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
My ktrace/kdump output is huge. Here is a typical excerpt:
44146 imap 0.005657 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005661 RET gettimeofday 0 44146 imap Events dropped. 44146 imap 0.005667 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005671 RET kevent 1 44146 imap 0.005674 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005677 RET gettimeofday 0 44146 imap 0.005680 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005683 RET kevent 1 44146 imap 0.005686 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005689 RET gettimeofday 0 44146 imap 0.005692 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005695 RET gettimeofday 0 44146 imap 0.005697 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005701 RET kevent 1 44146 imap 0.005704 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005708 RET gettimeofday 0 44146 imap 0.005710 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005713 RET kevent 1 44146 imap 0.005716 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005720 RET gettimeofday 0 44146 imap 0.005722 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005725 RET gettimeofday 0 44146 imap 0.005728 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap Events dropped. 44146 imap 0.005763 RET kevent 1 44146 imap Events dropped. 44146 imap 0.005768 RET gettimeofday 0 44146 imap 0.005771 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005774 RET kevent 1 44146 imap 0.005777 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005780 RET gettimeofday 0 44146 imap 0.005782 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005786 RET gettimeofday 0 44146 imap 0.005789 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005793 RET kevent 1 44146 imap 0.005795 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005798 RET gettimeofday 0 44146 imap 0.005800 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap Events dropped.
I haven't been able to get a .core file yet. More info as I catch it...
Er... the EPERM here is not normal. Is there any log indicating some problem when this occurs?
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
My ktrace/kdump output is huge. Here is a typical excerpt:
44146 imap 0.005657 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005661 RET gettimeofday 0 44146 imap Events dropped. 44146 imap 0.005667 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005671 RET kevent 1 44146 imap 0.005674 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005677 RET gettimeofday 0 44146 imap 0.005680 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005683 RET kevent 1 44146 imap 0.005686 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005689 RET gettimeofday 0 44146 imap 0.005692 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005695 RET gettimeofday 0 44146 imap 0.005697 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005701 RET kevent 1 44146 imap 0.005704 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005708 RET gettimeofday 0 44146 imap 0.005710 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005713 RET kevent 1 44146 imap 0.005716 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005720 RET gettimeofday 0 44146 imap 0.005722 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005725 RET gettimeofday 0 44146 imap 0.005728 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap Events dropped. 44146 imap 0.005763 RET kevent 1 44146 imap Events dropped. 44146 imap 0.005768 RET gettimeofday 0 44146 imap 0.005771 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap 0.005774 RET kevent 1 44146 imap 0.005777 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005780 RET gettimeofday 0 44146 imap 0.005782 CALL gettimeofday(0x80d00a0,0x80d00a8) 44146 imap 0.005786 RET gettimeofday 0 44146 imap 0.005789 CALL kevent(0x5,0,0,0xbfbfe890,0x1,0xbfbfe888) 44146 imap 0.005793 RET kevent 1 44146 imap 0.005795 CALL gettimeofday(0xbfbfe8f0,0) 44146 imap 0.005798 RET gettimeofday 0 44146 imap 0.005800 CALL kevent(0x4,0,0,0x80df000,0x4,0xbfbfe8e8) 44146 imap Events dropped.
I haven't been able to get a .core file yet. More info as I catch it...
Er... the EPERM here is not normal. Is there any log indicating some problem when this occurs?
Cheers,
Sorry, EPERM? Where do you see that? What does it mean?
As for logs, the only errors I see (on a daily basis, though) are:
+pid 42498 (imap), uid 1016: exited on signal 11
+pid 38323 (imap), uid 1016: exited on signal 11
+pid 43138 (imap), uid 1016: exited on signal 11
+pid 43135 (imap), uid 1016: exited on signal 11
+pid 43804 (imap), uid 1016: exited on signal 11
+pid 38340 (imap), uid 1016: exited on signal 11
Last night, I got this:
+Out of ktrace request objects.
+pid 45376 (imap), uid 1016: exited on signal 11
+pid 45403 (imap), uid 1016: exited on signal 11
+pid 45518 (imap), uid 1016: exited on signal 11
+pid 46201 (imap), uid 1016: exited on signal 11
+pid 47169 (imap), uid 1016: exited on signal 11
+pid 47197 (imap), uid 1016: exited on signal 11
And dovecot just died completely. I restarted it this morning.
Hope that helps track this down.
--Paul
44146 imap 0.005793 RET kevent 1
I think what LI Xin was looking at is the return value of kevent(2) here. EPERM is the 1 in the last column.
Sorry, EPERM? Where do you see that? What does it mean?
From man intro(2):
1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.
Looking at man kevent(2), there is a selection of errors that can be returned from this system call, but EPERM isn't one of them.
-- Dominic
On Wed, 2006-09-20 at 10:45 +0100, Dominic Marks wrote:
44146 imap 0.005793 RET kevent 1
I think what LI Xin was looking at is the return value of kevent(2) here. EPERM is the 1 in the last column.
Sorry, EPERM? Where do you see that? What does it mean?
From man intro(2):
1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.
Looking at man kevent(2), there is a selection of errors that can be returned from this system call, but EPERM isn't one of them.
Why would the 1 mean EPERM? See kevent()'s man page:
"The kevent() system call returns the number of events placed in the eventlist, up to the value given by nevents."
Timo Sirainen wrote:
On Wed, 2006-09-20 at 10:45 +0100, Dominic Marks wrote:
44146 imap 0.005793 RET kevent 1
I think what LI Xin was looking at is the return value of kevent(2) here. EPERM is the 1 in the last column.
Sorry, EPERM? Where do you see that? What does it mean?
From man intro(2):
1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.
Looking at man kevent(2), there is a selection of errors that can be returned from this system call, but EPERM isn't one of them.
Why would the 1 mean EPERM? See kevent()'s man page:
"The kevent() system call returns the number of events placed in the eventlist, up to the value given by nevents."
Hmm, well I was speculating. Thanks for pointing that out.
Dominic
Timo Sirainen wrote:
On Wed, 2006-09-20 at 10:45 +0100, Dominic Marks wrote:
44146 imap 0.005793 RET kevent 1
I think what LI Xin was looking at is the return value of kevent(2) here. EPERM is the 1 in the last column.
Sorry, EPERM? Where do you see that? What does it mean?
From man intro(2):
1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.
Looking at man kevent(2), there is a selection of errors that can be returned from this system call, but EPERM isn't one of them.
Why would the 1 mean EPERM? See kevent()'s man page:
"The kevent() system call returns the number of events placed in the eventlist, up to the value given by nevents."
You are right. Sorry for the confusion...
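For reference, interpreting kevent()'s return value looks like this minimal sketch (illustrative only): real errors come back as -1 with errno set, 0 means the timeout expired, and a positive value is the number of events returned, so "RET kevent 1" in the traces simply means one event was delivered.

#include <sys/types.h>
#include <sys/event.h>
#include <time.h>
#include <err.h>
#include <stdio.h>

static void wait_for_events(int kq)
{
    struct kevent ev[5];
    struct timespec timeout = { 1, 0 };        /* wait at most one second */
    int n = kevent(kq, NULL, 0, ev, 5, &timeout);

    if (n == -1)
        err(1, "kevent");                      /* error: -1 plus errno */
    else if (n == 0)
        printf("timed out, no events\n");
    else
        printf("%d event(s) ready\n", n);      /* e.g. "RET kevent 1" */
}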
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Paul Lathrop wrote:
Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
I have also experienced this behavior, which is quite bothersome because I sold my supervisor on Dovecot by talking about performance and low resource requirements...
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
~20,000 calls to gettimeofday() per second. Is that normal?
mail2# cat k.dov.l.txt | grep CALL | grep gettimeofday | wc -l
1195552
mail2# head -n 1 k.dov.l.txt
20910 imap 0.000000 CALL gettimeofday(0x80c2c60,0x80c2c68)
mail2# tail -n 1 k.dov.l.txt
20910 imap 51.826065 RET gettimeofday 0
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
Taras Savchuk wrote: [snip]
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
~20,000 calls to gettimeofday() per second. Is that normal?
No! What's more, gettimeofday() is much more expensive on FreeBSD than its Linux counterpart, as it provides higher-resolution time, which is not useful for many applications.
My guess is that your imap instance is stuck in a state where either kevent() is called with a zero ts structure (which is effectively a poll) every time, or some other weird thing happened.
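To make the "zero ts" point concrete, here is a sketch (not Dovecot code): with a zero timespec, kevent() returns immediately whether or not anything is ready, so a loop that keeps computing a zero timeout degenerates into a busy poll, while a NULL timeout sleeps in the kernel (the "kqread" state in top).

#include <sys/types.h>
#include <sys/event.h>
#include <time.h>

/* Returns immediately; calling this in a tight loop is a busy poll. */
static int poll_kqueue(int kq, struct kevent *ev, int nev)
{
    struct timespec zero = { 0, 0 };
    return kevent(kq, NULL, 0, ev, nev, &zero);
}

/* Sleeps in the kernel until at least one event arrives. */
static int block_on_kqueue(int kq, struct kevent *ev, int nev)
{
    return kevent(kq, NULL, 0, ev, nev, NULL);
}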
mail2# cat k.dov.l.txt | grep CALL | grep gettimeofday | wc -l
1195552
mail2# head -n 1 k.dov.l.txt
20910 imap 0.000000 CALL gettimeofday(0x80c2c60,0x80c2c68)
mail2# tail -n 1 k.dov.l.txt
20910 imap 51.826065 RET gettimeofday 0
Are these gettimeofday() calls made one right after another, or are there any other calls between them?
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
LI Xin wrote:
Taras Savchuk wrote: [snip]
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
~20,000 calls to gettimeofday() per second. Is that normal?
No! What's more, gettimeofday() is much more expensive on FreeBSD than its Linux counterpart, as it provides higher-resolution time, which is not useful for many applications.
My guess is that your imap instance is stuck in a state where either kevent() is called with a zero ts structure (which is effectively a poll) every time, or some other weird thing happened.
mail2# cat k.dov.l.txt | grep CALL | grep gettimeofday | wc -l
1195552
mail2# head -n 1 k.dov.l.txt
20910 imap 0.000000 CALL gettimeofday(0x80c2c60,0x80c2c68)
mail2# tail -n 1 k.dov.l.txt
20910 imap 51.826065 RET gettimeofday 0
Are these gettimeofday() calls made one right after another, or are there any other calls between them?
Cheers,
Here are statistics for the same ktrace file (52 seconds):
mail2# cat k.dov.l.txt | wc -l
3991592
mail2# cat k.dov.l.txt | grep kevent | grep CALL | wc -l
797033
mail2# cat k.dov.l.txt | grep kevent | grep RET | wc -l
797033
mail2# cat k.dov.l.txt | grep gettimeofday | grep CALL | wc -l
1195552
mail2# cat k.dov.l.txt | grep gettimeofday | grep RET | wc -l
1195556
mail2#
and here is a typical chunk:
20910 imap 51.825657 RET kevent 1 20910 imap 51.825712 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825715 RET gettimeofday 0 20910 imap 51.825734 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825737 RET gettimeofday 0 20910 imap 51.825755 CALL kevent(0x5,0,0,0xbfbfe940,0x1,0xbfbfe938) 20910 imap 51.825759 RET kevent 1 20910 imap 51.825778 CALL gettimeofday(0xbfbfe9a0,0) 20910 imap 51.825782 RET gettimeofday 0 20910 imap 51.825801 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 20910 imap 51.825804 RET kevent 1 20910 imap 51.825823 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825826 RET gettimeofday 0 20910 imap 51.825844 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825847 RET gettimeofday 0 20910 imap 51.825864 CALL kevent(0x5,0,0,0xbfbfe940,0x1,0xbfbfe938) 20910 imap 51.825869 RET kevent 1 20910 imap 51.825888 CALL gettimeofday(0xbfbfe9a0,0) 20910 imap 51.825892 RET gettimeofday 0 20910 imap 51.825910 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 20910 imap 51.825914 RET kevent 1 20910 imap 51.825932 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825936 RET gettimeofday 0 20910 imap 51.825953 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.825957 RET gettimeofday 0 20910 imap 51.825974 CALL kevent(0x5,0,0,0xbfbfe940,0x1,0xbfbfe938) 20910 imap 51.825978 RET kevent 1 20910 imap 51.825997 CALL gettimeofday(0xbfbfe9a0,0) 20910 imap 51.826000 RET gettimeofday 0 20910 imap 51.826019 CALL kevent(0x4,0,0,0x80d1000,0x5,0xbfbfe998) 20910 imap 51.826022 RET kevent 1 20910 imap 51.826041 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.826044 RET gettimeofday 0 20910 imap 51.826062 CALL gettimeofday(0x80c2c60,0x80c2c68) 20910 imap 51.826065 RET gettimeofday 0
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development +7 (495) 589 68 81 +7 (926) 575 22 11 http://www.elantech.ru info@elantech.ru
LI Xin wrote:
Taras Savchuk wrote: [snip]
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
~20,000 calls to gettimeofday() per second. Is that normal?
No! What's more, gettimeofday() is much more expensive on FreeBSD than its Linux counterpart, as it provides higher-resolution time, which is not useful for many applications.
My guess is that your imap instance is stuck in a state where either kevent() is called with a zero ts structure (which is effectively a poll) every time, or some other weird thing happened.
mail2# cat k.dov.l.txt | grep CALL | grep gettimeofday | wc -l
1195552
mail2# head -n 1 k.dov.l.txt
20910 imap 0.000000 CALL gettimeofday(0x80c2c60,0x80c2c68)
mail2# tail -n 1 k.dov.l.txt
20910 imap 51.826065 RET gettimeofday 0
Are these gettimeofday() calls made one right after another, or are there any other calls between them?
Cheers,
I can do additional testing if someone gives me some tips on what to do.
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
LI Xin wrote, On 20.9.2006 17:36:
Taras Savchuk wrote: [snip]
Could you get some output from ktrace/kdump? Additionally, forcing the process to save a coredump and investigating its .core with gdb would help track down the issue...
Note that when using kqueue the process should be in the "kqread" state if it is waiting for a kqueue event. If the process is stuck in the "RUN" state, then a coredump would be more helpful.
Cheers,
~20,000 calls to gettimeofday() per second. Is that normal?
No! What's more, gettimeofday() is much more expensive on FreeBSD than its Linux counterpart, as it provides higher-resolution time, which is not useful for many applications.
True.
My guess is that your imap instance is stuck in a state where either kevent() is called with a zero ts structure (which is effectively a poll) every time, or some other weird thing happened.
Yes, that's exactly what happens. I tried to debug this once.
But first some introduction. There are two kqueues: one for directory change notification (descriptor 5), and a second one for polling for events on the other handles, including the directory change notification kqueue itself (descriptor 4). Now, two strange things happen when you see the 100% load: the time until the next timer run is always zero or close to zero, and descriptor 5 always seems to be signaling some event. I never found out why either of these two things happens.
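A sketch of that layout (assumed structure and names, not Dovecot's actual code): a dedicated kqueue watches the Maildir directories with EVFILT_VNODE, and its descriptor is in turn registered for EVFILT_READ in the main kqueue, so the main loop wakes up whenever a directory changes.

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <fcntl.h>
#include <err.h>

int setup_dirnotify(int main_kq, const char *maildir_path)
{
    struct kevent kev;
    int notify_kq = kqueue();                  /* plays the "descriptor 5" role */
    int dir_fd = open(maildir_path, O_RDONLY); /* directory being watched */

    if (notify_kq == -1 || dir_fd == -1)
        err(1, "kqueue/open");

    /* watch the directory itself for added/removed/renamed files */
    EV_SET(&kev, dir_fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
           NOTE_WRITE | NOTE_EXTEND | NOTE_DELETE | NOTE_RENAME, 0, NULL);
    if (kevent(notify_kq, &kev, 1, NULL, 0, NULL) == -1)
        err(1, "kevent(EVFILT_VNODE)");

    /* make the notify kqueue itself readable from the main loop */
    EV_SET(&kev, notify_kq, EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent(main_kq, &kev, 1, NULL, 0, NULL) == -1)
        err(1, "kevent(EVFILT_READ)");

    return notify_kq;
}

One plausible way to get "descriptor 5 always signaling" is if the pending events inside the notify kqueue are never drained, since the outer EVFILT_READ registration is level-triggered by default; that is speculation, though, not a confirmed diagnosis.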
mail2# cat k.dov.l.txt | grep CALL | grep gettimeofday | wc -l
1195552
mail2# head -n 1 k.dov.l.txt
20910 imap 0.000000 CALL gettimeofday(0x80c2c60,0x80c2c68)
mail2# tail -n 1 k.dov.l.txt
20910 imap 51.826065 RET gettimeofday 0
Are these gettimeofday() calls made one right after another, or are there any other calls between them?
Cheers,
-- VH
On Wed, 2006-09-13 at 15:54 +0400, Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
Oh, I thought I had fixed this already for 1.0rc7, but it looks like it got fixed only after..:
Hi, Timo,
Timo Sirainen wrote:
On Wed, 2006-09-13 at 15:54 +0400, Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
Oh, I thought I had fixed this already for 1.0rc7, but it looks like it got fixed only after..:
I am not really convinced by the change... Are you sure that EV_ADD'ing twice is the culprit for the EPERM, and that it is not a FreeBSD bug? :-)
Cheers,
Xin LI delphij@delphij.net http://www.delphij.net/ FreeBSD - The Power to Serve!
Timo Sirainen wrote:
On Wed, 2006-09-13 at 15:54 +0400, Taras Savchuk wrote:
Hi to ALL!
I have dovecot-1.0r7 installed on FreeBSD 6.1, using kqueue and Maildir (<20 mailboxes, <0.5 GB total). Periodically the CPU load of the imap processes increases to 60-80%. Is this normal behavior or not? Has anybody else had this kind of problem?
Oh, I thought I had fixed this already for 1.0rc7, but it looks like it got fixed only after..:
I've installed dovecot_rc26, but it seems I still have the same problem:
CPU states:  1.7% user,  0.0% nice, 42.2% system,  0.0% interrupt, 56.1% idle
Mem: 1558M Active, 144M Inact, 193M Wired, 56M Cache, 112M Buf, 51M Free
Swap: 2022M Total, 1032K Used, 2021M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME CPU COMMAND
46642 email 1 117 0 3552K 1968K RUN 0 1:15 75.20% imap
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
On Sun, 2007-03-11 at 01:48 +0300, Taras Savchuk wrote:
I've installed dovecot_rc26, but it seems I still have the same problem:
46642 email 1 117 0 3552K 1968K RUN 0 1:15 75.20% imap
Could you tell me what it's doing? http://wiki.dovecot.org/Debugging/ProcessTracing
Also a gdb backtrace could help:
gdb attach 46642
bt full
And did you check whether the problem goes away when not using kqueue?
Here are ktrace chunks and some statistics (the process uses ~70% CPU, its state alternates between RUN and kqread, and system CPU load is ~30-50%):
mail2# head -n 5 47651.ktrace.4.log.txt
47651 imap 0.000000 RET lstat 0
47651 imap 0.001906 CALL lstat(0x8111d60,0xbfbfe540)
47651 imap 0.001911 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1165899508.P13178Q191.mail2.elantech.ru:2,b"
47651 imap 0.001951 RET lstat 0
47651 imap 0.002315 CALL lstat(0x8111d60,0xbfbfe540)
mail2# tail -n 5 47651.ktrace.4.log.txt
47651 imap 16.514815 CALL lstat(0x8111ec0,0xbfbfe4e0)
47651 imap 16.514818 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1166721346.P51379Q78.mail2.elantech.ru:2,b"
47651 imap 16.514842 RET lstat 0
47651 imap 16.514868 CALL lstat(0x8111ec0,0xbfbfe4e0)
47651 imap 16.514871 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1166721942.P51379Q83.mail2.elantech.ru:2,b"
mail2# cat 47651.ktrace.4.log.txt | grep lstat | wc -l
530528
mail2# cat 47651.ktrace.4.log.txt | grep kevent | wc -l
12
mail2# cat 47651.ktrace.4.log.txt | grep gettimeofday | wc -l
30
mail2#
500K lstat calls in 17 seconds; isn't that too much?
Timo Sirainen wrote:
On Sun, 2007-03-11 at 01:48 +0300, Taras Savchuk wrote:
I've installed dovecot_rc26, but it seems I still have the same problem:
46642 email 1 117 0 3552K 1968K RUN 0 1:15 75.20% imap
Could you tell me what it's doing? http://wiki.dovecot.org/Debugging/ProcessTracing
Also a gdb backtrace could help:
gdb attach 46642
bt full
And did you check whether the problem goes away when not using kqueue?
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
On Sun, 2007-03-11 at 02:45 +0300, Taras Savchuk wrote:
47651 imap 0.001911 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1165899508.P13178Q191.mail2.elantech.ru:2,b" .. 500K lstat calls in 17 seconds; isn't that too much?
I bet you're using the quota plugin with the dirsize backend. It really sucks with maildir. I wonder if I should disallow it completely, because so many people seem to have that problem..
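To see why dirsize hurts, this is roughly what a dirsize-style quota check amounts to (illustrative sketch, not the plugin's actual code): every recalculation walks the Maildir and lstat()s each message file to sum up sizes, which is exactly the flood of lstat calls visible in the ktrace output.

#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>

/* Sum of message sizes in one Maildir cur/ or new/ directory. */
static long long dir_bytes(const char *dir)
{
    char path[1024];
    struct dirent *de;
    struct stat st;
    long long total = 0;
    DIR *d = opendir(dir);

    if (d == NULL)
        return 0;
    while ((de = readdir(d)) != NULL) {
        if (de->d_name[0] == '.')
            continue;
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (lstat(path, &st) == 0 && S_ISREG(st.st_mode))
            total += st.st_size;               /* one lstat per message */
    }
    closedir(d);
    return total;
}

Repeating this over every folder on every quota check scales with the number of messages rather than with the amount of change, hence the 500K lstat calls above.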
Yes, I'm using it :) Is there a plan to optimize quota behavior with maildir in the future?
Timo Sirainen wrote:
On Sun, 2007-03-11 at 02:45 +0300, Taras Savchuk wrote:
47651 imap 0.001911 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1165899508.P13178Q191.mail2.elantech.ru:2,b" .. 500K lstat calls in 17 seconds; isn't that too much?
I bet you're using the quota plugin with the dirsize backend. It really sucks with maildir. I wonder if I should disallow it completely, because so many people seem to have that problem..
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
On Sun, 2007-03-11 at 04:57 +0300, Taras Savchuk wrote:
Yes, I'm using it :) Is there a plan to optimize quota behavior with maildir in the future?
Use the maildir++ quota backend and everything should work much faster. http://wiki.dovecot.org/Quota/Maildir
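For anyone following along, the switch is a dovecot.conf change. The snippet below is only a sketch from memory of the 1.0-era syntax, and the storage value (in kilobytes) and exact option spellings are assumptions; the wiki page above is the authoritative reference.

# sketch only -- verify against http://wiki.dovecot.org/Quota/Maildir
protocol imap {
  mail_plugins = quota imap_quota
}
plugin {
  # The maildir++ backend keeps a running total in the Maildir's
  # maildirsize file instead of lstat()ing every message on each check.
  quota = maildir:storage=10240        # was: quota = dirsize:storage=10240
}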
Big thanks! Everything works great!
Timo Sirainen wrote:
On Sun, 2007-03-11 at 04:57 +0300, Taras Savchuk wrote:
Yes, I'm using it :) Is there a plan to optimize quota behavior with maildir in the future?
Use the maildir++ quota backend and everything should work much faster. http://wiki.dovecot.org/Quota/Maildir
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
Timo Sirainen wrote:
On Sun, 2007-03-11 at 02:45 +0300, Taras Savchuk wrote:
47651 imap 0.001911 NAMI "/var/email/elantech.ru/taras/Maildir/.Lists.dovecot/cur/1165899508.P13178Q191.mail2.elantech.ru:2,b" .. 500K lstat calls in 17 seconds; isn't that too much?
I bet you're using the quota plugin with the dirsize backend. It really sucks with maildir. I wonder if I should disallow it completely, because so many people seem to have that problem..
or just replace it in the docs, examples and wiki :-)
-- Levente "Si vis pacem para bellum!"
dovecot without kqueue:
CPU states:  5.3% user,  3.4% nice, 40.0% system,  0.8% interrupt, 50.7% idle
Mem: 1584M Active, 162M Inact, 184M Wired, 69M Cache, 112M Buf, 2996K Free
Swap: 2022M Total, 1336K Used, 2021M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME CPU COMMAND
6016 email 1 113 0 3376K 1840K RUN 0 1:13 63.72% imap
Timo Sirainen wrote:
On Sun, 2007-03-11 at 01:48 +0300, Taras Savchuk wrote:
I've installed dovecot_rc26, but it seems I still have the same problem:
46642 email 1 117 0 3552K 1968K RUN 0 1:15 75.20% imap
Could you tell me what it's doing? http://wiki.dovecot.org/Debugging/ProcessTracing
Also a gdb backtrace could help:
gdb attach 46642
bt full
And did you check whether the problem goes away when not using kqueue?
I recompiled do
-- Best regards, Taras Savchuk, Elantech LLC: IT outsourcing, web development http://www.elantech.ru +7 (495) 589 68 81 +7 (926) 575 22 11
participants (8)
- Dominic Marks
- Dominic Marks
- Farkas Levente
- LI Xin
- Paul Lathrop
- Taras Savchuk
- Timo Sirainen
- Václav Haisman