Re: [Dovecot] severe performance problem (mail cache related?)
On Fri, February 16, 2007 6:39 am, Timo Sirainen tss@iki.fi said:
On Fri, 2007-02-16 at 03:29 -0800, Dan Price wrote:
Well there you go, I'm a dope. I think I set mmap_disable=yes because I was seeing strange things happening where there would be thousands (millions?) of madvise(DONTNEED) calls-- seemingly forever in a loop. I have not yet tracked that problem down-- next time I see it, I will make sure to get to the bottom of it.
Actually the only place where madvise(MADV_DONTNEED) is called is in the file-cache code which is used only with mmap_disable=yes.
I'm not sure if this is related or not, but we set mmap_disable=yes last week and since then have seen what appears to be an occasional infinite loops. I have been trying to track down some more details before posting to the list, but here is what I have so far. There is no error logged in the dovecot log when this occurs, and it only stops by "kill -9" on the process...
# top -c PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 7832 virtual 25 0 580 496 376 R 71.5 0.0 70:59 0 imap [user@example.com 192.168.1.14]
# lsof -p 7832 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME imap 7832 virtual cwd DIR 147,0 600 2372995 /data1/mail/ex/example.com/us/user imap 7832 virtual rtd DIR 3,3 4096 2 / imap 7832 virtual txt REG 3,3 412996 789553 /usr/libexec/dovecot/imap imap 7832 virtual mem REG 3,3 33856 771355 /usr/lib/dovecot/lib10_quota_plugin.so imap 7832 virtual mem REG 3,3 7720 771364 /usr/lib/dovecot/imap/lib11_imap_quota_plugin.so imap 7832 virtual mem REG 3,3 106912 278626 /lib/ld-2.3.2.so imap 7832 virtual mem REG 3,3 14868 278653 /lib/libdl-2.3.2.so imap 7832 virtual mem REG 3,3 1571824 278633 /lib/tls/libc-2.3.2.so imap 7832 virtual 0u IPv4 -1140164370 TCP vip.mail17a.r4.iad.mlsrvr.com:imap->director2.r2.iad.emailsrvr.com:56195 (ESTABLISHED) imap 7832 virtual 1u IPv4 -1140164370 TCP vip.mail17a.r4.iad.mlsrvr.com:imap->director2.r2.iad.emailsrvr.com:56195 (ESTABLISHED) imap 7832 virtual 2w FIFO 0,5 3154802927 pipe imap 7832 virtual 3r CHR 1,9 599160 /dev/urandom imap 7832 virtual 4r FIFO 0,5 3154802928 pipe imap 7832 virtual 5w FIFO 0,5 3154802928 pipe imap 7832 virtual 6r FIFO 0,5 3154802929 pipe imap 7832 virtual 7w FIFO 0,5 3154802929 pipe imap 7832 virtual 8u REG 22,1 11112 6602923 /var/index/data1/mail/ex/example.com/us/user/.Elementos enviados/dovecot.index imap 7832 virtual 9u REG 22,1 50824 6602903 /var/index/data1/mail/ex/example.com/us/user/.Elementos enviados/dovecot.index.log imap 7832 virtual 10u REG 147,0 3074 6346008 /data1/mail/ex/example.com/us/user/maildirsize imap 7832 virtual 11w REG 147,0 905650 7062984 /data1/mail/ex/example.com/us/user/.Elementos enviados/tmp/1171483586.P7832Q0M770594.mail17a.r4.iad.mlsrvr.com
# strace -p 7832 read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable)
On Fri, 2007-02-16 at 07:40 -0500, Bill Boebel wrote:
I'm not sure if this is related or not, but we set mmap_disable=yes last week and since then have seen what appears to be an occasional infinite loops. I have been trying to track down some more details before posting to the list, but here is what I have so far. There is no error logged in the dovecot log when this occurs, and it only stops by "kill -9" on the process...
Getting gdb backtrace from the running process would help the most:
gdb attach <pid> bt
On Fri, 2007-02-16 at 07:40 -0500, Bill Boebel wrote:
7832 virtual 25 0 580 496 376 R 71.5 0.0 70:59 0 imap [user@example.com 192.168.1.14] .. read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable) read(0, 0x96ca1f0, 3992) = -1 EAGAIN (Resource temporarily unavailable)
Looks like my rc22 fix broke that then. It didn't loop forever, only until client sent the whole message or until it disconnected, but it did eat all the CPU while waiting.
Fix here:
http://dovecot.org/list/dovecot-cvs/2007-February/007721.html
I guess this is good enough reason for rc23.
participants (2)
-
Bill Boebel
-
Timo Sirainen