Hi all, we're having a bad day with email :)
I have a user who was complaining of poor performance today when opening mailboxes. Total time to open the box was about 1 minute. Upon truss'ing the imap process during an open, I saw the trace quoted below at the end of the email.
As you can see, for each "FETCH" line of response, we were seeing dovecot do a ton of work-- mmaping anon, lots of munmaps(), etc. -- mmap() and munmap() are not free.
The stack trace of at least one of the munmaps is captured below, as well-- deep inside of the mail caching code.
Then, I moved aside the user's .imap directory, assuming that forcing dovecot to rebuild its indices and caches could help-- and sure enough the user reported that mailbox open times had dropped to 18 seconds. I truss'd that open, briefly, and it was just a ton of preads at high speed.
So I don't know why this user was fine for a week, then suddenly hit this. This is a grave concern if I'm going to roll this out from 15 test users to 250 production users. Thanks for your help.
PS: Raw logs are also attached below.
-dp
write(1, " * 1 0 7 9 5 F E T C".., 2050) = 2050 mmap64(0x00000000, 7356416, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFD800000 munmap(0xFE602000, 1048576) = 0 munmap(0xFE502000, 1048576) = 0 munmap(0xFE402000, 1048576) = 0 munmap(0xFE302000, 1048576) = 0 munmap(0xFE202000, 1048576) = 0 munmap(0xFE102000, 1048576) = 0 munmap(0xFE002000, 1048576) = 0 munmap(0xFE000000, 8192) = 0 pread64(9, " n c o m @ o s s 1 >\n\0".., 8192, 7340032) = 8192 mmap64(0x00000000, 7364608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFE800000 munmap(0xFDE04000, 1048576) = 0 munmap(0xFDD04000, 1048576) = 0 munmap(0xFDC04000, 1048576) = 0 munmap(0xFDB04000, 1048576) = 0 munmap(0xFDA04000, 1048576) = 0 munmap(0xFD904000, 1048576) = 0 munmap(0xFD804000, 1048576) = 0 munmap(0xFD802000, 8192) = 0 munmap(0xFD800000, 8192) = 0 pread64(9, " l @ p l a t i n u m >\n".., 8192, 7348224) = 8192 mmap64(0x00000000, 7372800, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFE000000 munmap(0xFEE06000, 1048576) = 0 munmap(0xFED06000, 1048576) = 0 munmap(0xFEC06000, 1048576) = 0 munmap(0xFEB06000, 1048576) = 0 munmap(0xFEA06000, 1048576) = 0 munmap(0xFE906000, 1048576) = 0 munmap(0xFE806000, 1048576) = 0 munmap(0xFE802000, 16384) = 0 munmap(0xFE800000, 8192) = 0 pread64(9, "\0\0\0 0\0\0\0\0 C c : ".., 8192, 7356416) = 8192
ff1c3e14 munmap (100000, bdc000, fe6de000, 2dc000, 0, 1ef0) + 8 00080bb4 file_cache_set_size (c03a0, 0, bdc110, 408, 0, 110) + e0 00080cec file_cache_read (0, bdbd08, bdbd08, 408, 0, 684a8) + a4 000594a0 mail_cache_map (191268, bdbd08, 408, ffbff274, 3, fffffffc) + ec 0005ae20 mail_cache_get_record (0, bdbd08, ffbff274, ffbff274, 0, 0) + 20 0005b00c mail_cache_foreach_rec (1959c8, ffbff2ec, 5b3d8, 0, b5760, 7ed8) + 10 0005b2bc mail_cache_foreach (1, 47ae, 5b3d8, 0, 0, 0) + b4 0005b47c mail_cache_field_exists (0, 47ae, 0, c0000000, b4f10, 88184) + 84 0005b64c mail_cache_lookup_field (1959c8, b5898, 47ae, 0, 191d48, 195560) + c 000531ac index_mail_get_fixed_field (195dd8, 0, 195ea0, 4, ffffffff, ffffffff) + 4c 000545e8 index_mail_set_seq (195dd8, 47ae, 19a1c8, b4b4f, 0, 47ae) + bc 00070dac mail_set_seq (195dd8, 47ae, 1, c2ce8, 0, 39c) + c 0005770c index_storage_search_next (1, 195dd8, 1, bd778, 5769c, c8ae0) + 70 00072024 mailbox_search_next (19a1c8, 195dd8, 3, 2, 0, c2b50) + 14 0002b3bc imap_fetch (c4e20, c4de8, ffbff748, 5, c2eb0, 80000000) + 6c 00026210 cmd_fetch (1, c2e88, 0, 75696400, 26054, 80000000) + 1bc 00029000 cmd_uid (0, c22f8, 0, 8, 2a134, c22f8) + 8c 00029b58 client_handle_input (c2b94, c2e80, b5760, bd778, 7ef8, 2000000) + 138 00029ad0 client_handle_input (0, 2f05e860, b5760, bd778, 45d3834d, b5000) + b0 00029cb0 _client_input (c2b50, c2b50, 26ee, 0, 1, 0) + 68 00086338 io_loop_handler_run (c0348, 0, 0, ffbffa14, 4c, 80000000) + 140 00085c14 io_loop_run (c0348, ff212cb0, 1, b573c, bae80, ff215dbc) + 34 00032064 main (ffbffc18, b4400, b5000, b573c, c1314, ff3a0180) + 3f8 00024b18 _start (0, 0, 0, 0, 0, 0) + 5c
[in]
3 select "Mail/OpenSolaris/discuss" 4 UID fetch 1:* (FLAGS) 5 IDLE DONE 6 close 7 logout
[out, taking 1+ minutes]
- OK [RAWLOG TIMESTAMP] 2007-02-14 14:05:35
- FLAGS (\Answered \Flagged \Deleted \Seen \Draft NonJunk $Forwarded Junk)
- OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft NonJunk $Forwarded Junk \*)] Flags permitted.
- 20462 EXISTS
- 0 RECENT
- OK [UIDVALIDITY 1159980013] UIDs valid
- OK [UIDNEXT 20487] Predicted next UID 3 OK [READ-WRITE] Select completed.
- 1 FETCH (FLAGS (\Seen) UID 1)
- 2 FETCH (FLAGS (\Seen) UID 2)
- 3 FETCH (FLAGS (\Seen) UID 3)
- 4 FETCH (FLAGS (\Seen) UID 4)
- 5 FETCH (FLAGS (\Seen) UID 5)
- 6 FETCH (FLAGS (\Seen) UID 6)
- 7 FETCH (FLAGS (\Seen) UID 7)
- 8 FETCH (FLAGS (\Seen) UID 8)
- 9 FETCH (FLAGS (\Seen) UID 9)
- 10 FETCH (FLAGS (\Seen) UID 10)
- 11 FETCH (FLAGS (\Seen) UID 11)
- 12 FETCH (FLAGS (\Seen) UID 12)
- 13 FETCH (FLAGS (\Seen) UID 13)
- 14 FETCH (FLAGS (\Seen) UID 14)
- 15 FETCH (FLAGS (\Seen) UID 15)
- 16 FETCH (FLAGS (\Seen) UID 16)
- 17 FETCH (FLAGS (\Seen) UID 17)
- 18 FETCH (FLAGS (\Seen) UID 18)
- 19 FETCH (FLAGS (\Seen) UID 19)
- 20 FETCH (FLAGS (\Seen) UID 20)
- 21 FETCH (FLAGS (\Seen) UID 21)
- 22 FETCH (FLAGS (\Seen) UID 22)
- 23 FETCH (FLAGS (\Seen) UID 23) [-----elided-----]
- 20459 FETCH (FLAGS (\Seen) UID 20483)
- 20460 FETCH (FLAGS (\Seen) UID 20484)
- 20461 FETCH (FLAGS (\Seen) UID 20485)
- 20462 FETCH (FLAGS (\Seen) UID 20486) 4 OK Fetch completed.
- idling 5 OK Idle completed. 6 OK Close completed.
- BYE Logging out 7 OK Logout completed.
-- Daniel Price - Solaris Kernel Engineering - dp@eng.sun.com - blogs.sun.com/dp