[Dovecot] Possible sort optimization (?)
Maybe this is just noise... but I can reproduce this fairly reliably.
Mailbox with 21,000+ messages
This query:
a UID SORT RETURN (ALL COUNT) (DATE) UTF-8 SUBJECT "foo"
is always about 10 percent slower than this split query (I've done
this 4-5 times, and the numbers are similar):
a UID SEARCH RETURN (SAVE) CHARSET UTF-8 SUBJECT "foo"
b UID SORT RETURN (ALL COUNT) (DATE) UTF-8 UID $
(The particular query I used matched 5 messages out of the 21,000+)
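
The split form follows mechanically from the single command: the search criteria move into a UID SEARCH RETURN (SAVE), and the sort then runs over the saved result "$". A minimal Python sketch that only builds the two command variants as strings, so they can be pasted into a PREAUTH session. The function names are hypothetical and nothing is sent to a server:

```python
def single_sort(criteria, sort_key="DATE", charset="UTF-8"):
    """Build the one-command form: sort and search in a single UID SORT."""
    return [f"a UID SORT RETURN (ALL COUNT) ({sort_key}) {charset} {criteria}"]


def split_sort(criteria, sort_key="DATE", charset="UTF-8"):
    """Build the two-command form: save the search result, then sort over $."""
    return [
        f"a UID SEARCH RETURN (SAVE) CHARSET {charset} {criteria}",
        f"b UID SORT RETURN (ALL COUNT) ({sort_key}) {charset} UID $",
    ]


if __name__ == "__main__":
    for line in single_sort('SUBJECT "foo"') + split_sort('SUBJECT "foo"'):
        print(line)
```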
My not-very-scientific benchmarking process:
1.) Stop dovecot process
2.) Delete all dovecot index files for that mailbox
3.) Flush linux paging cache (sync && echo 3 > /proc/sys/vm/drop_caches)
4.) Restart dovecot
5.) Access dovecot via command-line (PREAUTH)
6.) SELECT mailbox
7.) Issue command(s)
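
Steps 1 to 4 above could be scripted roughly as follows. This is only a sketch that assembles the shell commands without running them. The stop and start commands ("doveadm stop", "dovecot"), the index directory path, and the "dovecot.index*" file pattern are assumptions that depend on the local setup; only the cache-flush command is taken from the list above. Steps 5 to 7 stay manual.

```python
import shlex


def benchmark_steps(index_dir, stop_cmd="doveadm stop", start_cmd="dovecot"):
    """Return the shell commands for steps 1-4 as strings, without running them.

    index_dir, stop_cmd and start_cmd are assumptions; adjust to the local setup.
    """
    quoted_dir = shlex.quote(index_dir)
    return [
        stop_cmd,                                     # 1.) stop dovecot
        f"rm -f {quoted_dir}/dovecot.index*",         # 2.) delete the mailbox index files
        "sync && echo 3 > /proc/sys/vm/drop_caches",  # 3.) flush page cache (needs root)
        start_cmd,                                    # 4.) restart dovecot
    ]


if __name__ == "__main__":
    # Hypothetical path; print the plan for review instead of executing it.
    for step in benchmark_steps("/path/to/mailbox/index/dir"):
        print(step)
```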
Could be a potential area for performance improvement or could simply
be lazy benchmarking.
michael
On 6.2.2013, at 1.02, Michael M Slusarz <slusarz@curecanti.org> wrote:
> a UID SORT RETURN (ALL COUNT) (DATE) UTF-8 SUBJECT "foo"
> is always about 10 percent slower than this split query (I've done this 4-5 times, and the numbers are similar):
> a UID SEARCH RETURN (SAVE) CHARSET UTF-8 SUBJECT "foo"
> b UID SORT RETURN (ALL COUNT) (DATE) UTF-8 UID $
> (The particular query I used matched 5 messages out of the 21,000+)
I think the main difference is that the first command also fetches the Date: header from dovecot.index.cache. Did you check whether the slowness was due to additional userspace CPU usage (instead of disk I/O)?
See if the attached patch makes a difference?
One possible solution would be to do the prefetching when the search program is "all", but not otherwise. But if most of the messages match the search query, then this is slower.
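
As a toy model of that heuristic (not Dovecot's code; the function names are hypothetical, and it assumes that prefetching means looking up the Date: header for every message in the mailbox, while not prefetching means looking it up only for messages that match):

```python
def should_prefetch(search_is_all):
    """Proposed rule: prefetch sort keys only when the search program is "all"."""
    return search_is_all


def date_lookups(total_messages, matched_messages, prefetch):
    """Count Date: lookups: every message when prefetching, else only matches."""
    return total_messages if prefetch else matched_messages


if __name__ == "__main__":
    # Illustrative numbers in the spirit of the report: 5 matches out of ~21,000.
    total, matched = 21000, 5
    print("always prefetch:", date_lookups(total, matched, True))
    print("proposed rule:  ", date_lookups(total, matched, should_prefetch(False)))
```

The model only counts lookups. It says nothing about the per-lookup cost, which is why a query that matches most messages can still be slower without prefetching.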
Quoting Timo Sirainen <tss@iki.fi>:
> On 6.2.2013, at 1.02, Michael M Slusarz <slusarz@curecanti.org> wrote:
>> a UID SORT RETURN (ALL COUNT) (DATE) UTF-8 SUBJECT "foo"
>> is always about 10 percent slower than this split query (I've done
>> this 4-5 times, and the numbers are similar):
>> a UID SEARCH RETURN (SAVE) CHARSET UTF-8 SUBJECT "foo"
>> b UID SORT RETURN (ALL COUNT) (DATE) UTF-8 UID $
>> (The particular query I used matched 5 messages out of the 21,000+)
> I think the main difference is that the first command also fetches the
> Date: header from dovecot.index.cache. Did you check whether the slowness
> was due to additional userspace CPU usage (instead of disk I/O)?
> See if the attached patch makes a difference?
Without patch - single query (time output):
3.126u 2.763s 1:54.87 5.1% 0+0k 491192+19016io 11pf+0w
3.236u 2.663s 2:14.62 4.3% 0+0k 491064+18616io 9pf+0w
With patch - single query:
2.909u 2.689s 2:20.58 3.9% 0+0k 491064+15816io 11pf+0w
2.989u 2.673s 2:07.51 4.4% 0+0k 491056+15720io 11pf+0w
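Userspace CPU usage dropped slightly, and I/O dropped significantly (the output count, i.e. the second number in the "io" field; the input count is essentially unchanged).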
(FYI: Split query with patch):
2.806u 2.696s 2:27.17 3.7% 0+0k 491200+11848io 11pf+0w
2.929u 2.626s 2:23.37 3.8% 0+0k 491064+11856io 11pf+0w
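
To compare the runs above without eyeballing them, the time output can be parsed and averaged. A minimal sketch, assuming the lines are in the csh/tcsh default time format (user CPU, system CPU, elapsed, CPU percent, memory, input+output operations, page faults+swaps). The function names and the regular expression are illustrative, not part of any tool:

```python
import re
from statistics import mean

# Matches e.g. "3.126u 2.763s 1:54.87 5.1% 0+0k 491192+19016io 11pf+0w"
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)u (?P<sys>[\d.]+)s (?P<elapsed>[\d:.]+) [\d.]+% "
    r"\d+\+\d+k (?P<inp>\d+)\+(?P<out>\d+)io"
)


def parse_time(line):
    """Parse one time line into user/sys CPU seconds, elapsed seconds, and I/O counts."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized time output: {line!r}")
    elapsed = 0.0
    for part in m["elapsed"].split(":"):  # handles m:ss.ss and h:mm:ss
        elapsed = elapsed * 60 + float(part)
    return {
        "user": float(m["user"]),
        "sys": float(m["sys"]),
        "elapsed": elapsed,
        "inp": int(m["inp"]),
        "out": int(m["out"]),
    }


def summarize(lines):
    """Average user CPU seconds and output operations over several runs."""
    runs = [parse_time(line) for line in lines]
    return {"user": mean(r["user"] for r in runs), "out": mean(r["out"] for r in runs)}


RUNS = {
    "single, no patch": [
        "3.126u 2.763s 1:54.87 5.1% 0+0k 491192+19016io 11pf+0w",
        "3.236u 2.663s 2:14.62 4.3% 0+0k 491064+18616io 9pf+0w",
    ],
    "single, patch": [
        "2.909u 2.689s 2:20.58 3.9% 0+0k 491064+15816io 11pf+0w",
        "2.989u 2.673s 2:07.51 4.4% 0+0k 491056+15720io 11pf+0w",
    ],
    "split, patch": [
        "2.806u 2.696s 2:27.17 3.7% 0+0k 491200+11848io 11pf+0w",
        "2.929u 2.626s 2:23.37 3.8% 0+0k 491064+11856io 11pf+0w",
    ],
}

if __name__ == "__main__":
    for name, lines in RUNS.items():
        print(name, summarize(lines))
```

With only two runs per configuration, the averages are indicative at best.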
> One possible solution would be to do the prefetching when the search
> program is "all", but not otherwise. But if most of the messages
> match the search query, then this is slower.
Above results are from a search that matches 42 out of 26300 messages.
michael
participants (2)
- Michael M Slusarz
- Timo Sirainen