Re: Indexing Mail faster
Dear Peter,
Noted. Thanks for your input, I appreciate it. At this point my most urgent priority is to get FTS working. Waiting 3 minutes for a body search is bad (though of course my mailbox is large). I need to have this sorted out today, as I have been putting it off for too long, mostly because of the lack of troubleshooting documentation online (if I do figure this out, I will write some for guidance).
Regards,
Kevin
On Thu, Jan 29, 2015 at 1:18 PM, Peter Hodur petehodur@gmail.com wrote:
On Thursday, January 29, 2015, Kevin Laurie superinterstellar@gmail.com wrote:
Dear Peter, Oh. Sorry (I didn't know you were addressing someone else), my apologies. But as you can see, I am desperately trying to address this issue.
No problem ;))) I wrote because my search results are good, but not as good as what someone reported here a couple of days ago.
I have 1 disk with 200GB running on a VPS. The file system is ext4.
I'm not a Dovecot expert, but I think this is your problem: you need more IOPS. If you need performance, a VPS may be adequate (if it is connected to a fast SAN), but basically it is not a good choice.
I think I will need to implement FTS to fix this as I will need body searches.
FTS could be a solution. You are right.
Have you tried FTS before?
I'm sorry, never :(
But maybe someone more skilled than me could answer the main question: how much does the Dovecot index help for an IMAP search against the body?
I'm not sure, but my opinion is that it helps little (maybe not at all).
PS: you may already do this, but do not forget to set up delivery from the mail server via LDA or LMTP instead of writing directly to the maildirs. This is better because Dovecot updates your indexes when the message arrives, so later accesses should be faster.
Peter
Regards Kevin
On Thu, Jan 29, 2015 at 12:55 PM, Peter Hodur petehodur@gmail.com wrote:
Kevin,
My message was not addressed to you ;) I wrote because my results are NOT as good as what someone reported here. I can full-text search approx. 8k messages in 7-8 seconds.
Someone wrote that he gets results for approx. 22k messages in 4 seconds :(
I'm not sure, but the answer may be in the index and the disk subsystem.
If, and only if, the Dovecot index does not store keywords from the BODY of messages, the problem is your/my disk subsystem.
Generally, messages from a mailing list like this are pretty small. The problem is that Dovecot must read all messages. If you have maildir, that means opening and reading many files.
But my test search was against the archive of my personal inbox, so not only small messages like this but also messages with big attachments etc.
If Dovecot indexes only headers, then for a full-text search it reads whole messages, since MIME allows the plain-text body to come after an attachment etc.
And here it is all about IOPS and throughput.
How many disks do you have? And in what setup?
Generally, the only good setup is many smaller disks in RAID 10 (striped mirrors) and, if your filesystem allows it, an added read cache; in the case of ZFS: L2ARC on SSD.
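To illustrate why an unindexed body search comes down to disk reads, here is a minimal Python sketch of a brute-force scan over a maildir. It is not how Dovecot is implemented; the function name `brute_force_body_search` is hypothetical, and decoding every text part as UTF-8 is a simplifying assumption.

```python
import mailbox
import time


def brute_force_body_search(maildir_path, needle):
    """Scan every message in a maildir for `needle` in any text part.

    Returns (matching keys, number of messages read, elapsed seconds).
    Illustration only: each message file is opened and parsed, which is
    the per-message I/O cost discussed above.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    needle = needle.lower()
    matches = []
    read = 0
    start = time.perf_counter()
    for key in box.iterkeys():
        msg = box.get_message(key)  # one file open + full read per message
        read += 1
        for part in msg.walk():  # the text part may come after attachments
            if part.get_content_maintype() != "text":
                continue
            payload = part.get_payload(decode=True) or b""
            # Assumption: treat all text as UTF-8, replacing invalid bytes.
            text = payload.decode("utf-8", errors="replace").lower()
            if needle in text:
                matches.append(key)
                break
    return matches, read, time.perf_counter() - start
```

The number of messages read always equals the mailbox size, regardless of how many messages match, which is why the cost grows with the mailbox and not with the result set.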
Pete
On Thursday, January 29, 2015, Kevin Laurie superinterstellar@gmail.com wrote:
Hi Peter,
Sorry, I think it's maildir. The output of my dovecot -n is listed below:
# 2.2.9: /etc/dovecot/dovecot.conf
# OS: Linux 3.10.62-xenU-25-0e6777a-x86_64 x86_64 Ubuntu 14.04.1 LTS
auth_master_user_separator = *
auth_mechanisms = PLAIN LOGIN
dict {
  acl = mysql:/etc/dovecot/dovecot-share-folder.conf
  quotadict = mysql:/etc/dovecot/dovecot-used-quota.conf
}
first_valid_uid = 2000
last_valid_uid = 2000
listen = *
log_path = /var/log/dovecot.log
mail_debug = yes
mail_gid = 2000
mail_location = maildir:/%Lh/Maildir/:INDEX=/%Lh/Maildir/
mail_plugins = quota fts
mail_uid = 2000
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave
namespace {
  inbox = yes
  location =
  mailbox Drafts {
    auto = subscribe
    special_use = \Drafts
  }
  mailbox Junk {
    auto = subscribe
    special_use = \Junk
  }
  mailbox Sent {
    auto = subscribe
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    auto = no
    special_use = \Sent
  }
  mailbox Spam {
    auto = no
    special_use = \Junk
  }
  mailbox Trash {
    auto = subscribe
    special_use = \Trash
  }
  prefix =
  separator = /
  type = private
}
namespace {
  list = children
  location = maildir:/%%Lh/Maildir/:INDEX=/%%Lh/Maildir/Shared/%%u
  prefix = Shared/%%u/
  separator = /
  subscriptions = yes
  type = shared
}
passdb {
  args = /etc/dovecot/dovecot-mysql.conf
  driver = sql
}
passdb {
  args = /etc/dovecot/dovecot-master-users-password
  driver = passwd-file
  master = yes
}
plugin {
  acl = vfile
  acl_shared_dict = proxy::acl
  auth_socket_path = /var/run/dovecot/auth-master
  quota = dict:user::proxy::quotadict
  quota_rule = *:storage=1G
  quota_warning = storage=85%% quota-warning 85 %u
  quota_warning2 = storage=90%% quota-warning 90 %u
  quota_warning3 = storage=95%% quota-warning 95 %u
  sieve = /%Lh/sieve/dovecot.sieve
  sieve_default = /var/vmail/sieve/dovecot.sieve
  sieve_dir = /%Lh/sieve
  sieve_global_dir = /var/vmail/sieve
}
protocols = pop3 imap sieve lmtp
service auth {
  unix_listener /var/spool/postfix/private/dovecot-auth {
    group = postfix
    mode = 0666
    user = postfix
  }
  unix_listener auth-master {
    group = vmail
    mode = 0666
    user = vmail
  }
  unix_listener auth-userdb {
    group = vmail
    mode = 0660
    user = vmail
  }
}
service dict {
  unix_listener dict {
    group = vmail
    mode = 0660
    user = vmail
  }
}
service imap-login {
  process_limit = 500
  service_count = 1
}
service lmtp {
  executable = lmtp -L
  inet_listener lmtp {
    port = 24
  }
  process_min_avail = 5
  unix_listener /var/spool/postfix/private/dovecot-lmtp {
    group = postfix
    mode = 0600
    user = postfix
  }
  user = vmail
}
service pop3-login {
  service_count = 1
}
service quota-warning {
  executable = script /usr/local/bin/dovecot-quota-warning.sh
  unix_listener quota-warning {
    group = vmail
    mode = 0660
    user = vmail
  }
}
ssl = required
ssl_cert =
On Thu, Jan 29, 2015 at 12:37 PM, Kevin Laurie superinterstellar@gmail.com wrote:
Dear Peter,
My inbox is MDA_external. Storage: 17GB of 24GB.
Subject / From / To searches are fast, but FTS (full-text search) on the body is horrible. I suppose this is where we need Apache Solr.
Do you think my mail storage format is bad? Do I need to change it for better performance? Please advise.
Kevin
On Thu, Jan 29, 2015 at 12:25 PM, Peter Hodur petehodur@gmail.com wrote:
> * Kevin Laurie superinterstellar@gmail.com 2015.01.24 19:41:
>
> > Currently the time it takes to search 25,000 mails is 4 mins. If indexed, how
> > much faster are we looking at?
>
> With a current version of Dovecot a search is pretty fast _without_ using
> external indexes. I have a view defined (virtual plugin) with around 22.000
> messages in it, and searching the full view only takes 2.5 seconds:
>
hmmm, could you please tell me more about your setup? What storage format do you use? Maildir or? What is the cumulative size of your messages?
My results without FTS, on a ZFS filesystem with SSD L2ARC, are not so good:
a4 select INBOX._OLD-OUTLOOK
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft $Junk $NotJunk JunkRecorded $Forwarded)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $Junk $NotJunk JunkRecorded $Forwarded \*)] Flags permitted.
* 7748 EXISTS
* 0 RECENT
* OK [UIDVALIDITY 1421198037] UIDs valid
* OK [UIDNEXT 11509] Predicted next UID
* OK [HIGHESTMODSEQ 12204] Highest
a4 OK [READ-WRITE] Select completed (0.001 secs).
a5 search charset utf-8 body "mall"
* SEARCH 2 49 101 117 158 171 185 192 197 202 207 223 228 234 236 240 249 279 280 281 287 288 289 290 297 321 327 337 344 351 360 370 373 385 389 390 391 398 405 413 424 444 458 463 470 474 480 482 505 513 520 530 531 532 533 543 559 560 561 562 563 566 588 593 597 625 630 639 644 656 671 672 677 692 720 723 734 738 741 745 752 755 757 765 775 777 784 791 818 820 821 833 855 863 864 868 881 896 910 917 922 926 928 931 991 996 997 998 1000 1010 1011 1012 1014 1018 1019 1026 1047 1068 1077 1095 1101 1105 1122 1136 1137 1140 1155 1160 1166 1171 1179 1180 1197 1208 1229 1239 1258 1263 1271 1282 1286 1290 1298 1319 1364 1365 1370 1386 1408 1410 1429 1463 1465 1470 1471 1494 1518 1522 1529 1530 1536 1541 1548 1571 1581 1585 1588 1594 1605 1606 1611 1612 1619 1620 1625 1652 1666 1667 1729 1730 1731 1732 1733 1734 1735 1781 1782 1817 1818 1897 1900 1921 1940 1946 1960 1972 1981 1995 1998 2002 2006 2028 2049 2057 2095 2100 2157 2168 2181 2185 2192 2203 2204 2207 2208 2210 2220 2225 2255 2273 2282 2283 2288 2289 2317 2320 2340 2367 2374 2377 2378 2379 2384 2389 2402 2409 2436 2459 2475 2476 2488 2504 2519 2538 2539 2551 2566 2572 2597 2599 2603 2617 2629 2664 2698 2716 2731 2733 2753 2754 2780 2805 2808 2815 2818 2850 2861 2862 2867 2886 2896 2900 2914 2931 2936 2938 2939 2950 2969 2990 3017 3019 3062 3075 3094 3101 3115 3138 3159 3161 3178 3185 3190 3204 3217 3218 3248 3263 3265 3266 3273 3282 3288 3295 3386 3428 3453 3476 3478 3479 3511 3548 3606 3629 3693 3694 3737 3793 3799 3801 3808 3812 3814 3815 3834 3849 3860 3862 3880 3910 3917 3930 3932 3952 3953 3954 3957 3959 3968 3971 3973 3978 3979 3980 4008 4022 4040 4057 4058 4059 4063 4064 4066 4069 4070 4075 4096 4112 4131 4132 4133 4141 4143 4144 4145 4146 4147 4167 4174 4199 4201 4202 4203 4206 4211 4217 4218 4226 4229 4258 4259 4267 4287 4357 4359 4363 4364 4365 4367 4390 4391 4462 4475 4497 4502 4538 4540 4552 4557 4558 4561 4563 4567 4571 4572 4573 4575 4577 4593 4594 4604 4611 4619 4628 4638 4639 4662 4672 4678 4679 4692 4696 4785 4786 4787 4788 4789 4792 4793 4794 4802 4817 4818 4819 4820 4836 4857 4874 4887 4901 4905 4906 4907 4908 4911 4925 4928 4940 4941 4953 5060 5103 5116 5118 5129 5131 5136 5158 5163 5182 5184 5203 5212 5216 5269 5270 5271 5272 5273 5276 5277 5278 5286 5301 5302 5303 5304 5320 5341 5358 5371 5385 5389 5390 5391 5392 5395 5409 5412 5424 5425 5437 5544 5587 5600 5602 5613 5615 5620 5642 5647 5666 5668 5687 5696 5700 5736 5740 5749 5764 5783 5809 5814 5853 5866 5867 5877 5888 5895 5896 5897 5898 5899 5908 5910 5911 5912 5939 5950 5958 5990 6000 6059 6074 6095 6097 6112 6137 6141 6189 6193 6212 6228 6229 6233 6271 6273 6275 6285 6310 6317 6335 6383 6384 6397 6427 6430 6459 6463 6482 6492 6506 6565 6585 6620 6670 6673 6675 6705 6715 6716 6741 6812 6826 6852 6859 6895 6896 6907 6913 6919 6935 6943 6948 6979 7023 7025 7035 7039 7042 7108 7131 7145 7163 7171 7172 7194 7198 7199 7203 7256 7257 7294 7303 7317 7322 7343 7344 7347 7348 7352 7386 7390 7391 7392 7393 7407 7408 7409 7417 7418 7419 7420 7421 7426 7432 7437 7462 7467 7468 7473 7474 7475 7488 7502 7503 7558 7588 7589 7628 7685 7695 7699 7703 7723
a5 OK Search completed (7.846 secs).
Searching against "subject" is pretty fast, a few milliseconds ...
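For comparison, the timings quoted in this thread can be turned into rough message rates. This is a minimal Python sketch; the dictionary keys are illustrative labels only, and the rates ignore message sizes, attachments and cache state, so they are not directly comparable benchmarks.

```python
# Figures quoted in this thread: (messages searched, seconds taken).
reports = {
    "kevin": (25_000, 4 * 60),      # 25,000 mails in about 4 minutes
    "peter": (7_748, 7.846),        # 7748 messages, body search in 7.846 s
    "virtual_view": (22_000, 2.5),  # around 22,000 messages in 2.5 s
}


def messages_per_second(count, seconds):
    """Rough search throughput in messages per second."""
    return count / seconds


rates = {name: messages_per_second(*fig) for name, fig in reports.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} messages/s")
```

The three setups differ by nearly two orders of magnitude in messages per second, which is consistent with the storage subsystem, and not the mailbox size alone, dominating unindexed body searches.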