[Dovecot] dovecot fts_lucene indexes not being updated
Hi everyone,
I have a dovecot 2.1.6 setup with fts_lucene.
The lucene indexes get created/updated fine when I do a manual search thru telnet.
However, if I run a webapp that uses the php::imap_search function, the index never gets created/updated. I run into the same problem with Thunderbird and full-body search.
Does anyone have any insight into this?
Regards,
-Joe
Sorry for double-posting, but I forgot to mention something:
If I run php::imap_search() AFTER having manually updated the lucene indexes (thru a telnet command), I see a HUGE gain in search performance compared to before creating the lucene indexes (at least 1 order of magnitude).
-Joe
On Sat, 2012-05-19 at 13:19 -0400, Joe Beaubien wrote:
I have a dovecot 2.1.6 setup with fts_lucene.
The lucene indexes get created/updated fine when I do a manual search thru telnet.
So you run SEARCH TEXT or SEARCH BODY?
However, if I run a webapp that uses the php::imap_search function, the index never gets created/updated.
What IMAP command is this search sending? You could check by e.g. running rawlog: http://wiki2.dovecot.org/Debugging/Rawlog
I run into the same problem with thunderbird and full-body search.
Thunderbird probably isn't using IMAP SEARCH at all, but using its own local indexes. Although this is configurable I think.
See answers inline.
I just want to add that I also tried the parameter "fts_index_timeout = 10" in the plugin section. Unfortunately, the logs never showed any indexing taking place. I also tried "fts_index_timeout = 10s" in case I had the wrong syntax.
I also tried "doveadm fts rescan -u my_user". The command returned instantly and nothing appeared in the logs to indicate that a rescan happened.
Thanks,
-Joe
On Sat, May 19, 2012 at 2:21 PM, Timo Sirainen tss@iki.fi wrote:
On Sat, 2012-05-19 at 13:19 -0400, Joe Beaubien wrote:
I have a dovecot 2.1.6 setup with fts_lucene.
The lucene indexes get created/updated fine when I do a manual search thru telnet.
So you run SEARCH TEXT or SEARCH BODY?
Yes, with telnet I send a SEARCH TEXT command: 4 search text "test"
However, if I run a webapp that uses the php::imap_search function, the index never gets created/updated.
What IMAP command is this search sending? You could check by e.g. running rawlog: http://wiki2.dovecot.org/Debugging/Rawlog
Ok, I enabled rawlog and noticed the code was only sending "SEARCH FROM". I modified it to also send a "SEARCH TEXT" and it triggered the fts index update.
However, in my use case, that's not the search that needs to be done. The search needed is "SEARCH FROM". It's not practical (or efficient I imagine) to send a "SEARCH TEXT" for every folder before sending a "SEARCH FROM", just to be sure the indexes are up-to-date.
If I could just update the entire account (all folders) right after I download new emails, that would be the least painful (I think). I thought "fts_index_timeout" or "doveadm fts rescan -u my_user" would help me achieve this, but neither option ever seems to trigger an update of the fts indexes. Any idea if I did something wrong there?
I run into the same problem with thunderbird and full-body search.
Thunderbird probably isn't using IMAP SEARCH at all, but using its own local indexes. Although this is configurable I think.
You are right, let's forget about thunderbird.
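For anyone who wants to reproduce the rawlog finding above without telnet, here is a minimal Python sketch using the standard imaplib module. It sends the two searches discussed here, SEARCH FROM and SEARCH TEXT. The host, user, password and mailbox names in demo() are hypothetical placeholders, and the sketch says nothing about how php::imap_search builds its commands; it only mirrors what was typed by hand in this thread.

```python
import imaplib


def search_args(key, value):
    """Build the criteria for a single-key IMAP SEARCH, e.g. ('TEXT', '"test"')."""
    # Escape backslashes and double quotes so the value is a valid quoted string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return (key.upper(), f'"{escaped}"')


def run_search(conn, key, value):
    """Send SEARCH <key> "<value>" on the selected mailbox; return message numbers."""
    status, data = conn.search(None, *search_args(key, value))
    if status != "OK":
        raise RuntimeError(f"SEARCH failed: {status}")
    return data[0].split() if data and data[0] else []


def demo(host, user, password, mailbox):
    """Run both searches from the thread. All four arguments are placeholders."""
    conn = imaplib.IMAP4(host)
    try:
        conn.login(user, password)
        conn.select(mailbox, readonly=True)
        # In this thread, SEARCH FROM did not trigger an fts index update ...
        by_sender = run_search(conn, "FROM", "joe")
        # ... while SEARCH TEXT did.
        by_text = run_search(conn, "TEXT", "test")
    finally:
        conn.logout()
    return by_sender, by_text
```

Calling demo("localhost", "form", "secret", "INBOX2") with rawlog enabled should show which SEARCH commands reach the server, which makes it easy to compare against what the webapp sends.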
On 19.5.2012, at 23.40, Joe Beaubien wrote:
Ok, I enabled rawlog and noticed the code was only sending "SEARCH FROM". I modified it to also send a "SEARCH TEXT" and it triggered the fts index update.
However, in my use case, that's not the search that needs to be done. The search needed is "SEARCH FROM". It's not practical (or efficient I imagine) to send a "SEARCH TEXT" for every folder before sending a "SEARCH FROM", just to be sure the indexes are up-to-date.
SEARCH FROM doesn't update the Lucene index, because it can usually be looked up quite fast also from dovecot.index.cache file. Of course if you're not delivering mails via dovecot-lda/lmtp that doesn't get updated.
If I could just update the entire account (all folders) right after I download new emails, that would be the least painful (I think). I thought "fts_index_timeout" or "doveadm fts rescan -u my_user" would help me achieve this, but neither option ever seems to trigger an update of the fts indexes. Any idea if I did something wrong there?
You can run "doveadm index -u user" to get new mails indexed.
Answers inline.
On Sat, May 19, 2012 at 4:47 PM, Timo Sirainen tss@iki.fi wrote:
You can run "doveadm index -u user" to get new mails indexed.
Awesome, this does seem to work.
2 last questions:
Does it update both indexes (dovecot and fts) or only dovecot.index.cache? I ask because I didn't see any index messages in log files.
Is there a way to update the index of the entire email account instead of doing it for each folder (mailbox)?
Thanks a lot for your awesome support.
On 20.5.2012, at 0.12, Joe Beaubien wrote:
You can run "doveadm index -u user" to get new mails indexed.
Awesome, this does seem to work.
2 last questions:
- Does it update both indexes (dovecot and fts) or only dovecot.index.cache? I ask because I didn't see any index messages in log files.
It should update both. If it doesn't then there's some kind of a configuration problem.
- Is there a way to update the index of the entire email account instead of doing it for each folder (mailbox)?
You should be able to use '*' as the mailbox name.
On Sat, May 19, 2012 at 6:16 PM, Timo Sirainen tss@iki.fi wrote:
It should update both. If it doesn't then there's some kind of a configuration problem.
How can I verify this? When I run "sudo ./doveadm -v index -u form INBOX2", I see nothing in the normal log files. I do see the expected output on stdout:
doveadm(form): Info: INBOX2: Caching mails seq=41808..41822 15/15
joe@XXXXXX:/opt/dovecot/bin$
However, I see nothing in the normal logs. I normally see a message similar to "Indexed 500 new messages" when I force an fts update but I get nothing in this case.
My config for plugins is pretty simple:
10-mail.conf:
  mail_plugins = zlib fts fts_lucene
90-plugin.conf:
  plugin {
    zlib_save_level = 6 # 1..9
    zlib_save = gz # or bz2
    fts = lucene
    fts_lucene = whitespace_chars="@.-_()[]{}<>/\\+"
  }
Is it normal to not get output in the normal log files in this case, or did I miss something?
- Is there a way to update the index of the entire email account instead of doing it for each folder (mailbox)?
You should be able to use '*' as the mailbox name.
I think there is an issue with the '*' wildcard. It doesn't seem to do anything: it returns instantly and doesn't give any message or error, even in verbose mode. However, when I specify a folder, it works as expected:
joe@XXXX:/opt/dovecot/bin$ sudo ./doveadm -v index -u form *
joe@XXXX:/opt/dovecot/bin$ sudo ./doveadm -v index -u form form_positif
doveadm(form): Info: form_positif: Caching mails seq=130044..130095 52/52
joe@XXXX:/opt/dovecot/bin$
Your help is greatly appreciated.
-Joe
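One possible explanation for the silent '*' run, which the thread itself does not confirm: the asterisk in the command above is unquoted, so the shell replaces it with the file names in the current directory (here /opt/dovecot/bin) before doveadm ever starts. doveadm would then receive those file names as mailbox names instead of a wildcard. The Python sketch below demonstrates only the shell behaviour, using a temporary directory with two empty stand-in files; it does not run doveadm and assumes a POSIX /bin/sh.

```python
import os
import subprocess
import tempfile


def shell_expand(args, cwd):
    """Return the arguments a program receives after /bin/sh has processed `args`."""
    result = subprocess.run(
        "printf '%s\\n' " + args,  # printf prints each received argument on its own line
        shell=True,
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


with tempfile.TemporaryDirectory() as bindir:
    # Empty stand-ins for binaries that sit next to doveadm (names are illustrative).
    for name in ("doveadm", "doveconf"):
        open(os.path.join(bindir, name), "w").close()

    unquoted = shell_expand("*", bindir)   # the shell substitutes the file names
    quoted = shell_expand("'*'", bindir)   # quoting keeps the literal asterisk

print(unquoted)
print(quoted)
```

If this is what happened, quoting the wildcard (doveadm -v index -u form '*') would pass the asterisk through to doveadm unchanged.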
On Sat, May 19, 2012 at 10:16 PM, Timo Sirainen tss@iki.fi wrote:
On 20.5.2012, at 5.15, Joe Beaubien wrote:
Is it normal to not get output in the normal log files in this case, or did I miss something?
Before looking into it further, what's your doveconf -n output?
Here is the output:
joe@XXXXXX:/opt/dovecot/bin$ ./doveconf -n
# 2.1.6: /opt/dovecot-2.1.6-lucene/etc/dovecot/dovecot.conf
# OS: Linux 3.2.0-24-generic x86_64 Ubuntu 12.04 LTS ext4
auth_username_format = %Ln
disable_plaintext_auth = no
listen = *
mail_fsync = never
mail_location = mdbox:/data/emails/%u
mail_plugin_dir = /opt/dovecot/lib/dovecot
mail_plugins = zlib fts fts_lucene
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date ihave
mdbox_rotate_size = 20 M
namespace inbox {
  inbox = yes
  location =
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
  separator = /
}
passdb {
  driver = shadow
}
plugin {
  fts = lucene
  fts_lucene = whitespace_chars="@.-_()[]{}<>/\\+"
  sieve = /data/emails/sieve-scripts/%u.sieve
  zlib_save = gz
  zlib_save_level = 6
}
postmaster_address = info@XXXXXXXXXXX.com
protocols = imap lmtp
service lmtp {
  unix_listener /var/spool/postfix/private/dovecot-lmtp {
    group = postfix
    mode = 0660
    user = postfix
  }
}
ssl = no
userdb {
  driver = passwd
}
protocol lmtp {
  mail_fsync = optimized
  mail_plugins = zlib fts fts_lucene sieve
}
protocol lda {
  mail_fsync = optimized
}
protocol imap {
  imap_idle_notify_interval = 10 mins
}
joe@XXXXX:/opt/dovecot/bin$
I can also add that I had "top" running while doing a "doveadm index" on 800 emails; doveadm took about 15 seconds to complete and in top I never saw lucene-indexer (or lucene-worker) appear.
-Joe
On 21.5.2012, at 17.12, Joe Beaubien wrote:
I can also add that I had "top" running while doing a "doveadm index" on 800 emails; doveadm took about 15 seconds to complete and in top I never saw lucene-indexer (or lucene-worker) appear.
If you run "doveadm index", the doveadm itself is doing all the work. This isn't necessarily recommended, since if another indexer is running for the user it may cause problems. Also this is probably why you're not seeing a log message about how many messages were indexed.
If you run "doveadm index -q", you should have indexer-worker process and the log message.
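To tie this back to the original goal of indexing right after new mail is downloaded, here is a small Python sketch that builds and runs the "doveadm index -q" command mentioned above. The function names, the default mailbox mask and the idea of calling this from a fetch script are assumptions made for illustration; only the index subcommand, -q, -u and the '*' mailbox name come from this thread, and the thread does not confirm that '*' covers every folder. Passing the arguments as a list avoids the shell, so the asterisk reaches doveadm unmodified.

```python
import subprocess


def doveadm_index_argv(user, mailbox_mask="*", queue=True, doveadm="doveadm"):
    """Build the argument list for `doveadm index`.

    queue=True adds -q, which per the thread hands the work to an
    indexer-worker process instead of doing it inside doveadm itself.
    """
    argv = [doveadm, "index"]
    if queue:
        argv.append("-q")
    argv += ["-u", user, mailbox_mask]
    return argv


def index_user(user, **kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(doveadm_index_argv(user, **kwargs), check=True)
```

For the setup in this thread the call might look like index_user("form", doveadm="/opt/dovecot/bin/doveadm"), run with whatever privileges doveadm needs on that system.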
participants (2): Joe Beaubien, Timo Sirainen