[Dovecot] Difference between Indexing and Rescan in FTS
I've had squat running on dovecot 2.0 and have been updating all users' mailbox indexes nightly via cron with this command:
doveadm -v search -A text xyzzyx
I've just updated to 2.1 and I'm migrating to lucene indexes, but reading the documentation I'm having a hard time understanding the semantic differences between indexing and rescanning.
If I were to continue to run an all user all mailbox index every night, would that be appropriate?
Would running this every night avoid the need to ever rescan?
Should I run rescan instead of index?
Should I run both rescan and index? In which order?
Best Regards,
FredK
On 10/16/2012 6:14 PM, Fred Kilbourn wrote:
> I've had squat running on dovecot 2.0 and have been updating all users' mailbox indexes nightly via cron with this command:
> doveadm -v search -A text xyzzyx
> I've just updated to 2.1 and I'm migrating to lucene indexes, but reading the documentation I'm having a hard time understanding the semantic differences between indexing and rescanning.
> If I were to continue to run an all user all mailbox index every night, would that be appropriate?
> Would running this every night avoid the need to ever rescan?
There are 2 sets of indexes:
- dovecot indexes
- FTS indexes
Performing the cron search will update the FTS indexes, although you should read up on 2.1's doveadm index command. The dovecot indexes keep track of the FTS indexes, and the two sets should stay in sync unless there is corruption or the indexes are changed outside of dovecot. If they do lose track of each other, you can do a rescan to sync them back up.
Jack
-----Original Message----- From: dovecot-bounces@dovecot.org [mailto:dovecot-bounces@dovecot.org] On Behalf Of Jack Bates Sent: Tuesday, October 16, 2012 9:44 PM To: dovecot@dovecot.org Subject: Re: [Dovecot] Difference between Indexing and Rescan in FTS
Thanks Jack,
So here are my takeaways, let me know if I'm wrong:
- The FTS index is the actual search data
- The dovecot index holds, among other information, which messages are indexed by FTS
- The FTS index still doesn't update automatically, so my nightly cronjob should keep it in order
- The dovecot index should stay in order under normal circumstances, and issuing a rescan command shouldn't be needed unless something bad happens
Assuming my understanding above is correct, here are some follow-up questions to clarify my original ones:
- As a system administrator, what signs should I look for that a rescan is needed (aside from user complaints)?
- What exact impact does running the rescan command have?
- Is it worthwhile to rescan periodically as a maintenance task?
- Or does rescanning reset all FTS indexing that has been done, causing it to have to be done again from scratch?
And, I did catch the revision in the user docs for updating indexes. I plan on updating my maintenance script accordingly.
Thanks, Fred
On 17.10.2012, at 2.14, Fred Kilbourn wrote:
> I've had squat running on dovecot 2.0 and have been updating all users' mailbox indexes nightly via cron with this command:
> doveadm -v search -A text xyzzyx
doveadm index is a bit more efficient.
> I've just updated to 2.1 and I'm migrating to lucene indexes, but reading the documentation I'm having a hard time understanding the semantic differences between indexing and rescanning.
doveadm fts rescan makes sure that 1) all of the old messages are indexed and 2) there are no extra (already deleted) messages indexed. So it's basically repairing the FTS index. You probably shouldn't run it automatically, or at least not very often.
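For readers who want the repair step written out, here is a minimal sketch of an on-demand rescan for a single user, followed by an index run. It assumes, as the doveadm-fts documentation describes, that the index run after a rescan picks up whatever the rescan found missing. The names repair_fts, run, DOVEADM and DRY_RUN are made up for this example, and DRY_RUN defaults to 1 so the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: repair one user's FTS index by hand, not from cron.
# DOVEADM and DRY_RUN are hypothetical knobs for this example.
DOVEADM="${DOVEADM:-doveadm}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# repair_fts USER [MAILBOX]
# Rescan the user's FTS index, then index MAILBOX (default: INBOX).
repair_fts() {
    user="$1"
    mailbox="${2:-INBOX}"
    run "$DOVEADM" fts rescan -u "$user" &&
    run "$DOVEADM" index -u "$user" "$mailbox"
}
```

Calling repair_fts fred@example.com prints the two doveadm commands. Setting DRY_RUN=0 would actually run them, which is probably only worth doing when searches for that user return stale or missing results.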
-----Original Message----- From: dovecot-bounces@dovecot.org [mailto:dovecot-bounces@dovecot.org] On Behalf Of Timo Sirainen Sent: Tuesday, October 16, 2012 10:16 PM To: Fred Kilbourn Cc: dovecot@dovecot.org Subject: Re: [Dovecot] Difference between Indexing and Rescan in FTS
Okay, you've clarified that for me.
I understand that rescan isn't a nightly task, but it could still be run periodically. How often might be appropriate if I wanted to do this as a maintenance task? Once a month?
Lastly, I'm trying to use the index command instead of the search command, but I can't figure out how to make it index every mailbox for every user. Is there a wildcard that can be used for the mailbox? Or do I need to list all the mailboxes with one command and then run index once for each mailbox?
Thanks for your help
On 17.10.2012, at 9.26, Fred Kilbourn wrote:
> Okay, you've clarified that for me.
> I understand that rescan isn't a nightly task, but it could still be run periodically. How often might be appropriate if I wanted to do this as a maintenance task? Once a month?
I don't know, depends on if you have problems related to it. I think the most common answer would be "never".
> Lastly, I'm trying to use the index command instead of the search command, but I can't figure out how to make it index every mailbox for every user. Is there a wildcard that can be used for the mailbox? Or do I need to list all the mailboxes with one command and then run index once for each mailbox?
doveadm index '*' works in new versions. I don't remember from which version.
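As a rough illustration of the two approaches discussed above, the sketch below shows a nightly job using the '*' mailbox mask, plus a fallback that indexes one mailbox at a time for versions where the mask is not supported. Which versions accept '*' is not stated in this thread, so treat that as something to verify locally. The names index_all, index_listed_mailboxes, run, DOVEADM and DRY_RUN are invented for this example, and DRY_RUN defaults to 1 so nothing is executed unless you opt in.

```shell
#!/bin/sh
# Sketch only: nightly FTS indexing for all users.
# DOVEADM and DRY_RUN are hypothetical knobs for this example.
DOVEADM="${DOVEADM:-doveadm}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Newer versions: one call covers every mailbox of every user.
index_all() {
    run "$DOVEADM" index -A '*'
}

# Fallback: read mailbox names on stdin (one per line) and index
# each of them for the given user.
index_listed_mailboxes() {
    user="$1"
    while IFS= read -r mailbox; do
        # Keep the indexing command from consuming the mailbox list.
        run "$DOVEADM" index -u "$user" "$mailbox" </dev/null
    done
}
```

With the fallback, the mailbox names could come from something like doveadm mailbox list -u fred | index_listed_mailboxes fred, repeated for each user. If the '*' mask works on your version, index_all alone should be enough for the cron job.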
participants (3)
- Fred Kilbourn
- Jack Bates
- Timo Sirainen