On Thu, 2009-04-02 at 13:03 -0600, Ryan wrote:
I've been reading about using Squat with Dovecot on large mailboxes. I was wondering if some of the memory usage during searching and indexing could be reduced by using Berkeley DB? I've used it in the past for something similar, and in BTREE mode it was very fast for millions of keys. I haven't had time to look at how Squat is implemented, but I thought I'd throw it out there...
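The problem that Squat tries to solve is substring searching. B-trees aren't very helpful there, since an ordered key index answers exact-match and prefix/range lookups, not matches in the middle of a key.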
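
To illustrate the limitation, here is a minimal Python sketch in which a sorted list stands in for B-tree keys. The function names and sample words are made up for the example, and none of this reflects how Squat or Berkeley DB is actually implemented.

```python
import bisect

def prefix_scan(sorted_keys, prefix):
    """Range scan over ordered keys: cheap, this is what a B-tree is good at."""
    start = bisect.bisect_left(sorted_keys, prefix)
    found = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are ordered, so no later key can match
        found.append(key)
    return found

def substring_scan(sorted_keys, needle):
    """Substring match: key ordering does not help, every key must be checked."""
    return [key for key in sorted_keys if needle in key]

# Hypothetical indexed words.
keys = sorted(["dovecot", "quota", "squat", "squirrel"])
```

With these keys, `prefix_scan(keys, "qu")` finds only "quota", while `substring_scan(keys, "qu")` also finds "squat" and "squirrel", but only by visiting every key.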
There is an alternative way to implement substring searches with B-trees: separately index every suffix of every indexed word. I fear that would hugely bloat the database, though, since each message has at least one unique string. Perhaps some kind of hybrid approach might work, where non-unique strings are indexed with B-trees and unique strings go into the current Squat-style index.
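
A rough Python sketch of the suffix-indexing idea, again with a sorted list standing in for the B-tree. The names `build_suffix_index` and `substring_search` and the sample words are hypothetical; this only shows the technique and its size cost, not anything Dovecot does.

```python
import bisect

def build_suffix_index(words):
    """Return sorted (suffix, word) pairs; each pair would be one B-tree entry."""
    entries = set()
    for word in words:
        for i in range(len(word)):
            entries.add((word[i:], word))
    return sorted(entries)

def substring_search(index, needle):
    """A substring of a word is a prefix of one of its suffixes,
    so a substring search becomes a prefix range scan over the suffixes."""
    start = bisect.bisect_left(index, (needle,))
    matches = set()
    for suffix, word in index[start:]:
        if not suffix.startswith(needle):
            break
        matches.add(word)
    return sorted(matches)

# Hypothetical indexed words.
words = ["dovecot", "squat", "quota"]
index = build_suffix_index(words)
```

Here `substring_search(index, "qu")` returns both "quota" and "squat", but the 3 words turned into 17 index entries. A word of length n adds n entries, which is where the bloat comes from when every message contributes long unique strings.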