[Dovecot] Search over big folder hierarchy (was: Global FTS index? )

Patrick Nagel patrick.nagel at star-group.net
Tue Jun 24 11:42:13 EEST 2008


On Tuesday 24 June 2008, Timo Sirainen wrote:
> Yes, they have to be selected. There isn't any way currently in IMAP to
> search from multiple mailboxes using a single command, so even if
> Dovecot implemented a Squat index that indexed mails from all mailboxes,
> you'd still have to implement a non-standard extension to use that.

I see. That's what I thought. :(

> Hmm. Or v1.2 has virtual mailboxes - you could create a single virtual
> mailbox from all your other mailboxes and then search it. I think if
> Squat is enabled it'll create a single index from all the mails. I'm not
> sure if I want to leave it like that though..

How about making it configurable? I'm sure there are scenarios where it's not 
desirable to have an index for each virtual mailbox (which sounds like a very 
cool concept, by the way) - but in a case like mine, it would be a great 
workaround :)

> I have also been thinking about making Squat indexes global for all
> mailboxes. If done well it should reduce disk space as well as enable
> fast multi-mailbox searches, but I'm a bit worried about memory usage
> and other slowness when updating the index. The Squat building/updating
> could use more work, but I haven't yet figured out a great solution for
> it.

I'm not sure if it would reduce disk space usage... I'm thinking of the 
following:

Now (fictitious, I don't know what dovecot.index.search really looks like):

mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder/dovecot.index.search:
INDEX	UID
word	12345
ord	12345
rd	12345
d	12345
...

Then (of course also fictitious):

dovecot.global.index.search:
INDEX	MAILBOX							UID
word	mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder	12345
ord	mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder	12345
rd	mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder	12345
d	mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder	12345
...

Of course this would be very compressible, but in an uncompressed form it 
would probably be much bigger than all the current dovecot.index.search files 
together. It would therefore need mailbox UIDs, so that each path name 
only has to be stored once in a map... or something along those lines.
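
To make the size argument concrete, here is a rough Python sketch of the two
fictitious layouts above. It is only an illustration under my own assumptions:
the function names are made up, the rows are plain tab-separated text, and
none of this reflects how Dovecot's Squat index is really stored.

```python
# Hypothetical layouts only; this is NOT Dovecot's real Squat format.

def suffixes(word):
    """Return every suffix of a word, e.g. word -> word, ord, rd, d."""
    return [word[i:] for i in range(len(word))]

def naive_rows(postings):
    """Global index where every row repeats the full mailbox path."""
    return [(s, mailbox, uid)
            for mailbox, uid, word in postings
            for s in suffixes(word)]

def mapped_rows(postings):
    """Global index where rows carry a small mailbox ID instead.

    Returns (mailbox_ids, rows); each path is stored once in mailbox_ids.
    """
    mailbox_ids = {}
    rows = []
    for mailbox, uid, word in postings:
        # Assign the next free ID the first time a mailbox is seen.
        mbox_id = mailbox_ids.setdefault(mailbox, len(mailbox_ids) + 1)
        for s in suffixes(word):
            rows.append((s, mbox_id, uid))
    return mailbox_ids, rows

def text_size(rows):
    """Size in characters of the rows written as tab-separated lines."""
    return sum(len("\t".join(str(f) for f in row)) + 1 for row in rows)

if __name__ == "__main__":
    path = "mailbox1.in.a.subfolder.of.a.subfolder.of.a.subfolder"
    postings = [(path, 12345, "word")]
    ids, rows = mapped_rows(postings)
    # The map itself costs one path plus one ID per mailbox.
    map_size = text_size([(i, p) for p, i in ids.items()])
    print("naive :", text_size(naive_rows(postings)))
    print("mapped:", text_size(rows) + map_size)
```

With only one indexed word per mailbox the saving is modest, since the map
still has to store the path once. The gap grows with every additional word
or mail indexed in the same mailbox, because the naive layout repeats the
path on each row.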

Anyway, I think improved (= faster) search capabilities are a huge plus for an 
IMAP server, because the ability to search old mails is what makes 
people keep their mails (available, on the server) in the first place...

Patrick.

-- 
STAR Software (Shanghai) Co., Ltd.            http://www.star-group.net/
Phone:    +86 (21) 3462 7688 x 826             Fax:   +86 (21) 3462 7779

PGP key:         https://stshacom1.star-china.net/keys/patrick_nagel.asc
Fingerprint:           E09A D65E 855F B334 E5C3 5386 EF23 20FC E883 A005