[Dovecot] slow squat fts index creation

Timo Sirainen tss at iki.fi
Fri Jun 3 17:46:33 EEST 2011


On Tue, 2011-05-24 at 17:01 +0200, Cor Bosman wrote:
> Hi all, I've been playing with squat indexes. Up to about 300.000 emails in a single mailbox this was working flawlessly. The search index file was about 500MB at that point. I've now added some more emails, and at 450.000 or so emails I'm seeing a serious problem with squat index creation. It takes...f o r e v e r . The .tmp file is being written so slowly that it will probably take 2-3 hours to create. Up to this point it took maybe a minute. 
> 
> I'm doing this in an OpenVZ container, so theoretically I may be hitting some OpenVZ resource limit. But I've upped all the limits and don't see any improvement. I don't see any resource starvation either. 
> 
> Could there be some Dovecot issue when the search index reaches say 1GB? (I'm estimating that it's now trying to save about a 1GB search index.) 

Initially Squat just builds a large unorganized index. The last step is
organizing it, and this is the main problem with Squat's indexing speed.
The file is mmap()ed and then accessed in pretty random order. As long as
you have enough memory to keep all of the mmap()ed data in physical
memory this works pretty fast, but otherwise the kernel starts page
faulting like crazy and it takes forever. That's why Squat has this
code:

	/* Tell the kernel we're going to use the uidlist data, so it loads
	   it into memory and keeps it there. */
	(void)madvise(uidlist->mmap_base, uidlist->mmap_size, MADV_WILLNEED);
	/* It also speeds up a bit for us to sequentially load everything
	   into memory, although at least Linux catches up quite fast even
	   without this code. Compiler can quite easily optimize away this
	   entire for loop, but volatile seems to help with gcc 4.2. */
	for (i = 0; i < uidlist->mmap_size; i += page_size)
		((const volatile char *)uidlist->data)[i];
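
For anyone who wants to try the same trick outside of Dovecot, here is a
minimal standalone sketch of it. It is not Squat code: prefault_file() is
a made-up helper name, and it maps an arbitrary file read-only instead of
using the uidlist structure. It only illustrates the madvise() hint
followed by touching one byte per page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for madvise() on glibc in strict modes */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of Dovecot): map `path` read-only, hint
   the kernel that the whole mapping will be needed, then read one byte
   from every page so the pages really get faulted in. Returns the
   number of pages touched, or -1 on error (including an empty file). */
static long prefault_file(const char *path)
{
	struct stat st;
	long page_size = sysconf(_SC_PAGESIZE);
	long pages = 0;
	void *base;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	if (fstat(fd, &st) == -1 || st.st_size == 0 || page_size <= 0) {
		close(fd);
		return -1;
	}

	size_t size = (size_t)st.st_size;
	base = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd); /* the mapping stays valid after close() */
	if (base == MAP_FAILED)
		return -1;

	/* Advisory only: the kernel is free to ignore this hint. */
	(void)madvise(base, size, MADV_WILLNEED);

	/* volatile keeps the compiler from dropping the otherwise
	   unused reads. */
	const volatile char *data = base;
	for (size_t i = 0; i < size; i += (size_t)page_size) {
		(void)data[i];
		pages++;
	}

	munmap(base, size);
	return pages;
}
```

Calling prefault_file() on a large file before doing random access to it
should pull the file into the page cache, but as described above this
only helps when there is enough physical memory to keep it there.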



