Huge difference between the lucene index size created by v2.1 and v2.2

Akash akbwiz+dovecot at gmail.com
Mon Aug 18 07:42:44 UTC 2014


Hi everyone,

While comparing the lucene search performance of dovecot v2.1 and v2.2,
I noticed a huge difference in the size of the indexes they create.
Both versions were compiled on the same system, against the same
libclucene, with the same configure options, and were run with the same
dovecot.conf. Before indexing with each version, I deleted the
lucene-indexes folder and the dovecot* files in the Maildir. The tests
were performed on an untouched mail archive folder containing 300000
mails, with no dovecot* files in it:

root@server:/home/admin/mails/.Archive# ls -l
total 78640
drwx------ 2 2500 2500 14098432 Aug 17 22:16 cur
drwx------ 2 2500 2500 12435456 Jul 30 09:46 new
drwx------ 2 2500 2500     4096 Aug  2 13:02 tmp

The command used was:

doveadm -v index -u admin Archive

Indexing with v2.1 resulted in:

root@server:/home/admin/mails# ls -lh lucene-indexes
total 390M
-rw------- 1 2500 2500 390M Aug 18 07:03 _25.cfs
-rw------- 1 2500 2500   20 Aug 18 07:03 segments.gen
-rw------- 1 2500 2500   46 Aug 18 07:03 segments_4d

Whereas indexing with v2.2 resulted in:

root@server:/home/admin/mails# ls -lh lucene-indexes
total 1.5G
-rw------- 1 2500 2500 1.5G Aug 18 06:41 _5g.cfs
-rw------- 1 2500 2500   20 Aug 18 06:41 segments.gen
-rw------- 1 2500 2500   46 Aug 18 06:41 segments_az

390M vs 1.5G: the v2.2 index is nearly four times larger for the same
mails. What causes this?
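
For anyone who wants to repeat the comparison, here is a minimal sketch
of how the two index directories could be totalled and compared in
Python. The function names and any paths passed to them are my own
illustration and not part of dovecot; it assumes each version's
lucene-indexes folder was copied aside after indexing.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that only real index files are counted.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def size_ratio(larger_dir, smaller_dir):
    """Return how many times bigger larger_dir is than smaller_dir."""
    smaller = dir_size_bytes(smaller_dir)
    if smaller == 0:
        raise ValueError("smaller_dir contains no data to compare against")
    return dir_size_bytes(larger_dir) / smaller
```

Calling size_ratio() with the saved v2.2 folder first and the saved v2.1
folder second would give the growth factor directly, instead of reading
it off the rounded ls -lh figures.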

Thanks in advance.

-Regards,
Akash

