On Tue, 6 Jun 2006, Lev Serebryakov wrote:
> The best solution seems to be to store e-mails as-is, one e-mail per file, and to keep some indexes in a simple low-level database like BerkeleyDB. A filesystem is best at working with "BLOBs", and a simple database engine without a complex query language allows VERY fast indexes.
I have had nothing but trouble with OpenLDAP's fast BerkeleyDB implementation. I agree that reading BLOBs from a filesystem is much faster than pulling them from a DB with all its overhead, especially because you can send the data right to the client. (I don't know whether you can instruct a DB to send BLOBs right into a TCP channel.)
However, when the discussion turns to taking a message apart and storing its parts so that they can be shared, it becomes a matter of benchmarking: can you still read a part from the filesystem and pass it on directly?
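To make the "pass it on directly" question concrete, here is a minimal sketch in Python. It assumes that a part is stored as a contiguous byte range of a file and can be sent without any decoding; the function name send_part and its parameters are made up for illustration and have nothing to do with Dovecot's code. socket.sendfile() uses the kernel's sendfile where it is available and falls back to ordinary send() calls otherwise.

```python
import socket


def send_part(sock, path, offset, length):
    """Send `length` bytes of the file at `path`, starting at `offset`.

    Hypothetical helper: assumes the part is a contiguous byte range
    that can go to the client unchanged (no decoding, no rewriting).
    Returns the number of bytes actually sent.
    """
    with open(path, "rb") as f:
        # Uses os.sendfile() where the platform supports it, so the
        # data need not be copied through user space.
        return sock.sendfile(f, offset=offset, count=length)
```

Whether this is still a win once parts are split up and shared between messages is exactly what would have to be benchmarked.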
I also don't believe that SQL is a complex language per se just because it offers complex and slow stuff; a plethora of high- and low-grade programmers, engineers and theoreticians have built good algorithms and optimization strategies for SQL that nobody can easily re-implement.
I also guess that it would make no sense for Dovecot to utilize both backends. I mean: if you go the SQL way half-heartedly, it will be worse and nobody will be happy using it, but then you will lose focus on the filesystem-based storage, and lots of people don't want to install a DB. There is a discussion about this very same topic in OpenLDAP. Summary (my view :): the SQL backend is a lot quicker, but OpenLDAP uses it badly, because the code was written for BerkeleyDB, so the performance is lower; one would need to rewrite too much code, and afterwards the SQL DB would be quick but BDB slow.
> This solution has one additional advantage: all indexes can be rebuilt from the e-mails, for example if the DB was backed up with errors.
IMO: This is one thing that needs to be done for Dovecot, e.g. keywords, and a toolset.
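As an illustration of the rebuild idea quoted above, here is a rough sketch in Python. It assumes one message per file in a single directory and uses Python's standard dbm module as a stand-in for a simple key/value store such as BerkeleyDB. The names rebuild_index and lookup, and the choice of Message-ID as the key, are assumptions for the example only; this is not how Dovecot does it.

```python
import dbm
import os
from email.parser import BytesParser


def rebuild_index(mail_dir, index_path):
    """Recreate a Message-ID -> file name index by scanning all message files.

    Returns the number of messages indexed.
    """
    count = 0
    # Flag "n" always starts from a new, empty database, so a damaged
    # index is simply thrown away and rebuilt from the messages.
    with dbm.open(index_path, "n") as db:
        for name in sorted(os.listdir(mail_dir)):
            path = os.path.join(mail_dir, name)
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                # Only the headers are needed for this index.
                msg = BytesParser().parse(f, headersonly=True)
            msg_id = msg["Message-ID"]
            if msg_id:
                db[str(msg_id).strip().encode("utf-8")] = name.encode("utf-8")
                count += 1
    return count


def lookup(index_path, msg_id):
    """Return the file name stored for msg_id, or None if it is not indexed."""
    with dbm.open(index_path, "r") as db:
        value = db.get(msg_id.encode("utf-8"))
        return value.decode("utf-8") if value is not None else None
```

The messages stay the only authoritative data; the index is a cache that can be regenerated at any time, which is the advantage described in the quoted text.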
-- Steffen Kaiser