Hello Marc,
Tuesday, June 6, 2006, 7:25:25 PM, you wrote:
MP> I'm suggesting it in addition to MBOX and MAILDIR. And of course if
MP> there's a MySQL version then other databases will follow. Just seems to
MP> me that if I were running a really BIG email operation that MySQL could
MP> have some serious benefits.
I don't think that will be true. I've read articles by Vladimir Butenko (author of CommuniGate PRO, a very scalable and high-performance mail solution) saying that "generic SQL" does not give much benefit on average-loaded sites and limits the performance of high-loaded sites. As any 'generic' solution does.
E-mails are like objects, not relations, so you have two approaches, really:
(a) Try to emulate objects on tables, as some OO2RDBMS wrappers do. The database is perfectly normalized, all strings are stored only once in separate tables (like 'headernames' and 'headervalues'), etc. The result is very massive JOINs and poor performance.
(b) Store each whole e-mail as a BLOB, plus some indexes on the main headers. The result is a very high I/O load on the RDBMS, which needs to retrieve and store large contiguous objects.
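To make (a) a bit more concrete, here is a rough sketch of what such a normalized layout could look like. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the tables 'messages' and 'message_headers', the column names, and the helper functions are all made up for illustration. The point is only that getting the headers of a single message back already takes two JOINs.

```python
import sqlite3

# Hypothetical normalized schema: every header name and value is stored once.
SCHEMA = """
CREATE TABLE messages     (id INTEGER PRIMARY KEY);
CREATE TABLE headernames  (id INTEGER PRIMARY KEY, name  TEXT UNIQUE);
CREATE TABLE headervalues (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
CREATE TABLE message_headers (
    message_id INTEGER REFERENCES messages(id),
    position   INTEGER,
    name_id    INTEGER REFERENCES headernames(id),
    value_id   INTEGER REFERENCES headervalues(id)
);
"""

def intern_string(conn, table, column, text):
    """Store a string at most once and return its row id."""
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (text,))
    row = conn.execute(
        f"SELECT id FROM {table} WHERE {column} = ?", (text,)
    ).fetchone()
    return row[0]

def store_headers(conn, headers):
    """Insert one message given a list of (name, value) header pairs."""
    message_id = conn.execute("INSERT INTO messages DEFAULT VALUES").lastrowid
    for position, (name, value) in enumerate(headers):
        conn.execute(
            "INSERT INTO message_headers VALUES (?, ?, ?, ?)",
            (
                message_id,
                position,
                intern_string(conn, "headernames", "name", name),
                intern_string(conn, "headervalues", "value", value),
            ),
        )
    return message_id

def load_headers(conn, message_id):
    """Rebuild the header list of one message: two JOINs per lookup."""
    return conn.execute(
        """SELECT n.name, v.value
           FROM message_headers AS mh
           JOIN headernames  AS n ON n.id = mh.name_id
           JOIN headervalues AS v ON v.id = mh.value_id
           WHERE mh.message_id = ?
           ORDER BY mh.position""",
        (message_id,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

Bodies, MIME parts and attachments would need still more tables and more JOINs, which is where the cost adds up.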
The best solution seems to be to store e-mails as-is, one e-mail per file, and keep some indexes in a simple low-level database like BerkeleyDB. The filesystem is best at working with "BLOBs", and a simple database engine without a complex query language allows VERY fast indexes. This solution has one additional advantage: all indexes can be rebuilt from the e-mails themselves if, for example, the DB was backed up with errors.
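A minimal sketch of this idea, under some assumptions: Python's standard dbm module stands in for BerkeleyDB (it is also just a simple key-value store), the only index is Message-ID to file name, and the function names and the ".eml" suffix are invented for the example. It is not meant as a real mail store (no locking, no atomic delivery), only to show that the index is disposable.

```python
import dbm
import email
import os
import uuid

def index_message(index, name, raw):
    """Add one Message-ID -> file name entry to the key-value index."""
    message_id = email.message_from_bytes(raw).get("Message-ID")
    if message_id:
        index[str(message_id).strip().encode("utf-8")] = name.encode("utf-8")

def store_message(maildir, index, raw):
    """Write the message as-is to its own file, then index it."""
    name = uuid.uuid4().hex + ".eml"
    with open(os.path.join(maildir, name), "wb") as f:
        f.write(raw)
    index_message(index, name, raw)
    return name

def lookup(maildir, index, message_id):
    """Return the raw message for a Message-ID, or None if not indexed."""
    key = message_id.encode("utf-8")
    if key not in index:
        return None
    with open(os.path.join(maildir, index[key].decode("utf-8")), "rb") as f:
        return f.read()

def rebuild_index(maildir, index_path):
    """Throw the index away and recreate it from the message files alone."""
    with dbm.open(index_path, "n") as index:  # "n" always starts empty
        for name in sorted(os.listdir(maildir)):
            if name.endswith(".eml"):
                with open(os.path.join(maildir, name), "rb") as f:
                    index_message(index, name, f.read())
```

If the index file is lost or damaged, calling rebuild_index() on the directory of message files brings it back, because the files remain the only authoritative copy of the data.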
-- 
Best regards,
 Lev                          mailto:lev@serebryakov.spb.ru