On Wed, 2009-08-12 at 11:35 -0700, Daniel L. Miller wrote:
Timo Sirainen wrote:
Also, the MIME structure could be torn apart to store attachments individually - the motivation being single-instance storage of large attachments with identical content... Anyway, these seem like very speculative directions...
Yes, this is also something in dbox's far future plans.
Speaking as a pathetic little admin of a small site of 20 users, my needs for replication & scalability are quite minor. However, single-instance storage would be a miracle of biblical proportions. Has any progress been made on this?
Do you need per-MIME-part single-instance storage, or would per-email be enough? Per-email can already be done with hard links.
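For illustration, per-email single-instance storage with hard links could look roughly like the sketch below. It is not how Dovecot or any delivery agent actually does it; `deliver_copies` and the directory layout are hypothetical, and it assumes all mailboxes live on the same filesystem (hard links cannot cross filesystems).

```python
import os


def deliver_copies(message: bytes, mailbox_dirs, filename: str):
    """Write the message once, then hard-link it into every other mailbox.

    Hypothetical helper: all directories must be on one filesystem.
    """
    paths = []
    first = None
    for directory in mailbox_dirs:
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, filename)
        if first is None:
            # The first recipient gets the only real copy of the data.
            with open(path, "wb") as f:
                f.write(message)
            first = path
        else:
            # Every further recipient gets another name for the same inode.
            os.link(first, path)
        paths.append(path)
    return paths
```

Each recipient sees an ordinary file, but the data blocks exist once on disk and are freed only when the last link is removed.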
Do you have a roadmap for how you plan on implementing it?
I've written about it a couple of times, I think, but I have no specific plans. It would be something using hashes, anyway.
I don't know if you've considered this at all - this was my first thought:
If you're able to store a message with the attachments separately, then you can come up with an attachment database (not meaning to imply SQL backend). Then after breaking the message up into message + attachments, you scan the attachment database to see if it is already present prior to saving it. This could mean that not only could we save on the huge space wasted by idiots merrily forwarding large attachments to multiple people, but even received mails with embedded graphical signatures would benefit.
Yes, that's pretty much how I thought about it. It's going to be a dbox-only feature anyway. It would be way too much trouble with other formats.
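The attachment-database idea discussed above can be sketched as a content-addressed store: hash each MIME part and write it only if that hash is not already present. This is only a rough sketch under assumptions, not dbox's design. `AttachmentStore` and `store_parts` are hypothetical names, SHA-256 is an arbitrary choice of hash, and a plain directory stands in for the "attachment database".

```python
import email
import hashlib
import os


class AttachmentStore:
    """Hypothetical content-addressed store: one file per unique part."""

    def __init__(self, root: str):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data: bytes) -> str:
        # The hash of the content is the lookup key (assumption: SHA-256).
        digest = hashlib.sha256(data).hexdigest()
        path = os.path.join(self.root, digest)
        if not os.path.exists(path):
            # Write to a temporary name first so readers never see a partial file.
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(data)
            os.replace(tmp, path)
        return digest

    def get(self, digest: str) -> bytes:
        with open(os.path.join(self.root, digest), "rb") as f:
            return f.read()


def store_parts(raw_message: bytes, store: AttachmentStore):
    """Split a message into its leaf MIME parts and store each one once."""
    msg = email.message_from_bytes(raw_message)
    digests = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        digests.append(store.put(payload))
    return digests
```

A message forwarded to several people would then keep only its list of digests per copy, while the large attachment (or an embedded signature image) is stored once. A real implementation would also need reference counting, so a part can be deleted when the last message using it is expunged, and a way to rebuild the original message byte-for-byte.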