[Dovecot] (Single instance) attachment storage

Timo Sirainen tss at iki.fi
Mon Jul 19 20:49:48 EEST 2010


On Mon, 2010-07-19 at 18:30 +0100, William Blunn wrote:
> Consider storing the recovery filter stack in the dbox metadata rather 
> than the attachment file.
> 
> This has a couple of upshots:
> 
> 1. If one person receives a message with an attachment which is encoded 
> with base64 at say 19 cells (76 bytes) per line, and then re-sends the 
> same file as an attachment to someone else but their MUA encodes base64 
> at say 18 cells (72 bytes) per line, the attachment file can contain 
> exactly the same data, allowing for deduplication even in this case.

I thought about that also, but it would require calculating and using a
hash of the decoded message (but not the compressed message). Could get
complex.
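
A minimal sketch of the idea, not Dovecot code: if the deduplication key
is a hash of the decoded attachment bytes, two copies wrapped at
different base64 line lengths map to the same key. The function names
and the choice of SHA-1 are assumptions made for illustration only.

```python
import base64
import hashlib


def wrap_base64(data: bytes, chars_per_line: int) -> bytes:
    """Encode data as base64, wrapped at chars_per_line with CRLF line ends."""
    encoded = base64.b64encode(data)
    lines = [encoded[i:i + chars_per_line]
             for i in range(0, len(encoded), chars_per_line)]
    return b"\r\n".join(lines) + b"\r\n"


def decoded_hash(encoded_body: bytes) -> str:
    """Hash the decoded bytes, so line wrapping does not affect the key."""
    # b64decode() ignores characters outside the base64 alphabet by
    # default, which includes the CR and LF line terminators.
    raw = base64.b64decode(encoded_body)
    return hashlib.sha1(raw).hexdigest()


payload = bytes(range(256)) * 4
body_76 = wrap_base64(payload, 76)  # first MUA: 76 characters per line
body_72 = wrap_base64(payload, 72)  # second MUA: 72 characters per line
same_key = decoded_hash(body_76) == decoded_hash(body_72)
```

The encoded bodies differ byte for byte, so a hash over the encoded form
would not match. The extra cost is that every attachment has to be
decoded before its hash is known.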

> 2. Assuming we have configured Dovecot to decode base64 but not to 
> compress, then the file in which we store the attachment data contains 
> literally the exact same byte stream as if the attachment were saved out 
> from the MUA. I don't know what practical use this might be, but it 
> /sounds/ cool :-) Perhaps a suitable filesystem or backup-system could 
> deduplicate both a file *and* its instance as a message attachment.

I was thinking about adding some small header to the dbox file, so they
wouldn't be completely identical.

BTW, I was thinking about using "number of characters per base64 line"
rather than "number of cells". I don't think it's required that a line
ends with a complete cell.
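
A rough sketch of how a stored characters-per-line value could be used
to rebuild the original encoded stream. This is an illustration under
assumptions, not the dbox format: the helper names are hypothetical, and
it assumes CRLF line ends and a uniform line length. The width of 74
below is deliberately not a multiple of 4, so lines do not end on a
complete cell.

```python
import base64


def reencode(raw: bytes, chars_per_line: int) -> bytes:
    """Re-create a base64 body from decoded bytes and a stored line width."""
    encoded = base64.b64encode(raw)
    lines = [encoded[i:i + chars_per_line]
             for i in range(0, len(encoded), chars_per_line)]
    return b"\r\n".join(lines) + b"\r\n"


def detect_line_length(body: bytes):
    """Return the characters-per-line value, or None if it is not uniform."""
    if not body.endswith(b"\r\n"):
        return None
    lines = body[:-2].split(b"\r\n")
    width = len(lines[0])
    if width == 0:
        return None
    # Every line except the last must have exactly the same length.
    if any(len(line) != width for line in lines[:-1]):
        return None
    if len(lines[-1]) > width:
        return None
    return width


def can_store_decoded(body: bytes) -> bool:
    """True only if decoding and re-encoding reproduces body exactly."""
    width = detect_line_length(body)
    if width is None:
        return False
    return reencode(base64.b64decode(body), width) == body


payload = bytes(range(250)) * 4
original = reencode(payload, 74)
width = detect_line_length(original)
restored = reencode(base64.b64decode(original), width)
```

Checking the round trip before storing the decoded form means that any
body which cannot be reproduced exactly can be kept as it was received.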


