[Dovecot] (Single instance) attachment storage
tss at iki.fi
Mon Jul 19 20:49:48 EEST 2010
On Mon, 2010-07-19 at 18:30 +0100, William Blunn wrote:
> Consider storing the recovery filter stack in the dbox metadata rather
> than the attachment file.
> This has a couple of upshots:
> 1. If one person receives a message with an attachment which is encoded
> with base64 at say 19 cells (76 bytes) per line, and then re-sends the
> same file as an attachment to someone else but their MUA encodes base64
> at say 18 cells (72 bytes) per line, the attachment file can contain
> exactly the same data, allowing for deduplication even in this case.
I thought about that also, but it would require calculating and using a
hash of the decoded message (but not the compressed message). Could get
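
To illustrate the point about hashing the decoded data: a minimal Python sketch, not Dovecot code. The helper names (wrap_base64, decoded_digest), the SHA-1 choice and the sample payload are all assumptions made for the example. It only shows that two differently wrapped base64 bodies give the same hash once decoded.

```python
import base64
import hashlib

def wrap_base64(data: bytes, chars_per_line: int) -> bytes:
    """Hypothetical helper: base64-encode data, wrapped at chars_per_line."""
    encoded = base64.b64encode(data)
    lines = [encoded[i:i + chars_per_line]
             for i in range(0, len(encoded), chars_per_line)]
    return b"\r\n".join(lines) + b"\r\n"

def decoded_digest(mime_body: bytes) -> str:
    """Hash the decoded attachment bytes, so line wrapping does not matter."""
    # b64decode() skips characters outside the base64 alphabet (here CRLF)
    # when validate is left at its default of False.
    raw = base64.b64decode(mime_body)
    return hashlib.sha1(raw).hexdigest()

# The same attachment as two MUAs might encode it (assumed sample data).
payload = bytes(range(256)) * 4
body_76 = wrap_base64(payload, 76)  # 19 cells per line
body_72 = wrap_base64(payload, 72)  # 18 cells per line

same_bytes_on_wire = body_76 == body_72
same_decoded_hash = decoded_digest(body_76) == decoded_digest(body_72)
```

Here same_bytes_on_wire is False while same_decoded_hash is True, which is what would let the two messages share one attachment file.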
> 2. Assuming we have configured Dovecot to decode base64 but not to
> compress, then the file in which we store the attachment data contains
> literally the exact same byte stream as if the attachment were saved out
> from the MUA. I don't know what practical use this might be, but it
> /sounds/ cool :-) Perhaps a suitable filesystem or backup-system could
> deduplicate both a file *and* its instance as a message attachment.
I was thinking about adding some small header to the dbox file, so they
wouldn't be completely identical.
BTW, I was thinking about using "number of characters per base64 line"
rather than "number of cells". I don't think a line is required to end
on a complete cell.
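
A rough Python sketch of why a per-line character count is enough to rebuild the original body, even when a line does not end on a complete 4-character cell. This is not Dovecot code. The function names are hypothetical, and it assumes a non-empty body with CRLF line endings, every line except the last having the same length, and standard padding.

```python
import base64

def split_attachment(mime_body: bytes):
    """Return (decoded bytes, characters per line) for a base64 body.

    Assumes CRLF line endings and equal-length lines except the last.
    """
    lines = mime_body.split(b"\r\n")
    chars_per_line = len(lines[0])
    return base64.b64decode(b"".join(lines)), chars_per_line

def rebuild_attachment(raw: bytes, chars_per_line: int) -> bytes:
    """Re-encode raw bytes using the stored characters-per-line value."""
    if chars_per_line <= 0:
        raise ValueError("chars_per_line must be positive")
    encoded = base64.b64encode(raw)
    lines = [encoded[i:i + chars_per_line]
             for i in range(0, len(encoded), chars_per_line)]
    return b"\r\n".join(lines) + b"\r\n"

# 75 characters per line is not a multiple of 4, so lines end mid-cell.
sample = bytes(range(200))
original_body = rebuild_attachment(sample, 75)

stored_raw, stored_width = split_attachment(original_body)
restored_body = rebuild_attachment(stored_raw, stored_width)
round_trip_ok = restored_body == original_body
```

Counting cells instead would not be able to describe the 75-character case, while the character count covers both.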