[Dovecot] SIS Implementation
What would be involved in implementing SIS within Dovecot? A new or modified mailbox format?
-- Daniel
On Fri, 2009-08-14 at 11:28 -0700, Daniel L. Miller wrote:
What would be involved in implementing SIS within Dovecot? A new or modified mailbox format?
It could be added to dbox without too much trouble. I already kind of planned for it:
/* Pointer to external message data. Format is:
1*(<start offset> <byte count> <ref>) */
DBOX_METADATA_EXT_REF = 'X',
Nothing uses that yet though. So what you'd need is:
When writing the data, extract the attachments and write them to different files. Add pointers to those files to the EXT_REF metadata. Dovecot's message parsers should make this not-too-difficult to implement.
When reading messages, get the ext references and then you should be able to create concat-istream by piecing together the main dbox istream with those external istreams. Maybe implement some istream that lazily opens the external files only when it's actually needed.
Once that works, figure out a way to implement SIS for those externally written files. Creating hard links to files in a global attachment storage would probably be a workable solution. Then scan through the storage once in a while and delete files with link count=1.
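As a rough illustration of the read side, the EXT_REF value described above could be split into its triples like this. This is only a sketch: the struct, the function name, and the assumption that fields are separated by spaces (and that <ref> contains no spaces) are hypothetical, since nothing in dbox uses this metadata yet.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one <start offset> <byte count> <ref> triple. */
struct ext_ref {
    unsigned long long start_offset;
    unsigned long long byte_count;
    char ref[128];
};

/* Parse a value such as "1024 5000 ab/cd/abcd 9000 120 ef/gh/efgh".
   Assumes space-separated fields and a <ref> without spaces.
   Returns the number of triples parsed, or -1 on malformed input. */
int parse_ext_refs(const char *value, struct ext_ref *out, int max_refs)
{
    int count = 0;
    const char *p = value;

    while (*p != '\0') {
        int consumed = 0;

        if (count == max_refs)
            return -1;
        if (sscanf(p, "%llu %llu %127s%n", &out[count].start_offset,
                   &out[count].byte_count, out[count].ref, &consumed) != 3)
            return -1;
        p += consumed;
        while (*p == ' ')
            p++;
        count++;
    }
    return count > 0 ? count : -1;
}
```

Each parsed triple would then tell the reader which byte range of the message to take from which external file when piecing the streams together.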
Timo Sirainen wrote:
On Fri, 2009-08-14 at 11:28 -0700, Daniel L. Miller wrote:
What would be involved in implementing SIS within Dovecot? A new or modified mailbox format?
It could be added to dbox without too much trouble. I already kind of planned for it:
/* Pointer to external message data. Format is:
1*(<start offset> <byte count> <ref>) */
DBOX_METADATA_EXT_REF = 'X',
Nothing uses that yet though. So what you'd need is:
Now do we need to implement some kind of external database for tracking the attachments between mailboxes? Any thoughts on what that should look like?
-- Daniel
On Fri, 2009-08-14 at 12:06 -0700, Daniel L. Miller wrote:
Now do we need to implement some kind of external database for tracking the attachments between mailboxes? Any thoughts on what that should look like?
I think:
Step 1) Calculate SHA256 of the attachment and get base64 sum of it. See if you have $attachment_dir/$base64 file. If you do, assume it's the file and use it. If not, save the attachment there.
Step 2) Instead of a single huge attachment dir add some directory hashing. Could be as simple as attachments/first-two-bytes-of-base64/next-two-bytes-of-base64/base64.
Step 3) Optionally also check (in the background?) that the files match byte-by-byte to handle the (really low probability of) hash collisions. This is probably a bit trickier to do, especially in the background.
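Steps 1 and 2 could look roughly like the following sketch. It assumes the SHA256 digest has already been computed elsewhere (for example by a crypto library) and only shows the naming part. The function names are hypothetical, and the URL-safe base64 alphabet is an assumption made here because standard base64 contains '/', which cannot appear in a file name.

```c
#include <stddef.h>
#include <stdio.h>

/* URL-safe base64 alphabet (RFC 4648 section 5); an assumption, see above. */
static const char b64url[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/* Encode len bytes as unpadded URL-safe base64. out must hold at least
   4 * ((len + 2) / 3) + 1 bytes. Returns the encoded length. */
size_t digest_to_name(const unsigned char *in, size_t len, char *out)
{
    size_t i, o = 0;

    for (i = 0; i + 2 < len; i += 3) {
        out[o++] = b64url[in[i] >> 2];
        out[o++] = b64url[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = b64url[((in[i + 1] & 0x0f) << 2) | (in[i + 2] >> 6)];
        out[o++] = b64url[in[i + 2] & 0x3f];
    }
    if (len - i == 1) {
        out[o++] = b64url[in[i] >> 2];
        out[o++] = b64url[(in[i] & 0x03) << 4];
    } else if (len - i == 2) {
        out[o++] = b64url[in[i] >> 2];
        out[o++] = b64url[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = b64url[(in[i + 1] & 0x0f) << 2];
    }
    out[o] = '\0';
    return o;
}

/* Build "<dir>/<name[0..1]>/<name[2..3]>/<name>". Returns 0 on success,
   -1 if name is shorter than 4 characters or the buffer is too small. */
int attachment_path(const char *dir, const char *name, char *out, size_t outlen)
{
    int n;

    if (name[0] == '\0' || name[1] == '\0' || name[2] == '\0' || name[3] == '\0')
        return -1;
    n = snprintf(out, outlen, "%s/%.2s/%.2s/%s", dir, name, name + 2, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A 32-byte SHA256 digest encodes to a 43-character name, so a part would end up under something like attachments/Xy/Zw/XyZw... with two levels of directory hashing.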
Quoting Timo Sirainen <tss@iki.fi>:
- When writing the data, extract the attachments and write them to different files. Add pointers to those files to the EXT_REF metadata. Dovecot's message parsers should make this not-too-difficult to implement.
I'd rather it did mime parts, rather than attachments. In my use case, we don't get attachments distributed as widely as we get messages distributed. If the local mailbox had the headers, but the SIS area had the mime parts, this would save tons of space. Since attachments are mime-parts, this works for both cases...
- Once that works, figure out a way to implement SIS for those externally written files. Creating hard links to files in a global attachment storage would probably be a workable solution. Then scan through the storage once in a while and delete files with link count=1.
Hardlinks is one way, for filesystems that support it. But it does have limits (can't span volumes, etc).
But any kind of setup that can maintain the file and a usage count should work (and the two don't have to be kept together, though they can). If you add a management interface, all the better.
BTW, PMDF implemented all this eons ago (in their popstore, I think it was), a system added around PMDF 5 or 6.... Was actually pretty nice, in particular for the times (this was the 1990's).
Anyway, my $0.02 worth, not that I'm waiting on this feature, but it sure would save me tons of disk space if I had it...
-- Eric Rostetter The Department of Physics The University of Texas at Austin
This message is provided "AS IS" without warranty of any kind, either expressed or implied. Use this message at your own risk.
On Fri, 2009-08-14 at 14:18 -0500, Eric Jon Rostetter wrote:
Quoting Timo Sirainen <tss@iki.fi>:
- When writing the data, extract the attachments and write them to different files. Add pointers to those files to the EXT_REF metadata. Dovecot's message parsers should make this not-too-difficult to implement.
I'd rather it did mime parts, rather than attachments.
Well, okay, s/attachments/mime parts/ throughout my mail. That's what I meant. Checking if a mime part was an attachment would be just more extra work. Also there should be some size checks, so that the extraction is done only for parts that are larger than n bytes.
- Once that works, figure out a way to implement SIS for those externally written files. Creating hard links to files in a global attachment storage would probably be a workable solution. Then scan through the storage once in a while and delete files with link count=1.
Hardlinks is one way, for filesystems that support it. But it does have limits (can't span volumes, etc).
But any kind of setup that can maintain the file and a usage count should work (and the two don't have to be kept together, though they can). If you add a management interface, all the better.
Hard links would be the simplest implementation without needing a separate database. Sure you could implement that too if you wanted to.
Hard links would be the simplest implementation without needing a separate database. Sure you could implement that too if you wanted to.
It would be worth checking the limits for hard links, and making sure they are suitable for a large mail system using this scheme, without having a fallback plan of some sort.
Looks like UFS hardlink limit is 32767; ext2 32000; reiser and jfs, 65535. http://www.dirvish.org/viewcvs/dirvish_1_2/FAQ.html?rev=2 see "Could linking between images be limited by a maximum link count?"
On Fri, 2009-08-14 at 12:40 -0700, Jason Fesler wrote:
Hard links would be the simplest implementation without needing a separate database. Sure you could implement that too if you wanted to.
It would be worth checking the limits for hard links, and making sure they are suitable for a large mail system using this scheme, without having a fallback plan of some sort.
Looks like UFS hardlink limit is 32767; ext2 32000; reiser and jfs, 65535. http://www.dirvish.org/viewcvs/dirvish_1_2/FAQ.html?rev=2 see "Could linking between images be limited by a maximum link count?"
Well, if there are that many copies already it probably doesn't matter much if the file gets duplicated a few times. The logic can be as easy as:
if (link(a, b) < 0 && errno == EMLINK) {
	copy(a, tmp);	/* a has hit its link limit */
	rename(tmp, a);
	link(a, b);
}
So the old file with 32k links no longer exists in attachments dir, but you have a new file which can be linked 32k times more.
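A somewhat fuller version of that logic, with error handling, might look like the sketch below. The names sis_link and copy_file and the ".tmp.<pid>" temporary file name are hypothetical, and the code assumes a POSIX system where the store and the mailbox live on the same filesystem.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst (created exclusively). Returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
    char buf[65536];
    ssize_t n = 0;
    int in, out, ret = -1;

    if ((in = open(src, O_RDONLY)) < 0)
        return -1;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;

        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));

            if (w < 0)
                goto done;
            off += w;
        }
    }
    if (n == 0 && fsync(out) == 0)
        ret = 0;
done:
    close(in);
    if (close(out) < 0)
        ret = -1;
    if (ret < 0)
        unlink(dst);
    return ret;
}

/* Link the shared file `store` into a mailbox as `dest`. If the link count
   limit is hit, replace `store` with a fresh copy and link that instead. */
int sis_link(const char *store, const char *dest)
{
    char tmp[4096];
    int n;

    if (link(store, dest) == 0)
        return 0;
    if (errno != EMLINK)
        return -1;
    n = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", store, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;
    if (copy_file(store, tmp) < 0)
        return -1;
    if (rename(tmp, store) < 0) {
        unlink(tmp);
        return -1;
    }
    return link(store, dest);
}
```

The rename() replaces the store entry atomically, so the existing links keep pointing at the old inode while new links go to the copy.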
On 8/14/2009, Timo Sirainen (tss@iki.fi) wrote:
Hard links would be the simplest implementation without needing a separate database. Sure you could implement that too if you wanted to.
So... support hard links natively (on FS that support them), then allow for supporting other backend storage implementations with plugins... ?
--
Best regards,
Charles
On Fri, 2009-08-14 at 17:06 -0400, Charles Marcus wrote:
On 8/14/2009, Timo Sirainen (tss@iki.fi) wrote:
Hard links would be the simplest implementation without needing a separate database. Sure you could implement that too if you wanted to.
So... support hard links natively (on FS that support them), then allow for supporting other backend storage implementations with plugins... ?
Well, I'm not going to implement either anytime soon. So depends on whoever happens to want this feature enough to actually code it. :)
Step 4) Figure out if base64-encoded attachments can be decoded in a way that allows re-encoding them back to the exact original encoding. If so, save the attachment decoded and add the necessary encoding info to the dbox metadata.
Or perhaps just store them compressed. How much of a difference is there between storing decoded data vs. compressed base64 encoded data?
Timo Sirainen wrote:
Step 4) Figure out if base64-encoded attachments can be decoded in a way that allows re-encoding them back to the exact original encoding. If so, save the attachment decoded and add the necessary encoding info to the dbox metadata.
Or perhaps just store them compressed. How much of a difference is there between storing decoded data vs. compressed base64 encoded data?
Do we need a new parameter in dovecot.conf? "SIS_Location"?
Why is the base64 sum needed of the SHA256? Isn't the SHA256 unique enough?
If the attachments are already compressed via zip or some such, would further encoding just add processing time without any storage benefit?
Daniel
On Fri, 2009-08-14 at 13:54 -0700, Daniel L. Miller wrote:
Timo Sirainen wrote:
Step 4) Figure out if base64-encoded attachments can be decoded in a way that allows re-encoding them back to the exact original encoding. If so, save the attachment decoded and add the necessary encoding info to the dbox metadata.
Or perhaps just store them compressed. How much of a difference is there between storing decoded data vs. compressed base64 encoded data?
Do we need a new parameter in dovecot.conf? "SIS_Location"?
Why is the base64 sum needed of the SHA256? Isn't the SHA256 unique enough?
I'm not sure if you're mixing two things or replying to wrong mail or.. :) Anyway:
In previous mail I said take base64 of SHA256 for the filename. Because all characters aren't valid in filenames.
Here I'm just thinking about dropping base64 encoding of attachments and storing them decoded, so 25% of disk space could be saved.
If the attachments are already compressed via zip or some such, would further encoding just add processing time without any storage benefit?
Further compression of attached binaries wouldn't of course benefit anything. But if the compressed attachment was stored as base64 (since that's how they're sent via email..) compressing that could get it close to original size.
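The size arithmetic behind storing parts decoded can be worked out with a small sketch. The function names are hypothetical; the only facts used are that base64 turns every 3 bytes into 4 characters and that MIME bodies are usually wrapped at 76 characters with a CRLF after each line.

```c
#include <stddef.h>

/* Size of n raw bytes after base64 encoding, without line breaks:
   every 3 input bytes become 4 output characters (padding included). */
size_t base64_encoded_size(size_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Same, plus a CRLF after every line of line_len characters
   (76 is the usual MIME line length). line_len must be non-zero. */
size_t base64_wrapped_size(size_t n, size_t line_len)
{
    size_t chars = base64_encoded_size(n);
    size_t lines = (chars + line_len - 1) / line_len;

    return chars + 2 * lines;
}
```

For example, 300 raw bytes become 400 base64 characters, so the decoded form is 25% smaller than the encoded one. With 76-character lines, every 57 raw bytes take 78 bytes on disk, which puts the saving at about 27%.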
Step 4) Figure out if base64-encoded attachments can be decoded in a way that allows re-encoding them back to the exact original encoding. If so, save the attachment decoded and add the necessary encoding info to the dbox metadata.
Although you might like to do that for some sort of tidiness or whatever, I don't think there's an actual requirement to restore it to base64 (or q-p or whatever). Those are just transfer encodings, and intermediate MTAs might in theory have transformed them (though in practice that's doubtful). The one use case for an exact reconstruction might be some older digital signature schemes (which were not so robust and probably lost if an intermediate MTA messed with things).
Or perhaps just store them compressed. How much of a difference is there between storing decoded data vs. compressed base64 encoded data?
The best you can ever do is break even, and often you will lose. In any case, you have the CPU/memory cost of the compress/decompress. I would get rid of any transfer encoding and try to compress. If the compression was above some threshold of storage benefit, store it compressed. Otherwise, store it uncompressed.
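The "compress only above a threshold" rule could be expressed as a small decision helper like the one below. The function name and the idea of passing the threshold as a percentage are assumptions for illustration. The compressed size would come from actually running whatever compressor is chosen.

```c
#include <stddef.h>

/* Returns 1 if compression saves at least min_saving_percent of the raw
   size, 0 otherwise. Function and threshold are hypothetical. The
   multiplication can overflow for sizes near SIZE_MAX / 100. */
int worth_compressing(size_t raw_size, size_t compressed_size,
                      unsigned int min_saving_percent)
{
    if (raw_size == 0 || compressed_size >= raw_size)
        return 0;
    /* saved / raw >= percent / 100, rearranged to avoid floating point */
    return (raw_size - compressed_size) * 100 >=
           (size_t)min_saving_percent * raw_size;
}
```

With a 10% threshold, a 1000-byte part that compresses to 800 bytes would be stored compressed, while one that only shrinks to 950 bytes would be stored as-is.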
On Aug 14, 2009, at 7:15 PM, WJCarpenter wrote:
Step 4) Figure out if base64-encoded attachments can be decoded in
a way that allows re-encoding them back to the exact original encoding.
If so, save the attachment decoded and add the necessary encoding info to the
dbox metadata.
Although you might like to do that for some sort of tidiness or
whatever, I don't think there's an actual requirement to restore it
to base64 (or q-p or whatever). Those are just transfer encodings,
and intermediate MTAs might in theory have transformed them (though
in practice that's doubtful). The one use case for an exact
reconstruction might be some older digital signature schemes (which
were not so robust and probably lost if an intermediate MTA messed
with things).
I was thinking things like: upper vs. lowercase characters, different
line wrapping lengths, possibly some other weird stuff.. I'd think
that all digital signatures break if any of those change? Or do they
really parse the headers and calculate the signatures using the
decoded base64?
Another issue is that the MIME structure (MIME part sizes, offsets)
must match what got saved into dovecot's cache file, but that could be
fixed with some extra code.
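Whether a given base64 body can be re-encoded back to the exact original can be checked mechanically: decode it, re-encode it with the line length of its first line, and compare the result with the original text. The sketch below does that under stated assumptions: standard base64 alphabet, CRLF after every line, and the line length as the only "encoding info" kept. The function name is hypothetical. Bodies with irregular wrapping, bare LF line endings, or non-zero padding bits fail the check and would have to be stored as-is.

```c
#include <stdlib.h>
#include <string.h>

static const char b64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Value of one base64 character, or -1 if it is not in the alphabet. */
static int b64_value(char c)
{
    const char *p = c != '\0' ? strchr(b64, c) : NULL;

    return p != NULL ? (int)(p - b64) : -1;
}

/* Returns 1 if text (base64 with "\r\n" after every line) can be rebuilt
   exactly from its decoded bytes plus the line length, 0 otherwise. */
int base64_is_reproducible(const char *text)
{
    size_t len = strlen(text), line_len, i, n = 0, o = 0, col = 0;
    const char *eol = strstr(text, "\r\n");
    unsigned int acc = 0, bits = 0;
    unsigned char *raw;
    char *again;
    int ok;

    if (eol == NULL || eol == text)
        return 0;
    line_len = (size_t)(eol - text);
    raw = malloc(len);              /* decoded data is shorter than text */
    again = malloc(3 * len + 16);   /* generous bound for re-encoded text */
    if (raw == NULL || again == NULL) {
        free(raw);
        free(again);
        return 0;
    }
    /* Decode, skipping line breaks and stopping at the first '='. */
    for (i = 0; i < len && text[i] != '='; i++) {
        int v = b64_value(text[i]);

        if (text[i] == '\r' || text[i] == '\n')
            continue;
        if (v < 0) {
            free(raw);
            free(again);
            return 0;
        }
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            raw[n++] = (unsigned char)(acc >> bits);
        }
    }
    /* Re-encode with padding and a CRLF after every line_len characters. */
    for (i = 0; i < n; i += 3) {
        unsigned int b0 = raw[i];
        unsigned int b1 = i + 1 < n ? raw[i + 1] : 0;
        unsigned int b2 = i + 2 < n ? raw[i + 2] : 0;
        char quad[4];
        int k;

        quad[0] = b64[b0 >> 2];
        quad[1] = b64[((b0 & 3) << 4) | (b1 >> 4)];
        quad[2] = i + 1 < n ? b64[((b1 & 15) << 2) | (b2 >> 6)] : '=';
        quad[3] = i + 2 < n ? b64[b2 & 63] : '=';
        for (k = 0; k < 4; k++) {
            again[o++] = quad[k];
            if (++col == line_len) {
                again[o++] = '\r';
                again[o++] = '\n';
                col = 0;
            }
        }
    }
    if (col != 0) {
        again[o++] = '\r';
        again[o++] = '\n';
    }
    again[o] = '\0';
    ok = strcmp(text, again) == 0;
    free(raw);
    free(again);
    return ok;
}
```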
I was thinking things like: upper vs. lowercase characters, different line wrapping lengths, possibly some other weird stuff.. I'd think that all digital signatures break if any of those change? Or do they really parse the headers and calculate the signatures using the decoded base64?
Yes, you will have to perfectly preserve whatever is inside the base64 or q-p, but that's a different matter from needing to preserve the base64 or q-p itself. base64 and q-p are just schemes for safely transporting the message since there is some mild danger of losing the 8th bit.
These days, standardized digital signature schemes take into account
legal transformations that can happen during message transmission. Most
of them have a canonicalization formula so that things still work.
However, in early days, various schemes didn't take that into account.
Luckily, MTAs typically didn't rearrange anything even if they were
legally allowed to.
So, I think you should regard all MIME parts as binary (after decoding any base64, q-p, or whatever). If some of them happen to contain plain text, so what? Just perfectly preserve every bit, possibly with lossless compression for storage, and everything should work. (Because the SMTP spec has the ridiculous requirement that mail be transmitted with CRLF line endings, some mail systems do line-ending conversion to the local convention. That's a nightmare; best to avoid it and just store everything as binary.)
Another issue is that the MIME structure (MIME part sizes, offsets) must match what got saved into dovecot's cache file, but that could be fixed with some extra code.
Right. I assumed that that area of code would need a lot of touching anyhow. If you take my advice and basically discard the base64/q-p encoding, you also can't depend on the MIME boundary being unambiguous any more. But if you're reassembling things on the fly from an SIS store, you can generate new MIME boundaries if you need them. All that stuff is just wrapping paper. (Of course, you should check the MIME specs to see what you can officially do, but I'm pretty sure most of the things that are interesting to do were anticipated. Even if not, the MIME specs only cover message transmission. You can do whatever you want in your local store.)
On Aug 14, 2009, at 8:39 PM, WJCarpenter wrote:
These days, standardized digital signature schemes take into
account legal transformations that can happen during message
transmission. Most of them have a canonicalization formula so that
things still work. However, in early days, various schemes didn't
take that into account. Luckily, MTAs typically didn't rearrange
anything even if they were legally allowed to.
Are you sure that really works with e.g. PGP signatures? A quick look
at RFC 3156 seems to say that the data inside multipart/signed really
shouldn't be touched in any way.
participants (6)
- Charles Marcus
- Daniel L. Miller
- Eric Jon Rostetter
- Jason Fesler
- Timo Sirainen
- WJCarpenter