[Dovecot] (Single instance) attachment storage

Wed Aug 25 01:16:54 EEST 2010

  On 24/08/2010 16:48, Timo Sirainen wrote:
>
> Current implementation checks how many hard links are left for the hash
> while deleting it. If it's deleting the last reference then the final
> hashes/hash file is also deleted.

I sense an interesting race possibility here?
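
To make the worry concrete, here is a rough Python sketch of how I read the delete path described above. This is not Dovecot's actual code; `drop_reference`, `link_path` and `hash_path` are names I made up for illustration, and I am assuming the per-message file is a hard link to a shared hashes/hash file.

```python
import os

def drop_reference(link_path, hash_path):
    """Delete one message's hard link to an attachment.

    Returns True if the shared hashes/<hash> file was removed as well.
    Hypothetical helper for illustration only.
    """
    os.unlink(link_path)
    # After the unlink above, a link count of 1 means only the hashes/
    # entry itself is left, i.e. no message references the attachment.
    if os.stat(hash_path).st_nlink == 1:
        # RACE WINDOW: between the stat() and the unlink() below, another
        # process saving the same attachment may see that hash_path exists
        # and try to link to it. Depending on the ordering, its link()
        # either fails because the file has just gone, or succeeds and
        # leaves a copy that is no longer reachable via hashes/.
        os.unlink(hash_path)
        return True
    return False
```

Whether that window matters presumably depends on what the saving side does when its link() fails, or on some locking around the pair of calls.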

> The hash is already a full hash of the message. I don't really like the
> idea of trusting that a hash is unique.

If SHA-1 becomes breakable in sensible time then you have a whole host 
of other attack vectors right now.  I believe your mercurial repo is 
using SHA-1 hashes to detect tampering for example?  (Also SSL, TLS, 
PGP, SSH and a bunch of other rarely used applications...)

At the moment SHA-256 is considered "good enough for the US 
government".  SHA-3 should be out in a couple of years.

>   Especially because this could be
> attacked against. Someone could read another user's attachment if they
> only knew its hash and then were able to create another file with the
> same hash and send it to themselves in the same system.

I can't argue that unknown security issues won't be found, because you 
can only talk about the known ones by definition...

That said I don't see that you can ever solve the de-duplicating problem 
if you don't trust your hash algorithm?  At some point you are going to 
bite the bullet and say that attachment A and B have the same hash so 
let's hard link them together?  At that point you are vulnerable to 
someone pulling off some way to disrupt your system if they can figure 
out how to generate attachments with arbitrary hashes?
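
The "bite the bullet" step can be sketched roughly as below. Again this is only an illustration in Python, not how Dovecot does it: `store_attachment`, `dest_path` and `hashes_dir` are hypothetical names, SHA-1 is used only because it is the hash under discussion, and the sketch ignores atomicity and error handling.

```python
import hashlib
import os

def store_attachment(data, dest_path, hashes_dir):
    """Save data at dest_path, sharing storage with an identical earlier copy.

    Returns True if an existing hashes/<hash> file was reused.
    Hypothetical helper for illustration only.
    """
    digest = hashlib.sha1(data).hexdigest()
    hash_path = os.path.join(hashes_dir, digest)
    reused = os.path.exists(hash_path)
    if not reused:
        # First time this content is seen: create the shared copy.
        with open(hash_path, "wb") as f:
            f.write(data)
    # This is the point where the hash is trusted: an equal digest is
    # treated as equal content, and the new file is just another link.
    os.link(hash_path, dest_path)
    return reused
```

Anyone who can produce a second input with the same digest gets linked to the first file here, which is exactly the exposure being discussed.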

At the moment I would claim that you are just automatically generating a 
very complicated filename.  If you never trust your hash then you might 
as well instead simply use one of the existing GUID algorithms, if you 
trust your hash then you use that.  I don't see the point of a 
halfway house?

> I might make Dovecot trust the hash optionally anyway, but not
> unconditionally.

I don't really see how you can get around trusting the hashes at some 
point if you are de-duping?

SHA-1 will become breakable at some point for certain.  I don't think 
that makes trusting SHA-1 hashes useless though.  Various programming 
techniques can still be used to push out the life of this technique 
quite a bit further.  For example:

- Compute a relatively cheap secondary hash, e.g. even CRC32.  Causing a 
collision in two hashes at once is likely to be more difficult than in a 
single hash
- Check attachment length.  Likely this will make it harder to generate 
a collision
- You already commented that it's reasonably hard to access the hash in 
the first place (caveat idiots like me...)
- Use SHA-2 or some other hash; as of right now there are no known 
practical attacks against SHA-2, so it likely has a few years of life..?
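
Putting the first, second and last points together, the lookup key might be built something like this. It is a minimal Python sketch under my own assumptions: `attachment_key` and the "hash-crc-length" layout are invented for illustration and are not an existing Dovecot format.

```python
import hashlib
import zlib

def attachment_key(data):
    """Build a de-dup key from SHA-256, CRC32 and the attachment length.

    Hypothetical key layout for illustration: <sha256 hex>-<crc32 hex>-<length>
    """
    sha = hashlib.sha256(data).hexdigest()
    # Mask so the CRC is always an unsigned 32-bit value.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return "%s-%08x-%d" % (sha, crc, len(data))
```

Two attachments would then only be linked together if all three components match, so a forged file has to hit the primary hash, the secondary checksum and the length at the same time.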

Just a thought?

Cheers

Ed W