On 24.8.2010, at 23.16, Ed W wrote:
On 24/08/2010 16:48, Timo Sirainen wrote:
The current implementation checks how many hard links are left for the hash when deleting it. If it's deleting the last reference, then the final hashes/hash file is also deleted.
I sense an interesting race possibility here?
Yes, but it doesn't matter much. The worst that can happen is that the file gets duplicated about once (because hashes/file gets deleted too early).
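For illustration, the link-count check on delete could be sketched roughly as below. This is not Dovecot's actual code: the function name `delete_attachment`, its parameters, and the assumed layout (every stored copy is a hard link to a file under hashes/) are hypothetical.

```python
import os

def delete_attachment(instance_path, hash_path):
    """Remove one reference; also drop hash_path if this was the last one."""
    # Link count before unlinking. Assumed layout: every stored copy is a
    # hard link to hash_path, so a count of 2 means only this instance and
    # the hashes/ entry are left.
    nlink = os.stat(instance_path).st_nlink
    os.unlink(instance_path)
    if nlink <= 2:
        # Race window: another process may have linked to hash_path after
        # the stat() above. Then the hashes/ entry goes away too early and
        # the next identical attachment is stored as a fresh copy.
        try:
            os.unlink(hash_path)
        except FileNotFoundError:
            pass
```

Nothing here locks the stat() and unlink() steps together, which is where the race comes from. Losing it only costs one extra copy on disk, since the data stays reachable through the remaining links.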
The hash is already a full hash of the message. I don't really like the idea of trusting that a hash is unique.
If SHA-1 becomes breakable in sensible time then you have a whole host of other attack vectors right now.
But not Dovecot itself.
I believe your Mercurial repo is using SHA-1 hashes to detect tampering, for example? (Also SSL, TLS, PGP, SSH and a bunch of other rarely used applications...)
Yeah, but those aren't Dovecot itself. :)
I can't argue that unknown security issues won't be found, because you can only talk about the known ones by definition...
That said, I don't see how you can ever solve the de-duplication problem if you don't trust your hash algorithm. At some point you are going to bite the bullet and say that attachments A and B have the same hash, so let's hard link them together. At that point, aren't you vulnerable to anyone who can figure out how to generate attachments with arbitrary hashes and use that to disrupt your system?
By default Dovecot will do byte-by-byte comparison of the data before hard linking them together. It's not trusting the hash, it's only using it to quickly find potential files for deduplication.
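A minimal sketch of that idea, not Dovecot's actual code: the hash only picks a candidate file, and the hard link is made only when the bytes really match. The function name `store_attachment`, its parameters, and the flat hashes directory are assumptions for illustration. SHA-1 is used because it is the hash mentioned in this thread.

```python
import filecmp
import hashlib
import os

def store_attachment(src_path, hashes_dir, dest_path):
    """Move src_path to dest_path, hard linking to an identical stored file
    if one exists. Returns True if the attachment was deduplicated."""
    with open(src_path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    candidate = os.path.join(hashes_dir, digest)

    if not os.path.exists(candidate):
        # First time this hash is seen: store the file and register it.
        os.replace(src_path, dest_path)
        os.link(dest_path, candidate)
        return False

    # The hash matched, but it is not trusted: compare the actual bytes.
    if filecmp.cmp(src_path, candidate, shallow=False):
        os.link(candidate, dest_path)
        os.unlink(src_path)
        return True

    # Same hash, different content: keep a separate copy, leave candidate alone.
    os.replace(src_path, dest_path)
    return False
```

With this shape, someone who manages to produce a hash collision gains nothing beyond preventing deduplication for that one file, because a mismatch in the byte comparison simply stores the attachment on its own.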
BTW, http://valerieaurora.org/review/hash.html talks about this same thing.