On Fri, 2010-08-27 at 09:34 -0700, Daniel L. Miller wrote:
On 8/24/2010 4:35 PM, Timo Sirainen wrote:
On 24.8.2010, at 23.16, Ed W wrote:
At the moment I would claim that you are just automatically generating a very complicated filename. If you never trust your hash, then you might as well simply use one of the existing GUID algorithms; if you do trust your hash, then use it. I don't see the point of a halfway house.

Oh, and this current scheme of hash-guid + hashes/hash hard linking is required in any case to keep track of reference counting. Unconditionally trusting the hash wouldn't make it any simpler. With key-value databases you'd have to figure out some other way to keep track of how many references there are to the attachment.
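To make the reference-counting point concrete, here is a rough Python sketch of how a hash-guid file hard-linked to hashes/hash lets the filesystem link count act as the reference counter. This is only an illustration under assumptions, not the actual implementation: the function names, the choice of SHA-1, and the flat directory layout are all hypothetical, and locking and race handling are left out.

```python
import hashlib
import os
import uuid


def save_attachment(root: str, data: bytes) -> str:
    """Store data as <hash>-<guid>, hard-linked to hashes/<hash>.

    Hypothetical layout for illustration; no locking or race handling.
    """
    digest = hashlib.sha1(data).hexdigest()
    hashes_dir = os.path.join(root, "hashes")
    os.makedirs(hashes_dir, exist_ok=True)

    canonical = os.path.join(hashes_dir, digest)
    ref_path = os.path.join(root, f"{digest}-{uuid.uuid4().hex}")

    if os.path.exists(canonical):
        # Same hash already stored: add another name for the same inode.
        os.link(canonical, ref_path)
    else:
        # First copy: write the data, then publish it under hashes/<hash>.
        with open(ref_path, "wb") as f:
            f.write(data)
        os.link(ref_path, canonical)
    return ref_path


def reference_count(root: str, digest: str) -> int:
    """Number of hash-guid names sharing the inode of hashes/<hash>."""
    canonical = os.path.join(root, "hashes", digest)
    # One link belongs to hashes/<hash> itself, so subtract it.
    return os.stat(canonical).st_nlink - 1


def delete_attachment(root: str, ref_path: str) -> None:
    """Drop one reference; remove hashes/<hash> when it is the last link."""
    digest = os.path.basename(ref_path).split("-", 1)[0]
    os.unlink(ref_path)
    canonical = os.path.join(root, "hashes", digest)
    if os.stat(canonical).st_nlink == 1:
        os.unlink(canonical)
```

With this layout nothing extra has to be stored to know how many messages point at an attachment, which is the bookkeeping a key-value database would have to provide some other way.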
Can you append some "trivial" information from the data file to the hash in generating the file name to help ensure uniqueness? Like filesize,
I guess size could be there, at least optionally. I'm not sure about making it the default.
mimetype,
I think different clients could use different MIME types sometimes, causing unnecessary duplicates.
and/or date?
I don't think attachments ever have dates? But even if they did, that would again cause unnecessary duplicates.
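As a small illustration of the trade-off discussed above, the following Python sketch builds a name from the content hash plus optional extra fields. The function name, the parameters, and the use of SHA-1 are assumptions made up for this example, not anything from the real code.

```python
import hashlib
from typing import Optional


def attachment_name(data: bytes,
                    include_size: bool = False,
                    mime_type: Optional[str] = None) -> str:
    """Build a hypothetical attachment name from the hash plus optional fields."""
    parts = [hashlib.sha1(data).hexdigest()]
    if include_size:
        # Size is derived from the content, so identical data still matches.
        parts.append(str(len(data)))
    if mime_type is not None:
        # The MIME type is chosen by the sending client, not by the content.
        parts.append(mime_type.replace("/", "_"))
    return "-".join(parts)


data = b"identical attachment bytes"

# Size never splits identical content into different names.
same = attachment_name(data, include_size=True) == attachment_name(data, include_size=True)

# Two clients labelling the same bytes differently produce two names,
# so the attachment would be stored twice.
split = (attachment_name(data, mime_type="application/pdf")
         != attachment_name(data, mime_type="application/octet-stream"))
```

Size only adds a cheap extra check on top of the hash, while any field taken from the message headers rather than from the data itself can defeat the deduplication.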