On January 22, 2010 11:44:07 PM +0200 Timo Sirainen tss@iki.fi wrote:
On 22.1.2010, at 23.39, Frank Cusack wrote:
On January 22, 2010 11:21:09 PM +0200 Timo Sirainen tss@iki.fi wrote:
Or will there be a global index?
Yes. That's what dbox SIS is about. You have a global repository of (large) MIME parts, indexed by their SHA1 sum (or something).
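A rough sketch of that idea, for illustration only: this is not dbox's actual code or on-disk format. The `PartStore` class and its methods are hypothetical names, and it keeps everything in memory. It only shows a store keyed by the SHA1 of the part's content, so an identical part is kept once however many messages refer to it.

```python
import hashlib


class PartStore:
    """Hypothetical single-instance store: one copy per unique MIME part."""

    def __init__(self):
        self.parts = {}     # SHA1 hex digest -> part bytes
        self.refcount = {}  # SHA1 hex digest -> number of references

    def put(self, data: bytes) -> str:
        """Store a part (if not already present) and return its key."""
        key = hashlib.sha1(data).hexdigest()
        if key not in self.parts:
            self.parts[key] = data
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key

    def get(self, key: str) -> bytes:
        """Fetch a part by its SHA1 key."""
        return self.parts[key]

    def release(self, key: str) -> None:
        """Drop one reference; delete the part when nobody uses it."""
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            del self.parts[key]
            del self.refcount[key]
```

Two messages carrying the same attachment would call `put()` with the same bytes, get the same key back, and share the single stored copy.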
In the case of zfs then, the filesystem may as well do the dedup'ing.
Or "dbox may as well do the deduping"? :) I guess it comes down to whose algorithm is fastest.
Yeah, I just meant that if dbox has a global hash list then either method should have similar overhead. zfs checksums every single block written anyway (regardless of dedup), so I think zfs dedup would be faster than dbox doing it.
Of course dbox can be used on systems without zfs.
I would suggest that using zfs would give you more portability (mail files appear "normal" and can be copied or manipulated however you care to), but normal mail files do not separate the headers from the message parts, so that argument isn't valid.
-frank