sis deduplication broken from 2.2.16 upwards

Harald Leithner leithner at itronic.at
Fri Mar 11 10:17:34 UTC 2016


On 11.03.2016 at 01:56, Timo Sirainen wrote:
>
>> On 11 Mar 2016, at 02:37, Charles Marcus <CMarcus at Media-Brokers.com> wrote:
>>
>> On 3/9/2016 9:02 PM, Timo Sirainen <tss at iki.fi> wrote:
>>> On 08 Mar 2016, at 01:50, Pavel Stano <stanojr at websupport.sk> wrote:
>>>>
>>>> SIS attachment deduplication is broken in 2.2.16 and later.
>>>> It is caused by this commit:
>>>> https://github.com/dovecot/core/commit/664bf3e236c214aee86294483c379e4fa66c2e63
>>>>
>>>> In src/lib-fs/fs-sis.c, the function fs_sis_try_link() compares the
>>>> inodes of the hash files.
>>>> After that commit, fs_stat() uses fstat() on the open fd of the temporary
>>>> file instead of stat() on the filename, but that temporary file has a
>>>> different inode.
>>>>
>>>> It does not cause any corruption, but it does not save any space either,
>>>> because every duplicate attachment ends up in a separate file.
>>> Thanks, fixed: https://github.com/dovecot/core/commit/3b39022ea0513363241cf852b7d454c841584ea1
>>
>> So, after the fix is applied, does dovecot silently delete the
>> duplicated files, or is there a command that needs to be run manually?
>
> You'd have to do it manually in some way. A script that does something like:
>
> Go through all attachment directories and for each file:
>   - Sort files by filename
>   - Identify that files A and B are the same (their filenames begin with the same hash), but have different inodes
>   - ln A B.tmp && mv B.tmp B
>
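
The relinking steps quoted above could be sketched in Python roughly as follows. This is only an illustration, not something Dovecot ships: the function names hash_prefix and relink_duplicates are hypothetical, and it assumes the hash is the part of the filename before the first "-". It also compares file contents before linking, which is an extra safety step not in the description above, and it defaults to a dry run.

```python
import filecmp
import os
from collections import defaultdict


def hash_prefix(name):
    # ASSUMPTION: the content hash is everything before the first "-".
    # Adjust this to match the real attachment filename layout.
    return name.split("-", 1)[0]


def relink_duplicates(directory, dry_run=True):
    """Hard-link files in `directory` that share a hash prefix.

    Returns the paths that were (or, in a dry run, would be) relinked.
    """
    groups = defaultdict(list)
    for name in sorted(os.listdir(directory)):  # sort files by filename
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not os.path.islink(path):
            groups[hash_prefix(name)].append(path)

    relinked = []
    for paths in groups.values():
        keep = paths[0]  # file A: the copy every duplicate is linked to
        for dup in paths[1:]:  # file B candidates
            if os.stat(keep).st_ino == os.stat(dup).st_ino:
                continue  # already the same inode, nothing to do
            if not filecmp.cmp(keep, dup, shallow=False):
                continue  # same prefix but different bytes: leave alone
            if not dry_run:
                tmp = dup + ".tmp"
                os.link(keep, tmp)    # ln A B.tmp
                os.replace(tmp, dup)  # mv B.tmp B (atomic rename)
            relinked.append(dup)
    return relinked
```

One would run it per attachment directory, first with dry_run=True to review the list, then with dry_run=False. The link-then-rename order means B is never missing, even if the script is interrupted.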

This is how it works in sis-queue, correct?

Wouldn't it be nice to adapt doveadm sis deduplicate to handle this?

regards

-- 
Harald Leithner

ITronic
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner at itronic.at | itronic.at

