On Wed, 2011-12-07 at 17:02 +0100, Yann Dupont wrote:
before doing rm -rf for the user's mails. And in the archiving step you should do it with dsync with mail_attachment_dir disabled in the destination storage, so that the attachments get written to the archive directly instead of only referencing SIS.
Yes, I understand, it will work. But in case of any error (even our own fault: premature end of script, for example) you can still end up with attachments lost forever on the filesystem.
Right, it SHOULD not happen, and it probably won't represent a big volume. But still, it could happen under specific circumstances. In that case, I don't see any simple way to detect that kind of file.
Do you see how a script could detect such orphaned links?
It wouldn't be simple. The only safe way would be to:
1. Scan through all the attachment HASH-GUID names and save them. This scanning step could already detect some orphaned attachments, where the hashes/HASH file exists with nlink=1 (i.e. the HASH-GUID* files have been deleted, but the HASH itself hasn't been for some reason).
2. Read through the contents of every user's dboxes and get a list of all referenced attachment HASH-GUIDs.
3. Delete all attachments that exist in list 1, but not in list 2.
I guess there should be a "doveadm sis rescan" command that does this.
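
Until something like that exists, steps 1 and 3 could be roughly sketched in Python as below. This is only an illustration, not how Dovecot does it: it assumes the HASH-GUID files sit somewhere under mail_attachment_dir and that the HASH files live in subdirectories named "hashes". The function names are made up, and the set of referenced HASH-GUIDs (step 2) has to come from your own dbox scan, which is not shown. It only reports candidates and deletes nothing.

```python
import os


def scan_attachment_dir(root):
    """Step 1: walk the attachment storage (assumed layout, see above).

    Returns (on_disk, lone_hashes):
      on_disk      maps each HASH-GUID file name to its full path
      lone_hashes  lists hashes/HASH files whose link count is 1,
                   i.e. no HASH-GUID file shares their inode any more
    """
    on_disk = {}
    lone_hashes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        in_hashes_dir = os.path.basename(dirpath) == "hashes"
        for name in filenames:
            path = os.path.join(dirpath, name)
            if in_hashes_dir:
                # A HASH file with nlink=1 has lost all of its HASH-GUID links.
                if os.lstat(path).st_nlink == 1:
                    lone_hashes.append(path)
            elif "-" in name:
                # Anything else containing "-" is treated as a HASH-GUID file.
                on_disk[name] = path
    return on_disk, lone_hashes


def find_orphans(on_disk, referenced):
    """Step 3: paths present on disk (list 1) but not referenced (list 2).

    'referenced' is a set of HASH-GUID names collected from all dboxes;
    producing it is step 2 and is left to a separate, hypothetical scan.
    """
    return sorted(path for name, path in on_disk.items()
                  if name not in referenced)
```

You would call scan_attachment_dir() on the attachment directory, build the referenced set from the dboxes, and review the output of find_orphans() by hand before removing anything. Mail delivered between the two scans would show up as a false orphan, so this should only be run while deliveries are stopped, or anything newer than the start of the scan should be ignored.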