SIS and tracing the origin of an attachment
doug
cincodemayo_67 at yahoo.com
Wed Mar 16 14:16:57 UTC 2022
On 3/16/2022 6:05 AM, Patrick Cernko wrote:
> Hi all,
>
> On 15.03.22 22:40, doug wrote:
>>
>>
>> On 3/15/2022 3:45 PM, Oscar del Rio wrote:
>>> On 2022-03-15 9:02 a.m., doug wrote:
>>>> On 3/8/2022 5:51 PM, doug wrote:
>>>>>
>>>>> I'm trying to trace an attachment within an SIS subdirectory to
>>>>> the email message(s) that link to it. I say messages because I'm
>>>>> also using dovecot dedup. My understanding is the linked file name
>>>>> is the hash value of the attachment's contents concatenated with
>>>>> the GUID of the email message. I have had marginal success with a
>>>>> message I created myself.
>>>>>
>>>>> Example: I generated an email with two attachments. Here are the
>>>>> links in my attachment directory.
>>>>> ./26/c5/26c5c540d41779d83d2f5388041d05c67d720d9a-73eca8051acd276272310000f2bc99a3
>>>>>
>>>>> ./65/cd/65cd73112a489ef07f17ed5740aa60358e2dd3fb-74eca8051acd276272310000f2bc99a3
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> I keep experimenting with this and I still haven't found a reliable
>>>> way to track an attachment back to its original message so I can
>>>> either notify the user or delete the message with doveadm. Is this
>>>> not possible? I'm using mdbox if that matters. I see a similar
>>>> thread going right now about virus scanning and deleting messages
>>>> but that is maildir and I suspect not using SIS for attachments.
>>>
>>> The very few times I've needed to trace a SIS attachment to a
>>> mailbox, I just grep the "storage" folders for the file hash
>>>
>>> find username/storage -type f -exec grep 9ffa4b246589f8039d123ea909f1520e791bd880 {} +
>>> username/storage/m.46588:X908 2409141 B72
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40
>>>
>>> username/storage/m.46589:X1918 2409141 B72
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40
>>>
>>>
>>> username/storage/m.46588:
>>> BSent
>>> X908 2409141 B72
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40
>>>
>>>
>>> username/storage/m.46589:
>>> BINBOX
>>> X1918 2409141 B72
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40
>>>
>>>
>>> -> Attachment in username's INBOX and Sent folders.
>>>
>>
>> Thank you for the suggestion, Oscar. My mdbox files are encrypted and
>> compressed, so unfortunately directly grepping them will not work.
>>
>>
>
> You can use "doveadm dump" to decompress the files for grepping them,
> not sure about encryption:
>
> find path/to/userhomes/mdbox/storage -name 'm.*' | \
> while IFS= read -r f; do
>     # keep only the metadata lines, then look for the attachment link
>     doveadm dump "$f" | \
>     grep -E '^msg\.(ext-ref|orig-mailbox|guid)' | \
>     grep -B2 xx/yy/hash-guid || continue
>     echo "Match in $f"
> done
>
> The dump also contains several other fields you might want to display.
>
> Best,
I'll give that a try. With access to the encryption key, doveadm dump
should handle it just fine. I had hoped there was a method using the
search and index files to minimize overhead.
To summarize what I think I have learned on this journey: the link to
the hash file exists only within the contents of the email body, and not
in a form that doveadm search will find. Hence a raw scan of the
contents of the emails is required.
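
For the archive, here is a minimal sketch of how that raw scan could report the message GUID and mailbox directly instead of just the storage file name. The function name sis_match is hypothetical. It assumes the dump prints "msg.guid = ...", "msg.orig-mailbox = ..." and "msg.ext-ref = ..." lines, as the grep pattern quoted above implies, and that the guid line comes before the ext-ref line of the same message, which the grep -B2 approach also relies on. I have not checked this against every dovecot version.

```shell
#!/bin/sh
# sis_match (hypothetical helper): reads `doveadm dump` output for one mdbox
# storage file on stdin and prints "guid<TAB>mailbox" for every message whose
# msg.ext-ref line mentions the given attachment hash.
# Assumption: the field names and their order (guid, orig-mailbox, ext-ref)
# are as implied by the grep pattern earlier in this thread.
sis_match() {
    awk -v hash="$1" '
        # a new guid line starts a new message record
        /^msg\.guid = /         { guid = substr($0, 12); box = "" }
        /^msg\.orig-mailbox = / { box = substr($0, 20) }
        # report the record when its attachment reference contains the hash
        /^msg\.ext-ref = / && index($0, hash) { print guid "\t" box }
    '
}

# Possible usage, with the same placeholder path as above:
#   find path/to/userhomes/mdbox/storage -name 'm.*' |
#   while IFS= read -r f; do
#       doveadm dump "$f" | sis_match 9ffa4b246589f8039d123ea909f1520e791bd880
#   done
```

This still reads every storage file, so it does not reduce the overhead, but it gives one line per affected message.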
Many thanks for everyone's help.
--
Doug