SIS and tracing the origin of an attachment

doug cincodemayo_67 at yahoo.com
Wed Mar 16 14:16:57 UTC 2022


On 3/16/2022 6:05 AM, Patrick Cernko wrote:
> Hi all,
>
> On 15.03.22 22:40, doug wrote:
>>
>>
>> On 3/15/2022 3:45 PM, Oscar del Rio wrote:
>>> On 2022-03-15 9:02 a.m., doug wrote:
>>>> On 3/8/2022 5:51 PM, doug wrote:
>>>>>
>>>>> I'm trying to trace an attachment within an SIS subdirectory to 
>>>>> the email message(s) that link to it. I say messages because I'm 
>>>>> also using dovecot dedup. My understanding is that the linked file 
>>>>> name is the hash value of the attachment's contents concatenated 
>>>>> with the GUID of the email message. I have had marginal success 
>>>>> with a message I created myself.
>>>>>
>>>>> Example: I generated an email with two attachments. Here are the 
>>>>> links in my attachment directory.
>>>>> ./26/c5/26c5c540d41779d83d2f5388041d05c67d720d9a-73eca8051acd276272310000f2bc99a3 
>>>>>
>>>>> ./65/cd/65cd73112a489ef07f17ed5740aa60358e2dd3fb-74eca8051acd276272310000f2bc99a3 
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> I keep experimenting with this and I still haven't found a reliable 
>>>> way to track an attachment back to its original message so I can 
>>>> either notify the user or delete the message with doveadm. Is this 
>>>> not possible? I'm using mdbox if that matters. I see a similar 
>>>> thread going right now about virus scanning and deleting messages 
>>>> but that is maildir and I suspect not using SIS for attachments.
>>>
>>> The very few times I've needed to trace a SIS attachment to a 
>>> mailbox, I just grep the "storage" folders for the file hash
>>>
>>> find username/storage -type f -exec grep 9ffa4b246589f8039d123ea909f1520e791bd880 {} +
>>> username/storage/m.46588:X908 2409141 B72 
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40 
>>>
>>> username/storage/m.46589:X1918 2409141 B72 
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40 
>>>
>>>
>>> username/storage/m.46588:
>>> BSent
>>> X908 2409141 B72 
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40 
>>>
>>>
>>> username/storage/m.46589:
>>> BINBOX
>>> X1918 2409141 B72 
>>> 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40 
>>>
>>>
>>> -> Attachment in username's INBOX and Sent folders.
>>>
>>
>> Thank you for the suggestion Oscar. My mdbox files are encrypted and 
>> compressed, so unfortunately directly grepping them will not work.
>>
>>
>
> You can use "doveadm dump" to decompress the files for grepping them, 
> not sure about encryption:
>
> find path/to/userhomes/mdbox/storage -name 'm.*' | \
>   while IFS= read -r f; do
>     doveadm dump "$f" | \
>       grep -E '^msg\.(ext-ref|orig-mailbox|guid)' | \
>       grep -B2 xx/yy/hash-guid || continue
>     echo "Match in $f"
>   done
>
> The dump also contains several other fields you might want to display.
>
> Best,

I'll give that a try. With access to the encryption key doveadm dump 
should handle it just fine. I was hopeful there was a method using 
search and index files to minimize overhead.

To summarize what I think I have learned on this journey: the link to 
the hash file exists only within the contents of the email body, and not 
in a form that doveadm search will find. Hence raw scanning of the 
contents of the emails is required.
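
For anyone finding this thread later, here is a rough sketch of how the raw 
scan could be wrapped up, based on Patrick's pipeline above. The function 
names (sis_hash, match_refs, scan_storage) are made up for illustration, and 
it assumes "doveadm dump" prints the msg.guid, msg.orig-mailbox and 
msg.ext-ref lines that his grep pattern expects. The -B2 context is carried 
over from his version and relies on those lines appearing just before the 
ext-ref line, so treat the output as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helpers; names and layout are assumptions, not part of dovecot.

# Print the hash part of an SIS link name: the base name up to the first '-'.
sis_hash() {
    base=${1##*/}
    printf '%s\n' "${base%%-*}"
}

# Read "doveadm dump"-style text on stdin and print the ext-ref line that
# mentions the given hash, plus up to two preceding guid/orig-mailbox lines.
# Returns non-zero when nothing matches.
match_refs() {
    grep -E '^msg\.(ext-ref|orig-mailbox|guid)' | grep -B2 -F -- "$1"
}

# Dump every m.* file under a storage directory and report the ones that
# reference the hash. Encrypted files need doveadm to have access to the key.
scan_storage() {
    find "$1" -type f -name 'm.*' | while IFS= read -r f; do
        if doveadm dump "$f" | match_refs "$2"; then
            echo "Match in $f"
        fi
    done
}
```

Something like 
scan_storage path/to/userhomes/mdbox/storage "$(sis_hash ./26/c5/26c5c540d41779d83d2f5388041d05c67d720d9a-73eca8051acd276272310000f2bc99a3)" 
would then list the storage files that reference that attachment. It still 
dumps every file, so it does nothing to reduce the overhead.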

Many thanks for everyone's help.

--
Doug



