SIS and tracing the origin of an attachment
Hi All,
I'm trying to trace an attachment within an SIS subdirectory to the email message(s) that link to it. I say messages because I'm also using dovecot dedup. My understanding is that the linked file name is the hash value of the attachment's contents concatenated with the GUID of the email message. I have had marginal success with a message I created myself.
Example: I generated an email with two attachments. Here are the links in my attachment directory:
./26/c5/26c5c540d41779d83d2f5388041d05c67d720d9a-73eca8051acd276272310000f2bc99a3
./65/cd/65cd73112a489ef07f17ed5740aa60358e2dd3fb-74eca8051acd276272310000f2bc99a3
In my Sent folder the actual GUID of the message is 75eca8051acd276272310000f2bc99a3. So the GUID in the attachment link name is based on the GUID of the message, but is not identical to it. The first hex byte (73 and 74 for the two attachments, 75 for the message) seems to be decremented by an offset that depends on the attachment index. At least in my one example.
# doveadm dump /mailstore/doug/mail/mailboxes/Sent/dbox-Mails/dovecot.index | grep guid | tail -1
- guid: 75eca8051acd276272310000f2bc99a3
With that actual GUID I can find the message with a search:
# doveadm search -u doug mailbox Sent guid 75eca8051acd276272310000f2bc99a3
doug e5711f1cf2c9294f71090000059b96e4 53526
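The arithmetic behind that observation can be checked with a small shell sketch. first_byte_offset is a hypothetical helper name, not a Dovecot command: it strips the content hash from a link name and compares the first byte of the remaining GUID with the first byte of the message GUID. It only restates the single example above and does not show that the relationship holds in general.

```shell
# first_byte_offset MSG_GUID LINK_NAME
# Prints (first byte of MSG_GUID) minus (first byte of the GUID suffix in LINK_NAME).
# Hypothetical helper for illustration; it only does string and hex arithmetic.
first_byte_offset() {
    msg_guid=$1
    link=${2##*/}                             # drop the directory part (./26/c5/)
    att_guid=${link#*-}                       # text after the content hash
    msg_hex=${msg_guid%"${msg_guid#??}"}      # first two hex digits of the message GUID
    att_hex=${att_guid%"${att_guid#??}"}      # first two hex digits of the link GUID
    echo $(( 0x$msg_hex - 0x$att_hex ))
}

msg=75eca8051acd276272310000f2bc99a3
first_byte_offset "$msg" ./26/c5/26c5c540d41779d83d2f5388041d05c67d720d9a-73eca8051acd276272310000f2bc99a3   # prints 2
first_byte_offset "$msg" ./65/cd/65cd73112a489ef07f17ed5740aa60358e2dd3fb-74eca8051acd276272310000f2bc99a3   # prints 1
```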
Now let's try to track down another email when only the HASH-GUID value is known. Here is one randomly picked.
./00/a2/00a2d5de3e41053d59bd10084826bbe094aa1c59-57857b09d1a327627e260000f2bc99a3
# doveadm search -A mailbox '*' guid 57857b09d1a327627e260000f2bc99a3
# doveadm search -A mailbox '*' guid 58857b09d1a327627e260000f2bc99a3
# doveadm search -A mailbox '*' guid 59857b09d1a327627e260000f2bc99a3
I repeated this incrementing and decrementing from 5085... through 5f85... and never located the message.
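The manual sweep described above could be scripted roughly as follows. candidate_guids is a hypothetical helper that varies only the second hex digit of the GUID, which is what the 5085... through 5f85... range amounts to. Since the sweep found nothing in this case, the sketch documents the attempt rather than a method known to work.

```shell
# candidate_guids GUID
# Prints 16 GUIDs: the input with its second hex digit replaced by 0..f.
# Hypothetical helper; it only manipulates the string and does not call Dovecot.
candidate_guids() {
    guid=$1
    first=${guid%"${guid#?}"}     # first hex digit, e.g. 5
    rest=${guid#??}               # everything after the first byte
    for d in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
        printf '%s%s%s\n' "$first" "$d" "$rest"
    done
}

# Searching for every candidate needs a live Dovecot installation, so the
# loop is left commented out here:
# candidate_guids 57857b09d1a327627e260000f2bc99a3 |
# while IFS= read -r g; do
#     doveadm search -A mailbox '*' guid "$g"
# done
```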
This seems like it should be trivial, but I've been struggling with it for days. The GUID isn't random; there must be a way to track the attachment back. What am I missing?
And for those wondering why: our virus scanner flagged a number of attachments, some with several links, and I want to ask the users to delete the offending messages so I can purge them from the server. If I can find the emails I can give them the mail folder, date/time, and subject of the message.
-- Doug
On 2022-03-15 9:02 a.m., doug wrote:
I keep experimenting with this and I still haven't found a reliable way to track an attachment back to its original message so I can either notify the user or delete the message with doveadm. Is this not possible? I'm using mdbox, if that matters. I see a similar thread going right now about virus scanning and deleting messages, but that is maildir and I suspect not using SIS for attachments.
-- Doug
On 3/15/2022 3:45 PM, Oscar del Rio wrote:
The very few times I've needed to trace a SIS attachment to a mailbox, I just grep the "storage" folders for the file hash
find username/storage -type f -exec grep 9ffa4b246589f8039d123ea909f1520e791bd880 {} +
username/storage/m.46588:X908 2409141 B72 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40
username/storage/m.46589:X1918 2409141 B72 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40
username/storage/m.46588: BSent X908 2409141 B72 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-c9ee303687e13062cf740012bfe47a40
username/storage/m.46589: BINBOX X1918 2409141 B72 9f/fa/9ffa4b246589f8039d123ea909f1520e791bd880-080ce71390e1306299730012bfe47a40
-> Attachment in username's INBOX and Sent folders.
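A variant of the same idea, sketched as a function that only lists the storage files containing the hash. files_referencing_hash is a hypothetical name. The -a flag (treat binary data as text) is a GNU/BSD grep extension rather than POSIX, and the whole approach assumes the mdbox files are stored uncompressed and unencrypted.

```shell
# files_referencing_hash STORAGE_DIR HASH
# Lists mdbox storage files (m.*) whose raw bytes contain HASH.
#   -a  treat binary files as text (GNU/BSD grep extension)
#   -l  print only the names of matching files
#   -F  match HASH as a fixed string, not a regular expression
files_referencing_hash() {
    find "$1" -type f -name 'm.*' -exec grep -a -l -F -- "$2" {} +
}

# Example call, using the path and hash from the message above:
# files_referencing_hash username/storage 9ffa4b246589f8039d123ea909f1520e791bd880
```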
On 15.03.22 22:40, doug wrote:
Thank you for the suggestion, Oscar. My mdbox files are encrypted and compressed, so unfortunately grepping them directly will not work.
On 3/16/2022 6:05 AM, Patrick Cernko wrote:
Hi all,
You can use "doveadm dump" to decompress the files for grepping them, not sure about encryption:
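# Dump every mdbox storage file in decoded form and report the files whose
# metadata references the attachment (xx/yy/hash-guid is a placeholder).
find path/to/userhomes/mdbox/storage -name 'm.*' |
while IFS= read -r f; do
    doveadm dump "$f" |
    grep -E '^msg\.(ext-ref|orig-mailbox|guid)' |
    grep -B2 xx/yy/hash-guid || continue   # no match in this file: next one
    echo "Match in $f"
done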
The dump also contains several other fields you might want to display.
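As a rough alternative to grep -B2, the dump can be reduced to one line per referencing message with awk. refs_in_dump is a hypothetical helper. It assumes that the dump prints lines of the form "msg.guid = ...", "msg.orig-mailbox = ..." and "msg.ext-ref = ...", and that the first two precede the ext-ref line of the same message, which is the same ordering the grep -B2 pipeline relies on.

```shell
# refs_in_dump HASH
# Reads `doveadm dump` output on stdin and prints "<guid> <orig-mailbox>" for
# every message whose msg.ext-ref line mentions HASH.
# Assumption: lines look like "msg.<key> = <value>", with guid and
# orig-mailbox appearing before ext-ref for each message.
refs_in_dump() {
    awk -v hash="$1" '
        /^msg\.guid = /         { guid = $3; mbox = "?" }   # new message: reset mailbox
        /^msg\.orig-mailbox = / { mbox = substr($0, length("msg.orig-mailbox = ") + 1) }
        /^msg\.ext-ref = / && index($0, hash) { print guid, mbox }
    '
}
```

Inside the loop it would be called as doveadm dump "$f" | refs_in_dump followed by the attachment hash. The printed GUID is then a candidate for a doveadm search with the guid key.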
Best,
Patrick Cernko <pcernko@mpi-klsb.mpg.de> +49 681 9325 5815 Joint Administration: Information Services and Technology Max-Planck-Institute fuer Informatik & Softwaresysteme
I'll give that a try. With access to the encryption key doveadm dump should handle it just fine. I was hopeful there was a method using search and index files to minimize overhead.
To summarize what I think I have learned on this journey: the link to the hash file exists only within the contents of the email body, and not in a way that doveadm search will find it. Hence a raw scan of the email contents is required.
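Once a raw scan has produced a message GUID and its mailbox, the details to pass on to the user could be looked up along these lines. describe_message is a hypothetical wrapper around doveadm fetch; the field names (mailbox, date.received, hdr.subject) and the guid search key are standard doveadm ones, but the sketch has not been run against an encrypted mdbox setup.

```shell
# describe_message USER MAILBOX GUID
# Prints the mailbox, received date and Subject header of the message
# identified by GUID. Hypothetical wrapper; requires a working doveadm.
describe_message() {
    doveadm fetch -u "$1" 'mailbox date.received hdr.subject' mailbox "$2" guid "$3"
}

# Example, using the test message from the start of the thread:
# describe_message doug Sent 75eca8051acd276272310000f2bc99a3
```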
Many thanks for everyone's help.
-- Doug
participants (3)
- doug
- Oscar del Rio
- Patrick Cernko