Mass Stripping Attachments by Directory, Age, Size

Plutocrat plutocrat at gmail.com
Sat Mar 20 08:01:41 EET 2021


On 19/03/2021 07.31, Joseph Tam wrote:
>> I've found the strip-attachments.pl script here
>> <https://fossies.org/linux/Mail-Box/examples/strip-attachments.pl> which
>> works fine on mbox (as tested on my local Thunderbird mboxes), but not on
>> maildir which is on the dovecot server. My Perl isn't strong enough to
>> re-purpose it.
>
> If you have anything that works on mbox, it will probably work on Maildir
> as each file can be considered a single-message mbox.  You can combine
> the script with
>
>      find ~user/MailDir -type f ... -exec /path/to/mbox-strip {} \;

I thought that too, but my initial test on a single message file didn't
work like that. I think I got a zero length file. I'll dig into the code to
see if I can figure it out, although my Perl hasn't been used for 20 years
or so ...

> The ... can be replaced with more file tests (like minimum size or age or
> only within */cur/) to cut down on processing.

Sure. I'm quite handy with find, sed, awk and all that bash malarkey. I was
actually wondering if it could be done with those alone, but it would make
more sense to use a library which already understands MIME and does the
heavy lifting. The shell-only approach might be good as a last resort.
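
For illustration, here is a minimal sketch of the library-based idea in
Python, using the standard library email package instead of the Perl
Mail::Box script mentioned above. The function name strip_large_attachments
and the max_bytes threshold are hypothetical, chosen only for this example.
It takes the raw bytes of one Maildir message file, replaces every
attachment larger than the threshold with a short text stub, and returns
the rewritten message. It has not been tested against a real Dovecot
Maildir.

```python
from email import message_from_bytes, policy


def strip_large_attachments(raw: bytes, max_bytes: int) -> bytes:
    """Replace attachments bigger than max_bytes with a short text stub.

    Hypothetical helper for illustration; not the Mail::Box script.
    """
    msg = message_from_bytes(raw, policy=policy.default)

    # Take a snapshot of the parts first, since we modify them as we go.
    for part in list(msg.walk()):
        if part.is_multipart():
            continue
        if part.get_content_disposition() != "attachment":
            continue

        # Decoded size, i.e. after undoing base64 or quoted-printable.
        payload = part.get_payload(decode=True) or b""
        if len(payload) <= max_bytes:
            continue

        name = part.get_filename() or "unnamed"
        # set_content() clears the old Content-* headers and payload,
        # so the part becomes a small text/plain note.
        part.set_content(
            f"[attachment {name!r} removed, {len(payload)} bytes]\n"
        )

    return msg.as_bytes()
```

The result would still have to be written back safely (for example to a
temporary file that is then renamed over the original), and the message is
re-serialized, so headers may be re-folded. On a Dovecot Maildir the
filename can also carry an S=<size> field, which would no longer match
after rewriting; that is not handled here.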

> MIMEDefang may help.
Nice. Thanks for the pointer.

P.

On Fri, Mar 19, 2021 at 7:31 AM Joseph Tam <jtam.home at gmail.com> wrote:

> On Thu, 18 Mar 2021, Plutocrat wrote:
>
> > I've been looking around for a solution to this problem. I want to prune
> > down the attachments on a server before a migration. Some of the emails
> > are 7 years old and have 40MB attachments, so this seems like a good
> > opportunity to rationalize things. So perhaps I'd like to "Remove all
> > attachments from emails older than 2 years, in the .Sent directory", or
> > "Attachments over 10MB anywhere in the mail tree".
> >
> > I've found the strip-attachments.pl script here
> > <https://fossies.org/linux/Mail-Box/examples/strip-attachments.pl>, which
> > works fine on mbox (as tested on my local Thunderbird mboxes), but not on
> > maildir, which is on the dovecot server. My Perl isn't strong enough to
> > re-purpose it.
>
> If you have anything that works on mbox, it will probably work on Maildir
> as each file can be considered a single-message mbox.  You can combine
> the script with
>
>         find ~user/MailDir -type f ... -exec /path/to/mbox-strip {} \;
>
> The ... can be replaced with more file tests (like minimum size or age
> or only within */cur/) to cut down on processing.
>
> I wrote a gawk script to slim down a multi-GB Outlook mbox
> for a user, but it wasn't really complicated, just matching for
> /^Content-Transfer-Encoding:.*base64/i header (virtually all bulky data
> will be encoded this way), buffering the base64 data part, then outputting
> it if it was small, or deleting/replacing/extracting it otherwise.
>
> It was a one-off discarded tool but I can hunt for it if you're hard up.
>
> > I've looked at ripmime and mpack/munpack, and although they seem like
> > useful tools to do the job of deconstructing the mail into its
> > constituent parts, they don't seem to help in re-building the email. I
> > think they could be used with a bit of study into mail MIME structure,
> > and used with a helper script.
> >
> > So before I take a deep dive into scripting my own solution, I just
> > wanted to check if anyone else on the list has been through this and has
> > some resources or pointers they can share, or maybe even someone to tell
> > me "Duh, you can do it with doveadm of course".
>
> MIMEDefang may help.
>
> Joseph Tam <jtam.home at gmail.com>
>
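
The line-oriented approach Joseph describes (match the base64
Content-Transfer-Encoding header, buffer the data, keep it only if it is
small) could be sketched roughly as follows. This is a Python guess at the
idea, not his gawk script. The names slim_base64_parts, max_chars and note
are hypothetical. It assumes a single-message file, as in Maildir, and
replaces an oversized base64 body with a base64-encoded note so the part
still decodes.

```python
import base64
import re

# Same header match as described in the thread, case-insensitive.
B64_HEADER = re.compile(r"^Content-Transfer-Encoding:.*base64", re.IGNORECASE)


def slim_base64_parts(lines, max_chars, note=b"[attachment removed]"):
    """Return the message lines with oversized base64 bodies replaced."""
    out = []
    buffered = []
    state = "scan"  # scan -> headers (rest of part header) -> body

    def flush():
        # Emit the buffered body if small, otherwise a short stand-in.
        size = sum(len(line) for line in buffered)
        if size > max_chars:
            out.append(base64.b64encode(note).decode("ascii") + "\n")
        else:
            out.extend(buffered)
        buffered.clear()

    for line in lines:
        if state == "scan":
            out.append(line)
            if B64_HEADER.match(line):
                state = "headers"
        elif state == "headers":
            out.append(line)
            if line.strip() == "":
                state = "body"  # blank line ends the part headers
        else:
            # Base64 data never starts with "-", so "--" marks a boundary.
            if line.startswith("--"):
                flush()
                out.append(line)
                state = "scan"
            else:
                buffered.append(line)

    flush()  # handles a base64 body that runs to the end of the file
    return out
```

Unlike a real MIME parser, this never inspects the part's type or filename,
so any large base64 part is replaced, whether or not it is an attachment.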