Sieve to strip html from multipart messages
Is it possible (I’m sure it is, so how?) to strip the HTML portion from multipart messages that contain both HTML and text, leaving the bare text message without also stripping other parts (like images or attached files)?
Or also to take messages that are only HTML and strip the HTML and replace it with a plain text version?
I used to do this a long time ago with procmail and lynx, but it was never reliable since procmail doesn’t really understand MIME.
On 2019-06-10 20:39, @lbutlr via dovecot wrote:
> Is it possible (I’m sure it is, so how?) to strip the HTML portion from multipart messages that contain both HTML and text, leaving the bare text message without also stripping other parts (like images or attached files)?
> Or also to take messages that are only HTML and strip the HTML and replace it with a plain text version?
> I used to do this a long time ago with procmail and lynx, but it was never reliable since procmail doesn’t really understand MIME.
It should be possible by piping the mail through a script, using the sieve-extprogram plugin [1]. You would then need to find or write a script to strip out the HTML. For example if you are fluent in Perl then the Email::MIME module would be a good starting point, or if you used to do it with procmail, perhaps the script you used with it could be adapted to suit.
If you want to strip HTML from *ALL* emails, then it would probably make more sense to put the hooks into your MTA config. See this [2] Server Fault post.
1: https://wiki2.dovecot.org/Pigeonhole/Sieve/Plugins/Extprograms
2: https://serverfault.com/questions/506894/how-to-route-email-to-a-script
-- David Pottage
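As an illustration of the kind of filter script suggested above (not something posted in the thread), here is a minimal sketch in Python, using the standard library `email` package instead of Perl's Email::MIME. The function names `strip_html` and `html_to_text` are made up for this example, and the HTML-to-text conversion is deliberately crude. It assumes the script receives the message on stdin and returns the modified message on stdout.

```python
import sys
from email import message_from_bytes, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("br", "p", "div", "tr", "li"):
            self.chunks.append("\n")  # rough line breaks for block-level tags

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    """Very rough HTML-to-text conversion (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def strip_html(msg):
    """Modify msg in place: drop HTML alternatives, convert HTML-only bodies."""
    if msg.get_content_type() == "multipart/alternative":
        parts = msg.get_payload()
        # Only drop the HTML part when a plain text alternative exists.
        if any(p.get_content_type() == "text/plain" for p in parts):
            msg.set_payload(
                [p for p in parts if p.get_content_type() != "text/html"]
            )
    if msg.is_multipart():
        for part in msg.get_payload():
            strip_html(part)
    elif (msg.get_content_type() == "text/html"
          and msg.get_content_disposition() != "attachment"):
        # HTML-only body: replace it with a plain text rendering.
        msg.set_content(html_to_text(msg.get_content()))


if __name__ == "__main__":
    message = message_from_bytes(sys.stdin.buffer.read(), policy=policy.default)
    strip_html(message)
    sys.stdout.buffer.write(message.as_bytes())
```

It only drops `text/html` parts that sit directly inside a `multipart/alternative` next to a `text/plain` part; an HTML body wrapped in `multipart/related` with inline images is left alone, as are attachments. With the extprograms plugin [1], a script like this would presumably be invoked through the `filter` command, which replaces the message with the program's output; treat that wiring as an assumption to check against the plugin documentation.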
On 13 Jun 2019, at 05:57, David Pottage via dovecot <dovecot@dovecot.org> wrote:
> On 2019-06-10 20:39, @lbutlr via dovecot wrote:
>> Is it possible (I’m sure it is, so how?) to strip the HTML portion from multipart messages that contain both HTML and text, leaving the bare text message without also stripping other parts (like images or attached files)?
>> Or also to take messages that are only HTML and strip the HTML and replace it with a plain text version?
>> I used to do this a long time ago with procmail and lynx, but it was never reliable since procmail doesn’t really understand MIME.
> It should be possible by piping the mail through a script, using the sieve-extprogram plugin [1]. You would then need to find or write a script to strip out the HTML. For example if you are fluent in Perl then the Email::MIME module would be a good starting point, or if you used to do it with procmail, perhaps the script you used with it could be adapted to suit.
If it had ever been reliable, I would. I think MIMEDefang is probably the way to go now, running via an extprogram as you suggest.
> If you want to strip HTML from *ALL* emails, then it would probably make more sense to put the hooks into your MTA config. See this [2] Server Fault post.
Thank you for the info. Not sure I want to dive back into perl after a pretty happy nearly 20 years of avoidance, but maybe.
I certainly do not want to strip all HTML from all user mail. Really, I just want to remove the HTML from the few lists that, for reasons I cannot understand, allow HTML to be posted to them.
These damn whippersnappers with their white backgrounds and grey text. Why I oughta…! Get off my lawn!
😃
-- <[TN]FBMachine> I got kicked out of Barnes and Noble once for moving all the bibles into the fiction section
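For the narrower goal stated above (touching only mail from a few lists), one possible approach is to key on the `List-Id` header. The sketch below is only an illustration in Python: the `HTML_LISTS` set and its contents are placeholders, and `list_id` and `wants_stripping` are hypothetical helper names, not part of any tool mentioned in the thread.

```python
import re
from email import message_from_string, policy

# Placeholder identifiers: replace with the lists that actually allow HTML.
HTML_LISTS = {"chatty.lists.example.org"}


def list_id(msg):
    """Return the bare identifier from a List-Id header, or None if absent."""
    header = msg.get("List-Id")
    if header is None:
        return None
    # The identifier sits in angle brackets, after an optional description.
    match = re.search(r"<([^<>]+)>", str(header))
    return match.group(1).strip().lower() if match else None


def wants_stripping(msg, lists=HTML_LISTS):
    """True only for messages from one of the configured lists."""
    return list_id(msg) in lists


if __name__ == "__main__":
    sample = message_from_string(
        "List-Id: Chatty list <chatty.lists.example.org>\n"
        "Subject: hello\n\nbody\n",
        policy=policy.default,
    )
    print(wants_stripping(sample))
```

The same selection could probably be made in the Sieve script itself before the external program is called, so that only list mail ever reaches the HTML-stripping step.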