Invoking the spam checker on the sieve script

Alejandro Exojo

23 Oct 2014

6:11 p.m.

Hi.

TL/DR version:

Is it advisable to invoke a spam checker from the sieve script, and then, once the message is filtered, decide whether it should be moved to a certain spam folder?

A bit more context on why I'm considering this:

I'm just a geek who wants to handle his own mail, but I don't have much experience as a system administrator. I don't have any corporate setup, just a simple VPS with me as the only user.

Previously I fetched all my mail through POP, and in the mail client on my PC I moved (lots of) mailing list traffic to its folders. Then I ran the spam filter on the remaining messages (and only on those) and moved the spam to a spam folder. That's a significant saving, since I have many mailing list subscriptions, and now I'm even using rss2email, so I get lots of email that is not spam.

I moved to IMAP and my filters are now server-side with sieve, but I don't have spam filtering yet. I thought I could replicate the setup easily, but it seems nobody is doing that, and everyone has the mail already scored when it reaches sieve. The "extprograms" extension seems like a fit, but again, nobody seems to mention it in the documentation, so I fear I'm probably wrong.

Suggestions?

Thank you very much!

-- Alex (a.k.a. suy) | GPG ID 0x0B8B0BC2 http://barnacity.net/ | http://disperso.net

Robert Schetterer

23 Oct

6:56 p.m.

On 23.10.2014 at 17:11, Alejandro Exojo wrote:

> Is it advisable to invoke a spam checker from the sieve script, and then, once the message is filtered, decide whether it should be moved to a certain spam folder?

Why not use, e.g., spamass-milter with Postfix to flag the mail, and then use a global sieve rule to sort it into the user's IMAP Junk folder? To also download the spam mails in the IMAP Junk folder via POP3, use the Dovecot virtual plugin.

see

https://sys4.de/de/blog/2013/02/11/dovecot-virtual-setup-mit-globaler-sieve-...

Sorry, it is in German only, but the configs should speak for themselves.

I don't get what your problem is exactly, but it looks like you want to pre-sort mail by other criteria (e.g. sender) and do the spam sorting last. That should be no problem with a user sieve rule if the spam score is already in the mail (flagged by SpamAssassin beforehand).
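
As a rough illustration of that ordering, here is a minimal sieve sketch. It assumes the MTA has already run SpamAssassin and that the mail carries its usual X-Spam-Flag header; the list ID and the folder names are placeholders made up for the example, not taken from anyone's real setup.

```sieve
require ["fileinto"];

# Step 1: sort known-good mailing list traffic first.
# The list ID and the folder name are placeholders.
if header :contains "List-Id" "dovecot.dovecot.org" {
    fileinto "Lists/dovecot";
    stop;
}

# Step 2: for everything that is left, rely on the flag that
# SpamAssassin added before delivery (assumed header: X-Spam-Flag).
if header :is "X-Spam-Flag" "YES" {
    fileinto "Junk";
    stop;
}
```

Note that this only changes where the mail is filed; every message has still been scanned before it reaches sieve.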

I wouldn't use extprograms for spam scanning; I see no need for it.

Best regards, Robert Schetterer

-- [*] sys4 AG

http://sys4.de, +49 (89) 30 90 46 64 Franziskanerstraße 15, 81669 München

Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263 Vorstand: Patrick Ben Koetter, Marc Schiffbauer Aufsichtsratsvorsitzender: Florian Kirstein

Alejandro Exojo

7:19 p.m.

On Thursday 23 October 2014, Robert Schetterer wrote:

> I don't get what your problem is exactly, but it looks like you want to pre-sort mail by other criteria (e.g. sender) and do the spam sorting last. That should be no problem with a user sieve rule if the spam score is already in the mail (flagged by SpamAssassin beforehand).

The problem is that most of my mail comes from sources that are guaranteed not to be spam: mailing lists that are already filtered, or rss2email (the latter can probably be skipped easily because it is delivered locally). I only have a small VPS, so I'm trying to save resources where possible. SpamAssassin consumes quite a lot, AFAIK.

-- Alex (a.k.a. suy) | GPG ID 0x0B8B0BC2 http://barnacity.net/ | http://disperso.net

Robert Schetterer

7:39 p.m.

On 23.10.2014 at 18:19, Alejandro Exojo wrote:

> The problem is that most of my mail comes from sources that are guaranteed not to be spam: mailing lists that are already filtered, or rss2email (the latter can probably be skipped easily because it is delivered locally). I only have a small VPS, so I'm trying to save resources where possible. SpamAssassin consumes quite a lot, AFAIK.

Anyway, if you want to classify spam on your own, you need some spam-scoring software. If you know the senders, exempt them from spam scoring.

Best regards, Robert Schetterer


deano-dovecot＠areyes.com

24 Oct

3:46 p.m.

On 2014-10-23 12:19, Alejandro Exojo wrote:

> The problem is that most of my mail comes from sources that are guaranteed not to be spam: mailing lists that are already filtered, or rss2email (the latter can probably be skipped easily because it is delivered locally). I only have a small VPS, so I'm trying to save resources where possible. SpamAssassin consumes quite a lot, AFAIK.

What kind of VPS are you using? I'm in a similar boat to you, running my own domain(s) and email, and have built the mail system on a set of three VPSes: two with 6G RAM that cost $7/mo and one with 1G RAM that's $3.50/mo. The two larger ones run exim4, spamassassin, clamav, nginx, roundcube, dovecot, munin (stats), solr (search), zpush, tinyrss, percona (mysql).

It all works swimmingly well. The main setup will run in a 2G RAM VPS, albeit with some swapping. If you're on an SSD-backed VPS, it works OK; that was my old setup with Digital Ocean.

ClamAV is the memory hog, spamassassin really isn't bad, so you might give it a shot ...

24576  www-data      php /usr/share/tt-rss/www/u   0   10732   12943   17572
 3310  unbound       /usr/sbin/unbound             0   17644   17779   19084
 5298  debian-spamd  spamd chil                    0    1860   34989  101596
 5297  debian-spamd  spamd chil                    0    2156   35137  101596
 5292  root          /usr/sbin/spamd --max-child   0    3148   36869  104944
 3474  tomcat6       /usr/lib/jvm/default-java/b   0  122240  122621  124692
 5480  clamav        /usr/sbin/clamd               0  416496  416726  417804
20010  mysql         /usr/sbin/mysqld --basedir=   0  684200  684523  686692

All the mysql stuff is a 3-node replication cluster, the two main systems and a 3rd (small one) just running percona. Dovecot is also replicating between the two main systems. This way ALL the data is replicated between them, and I can hit either main system for all functionality. Replication is over tinc encrypted sessions.

-- Dean Carpenter deano is at areyes dot com 203 six oh four 6644

Alejandro Exojo

25 Oct

4:55 p.m.

On Friday 24 October 2014, deano-dovecot@areyes.com wrote:

> What kind of VPS are you using? I'm in a similar boat to you, running my own domain(s) and email, and have built the mail system on a set of three VPSes: two with 6G RAM that cost $7/mo and one with 1G RAM that's $3.50/mo. The two larger ones run exim4, spamassassin, clamav, nginx, roundcube, dovecot, munin (stats), solr (search), zpush, tinyrss, percona (mysql).

That's quite a powerful setup. :) My VPS is one of the cheapest at Hetzner: 7.9€ for 512MB of RAM. I thought of upgrading, especially because the sovereign guys (https://github.com/al3x/sovereign) claim that with 512/1024 you can run their whole setup, which is pretty powerful, and much more than I would really use, I think.

> ClamAV is the memory hog, spamassassin really isn't bad, so you might give it a shot ...

I think that running some simple spam filtering would be enough for me, so maybe I'll try to hard-code a few rules so that it at least filters something, even if not much.

Well, thank you all for the advice. I'll see what's easier for me to set up and give it a try next week.

-- Alex (a.k.a. suy) | GPG ID 0x0B8B0BC2 http://barnacity.net/ | http://disperso.net

Peter Chiochetti

24 Oct

2:35 a.m.

On 2014-10-23 at 17:11, Alejandro Exojo wrote:

> I moved to IMAP and my filters are server side with sieve, but I don't have spam filtering yet.

I understand that you do not want SpamAssassin (SA) to check lots of messages that are clean anyway.

If you can call SA from sieve, as a condition in an if clause, filtering should be no problem, should it?
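
A minimal sketch of what such a condition could look like with Pigeonhole's extprograms plugin, which provides the vnd.dovecot.execute extension. Several things here are assumptions: the plugin and extension must be enabled on the server (sieve_plugins and sieve_extensions), "is-spam" is a hypothetical wrapper script (for example around SpamAssassin's spamc client) placed in the directory configured as sieve_execute_bin_dir, and the wrapper is assumed to exit with status 0 when the message is spam. The folder names are placeholders.

```sieve
require ["vnd.dovecot.execute", "fileinto"];

# Mailing list traffic is filed first and never reaches the checker.
if exists "List-Id" {
    fileinto "Lists";
    stop;
}

# Only the remaining mail is piped to the external program.
# Used as a test, execute is true when the program exits with status 0.
# "is-spam" is a hypothetical wrapper script, not a standard tool.
if execute :pipe "is-spam" {
    fileinto "Junk";
}
```

This keeps the checker out of the path for list mail, at the cost of starting an external process for every other message.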

MUAs, e.g. Thunderbird (TB) also have good junk filters, so that might be an option too. In one account of mine, mail is filtered on the server and I later have TB filter the SPAM folder locally and occasionally they both disagree ;) False SA positives are more rare than false SA negatives. SA and TB score about the same, though TB seems to me to be more accustomed to my spool.

On a server I administer, quite similar to your setup, with very few users, I recently had to turn on greylisting, which proved exceptionally effective at reducing SA load: checking is done there at SMTP time. Of course this will not help in your case, where the bulk will pass...

> I thought I could replicate the setup easily, but it seems nobody is doing that, and everyone has the mail already scored when it reaches sieve. Seems like the "extprograms" extension would be a fit, but again, nobody seems to mention on documentation, so I'm fearing I'm probably wrong.

You can use this list to provide the missing documentation ;)

-- peter

Tom Hendrikx

12:18 p.m.

On 24-10-14 01:35, Peter Chiochetti wrote:

> If you can call SA from sieve, as a condition in an if clause, filtering should be no problem, should it?

Sieve even has a facility for doing virus/spam filtering: http://wiki2.dovecot.org/Pigeonhole/Sieve/Examples#Filtering_using_the_spamt...

But running SpamAssassin at the MTA level with some whitelist entries would also work, and is probably a lot easier to set up.

Regards, Tom

Tom Hendrikx

12:26 p.m.

On 24-10-14 11:18, Tom Hendrikx wrote:

> Sieve even has a facility for doing virus/spam filtering: http://wiki2.dovecot.org/Pigeonhole/Sieve/Examples#Filtering_using_the_spamt...

Never mind, this was only for evaluating the headers added in an earlier stage, not for running the spam classifier itself. You could still do that at the sieve level using extprograms, but the MTA route is a lot easier to set up.
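
For reference, evaluating an existing score with the standard spamtest extension could look roughly like the sketch below. It assumes the server has been configured (in Dovecot, through the sieve_spamtest_* settings) to map the scanner's header onto the 0 to 10 scale; the threshold of 5 and the folder name are arbitrary choices for the example.

```sieve
require ["spamtest", "relational", "comparator-i;ascii-numeric", "fileinto"];

# spamtest yields "0" when no score was found and "1".."10" otherwise,
# with higher values meaning more likely spam.
# The threshold "5" is an arbitrary example value.
if spamtest :value "ge" :comparator "i;ascii-numeric" "5" {
    fileinto "Junk";
}
```

Nothing is scanned here: the test only reads what an earlier stage wrote into the headers.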

I did use extprograms to do automated bayes training as an experiment: while running 2 spam filters (X and Y), if X thinks it's spam and Y is unsure, sieve automatically triggers a script that makes Y learn the message as spam.
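
The shape of such a training rule might be something like the following sketch. The header names, the value "unsure" and the "learn-spam-y" script are all hypothetical stand-ins, since the actual filters and headers used in the experiment are not given; the script would have to exist in the server's sieve_execute_bin_dir.

```sieve
require ["vnd.dovecot.execute", "fileinto"];

# Hypothetical headers: filter X flags the message as spam,
# while filter Y reports that it is unsure.
if allof (header :is "X-FilterX-Flag" "YES",
          header :contains "X-FilterY-Status" "unsure") {
    # Pipe the message to Y's trainer; as a plain action, execute
    # does not change how the message is delivered.
    execute :pipe "learn-spam-y";
    fileinto "Junk";
}
```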

Regards, Tom
