Curtis Maloney wrote:
I was about to chime in and say "I think people missed your point."
Thanks. At least someone... ;-)
It never came across to me that you were wanting something specific with dspam... more that you wanted an explicit trigger for when a user decided something was/wasn't SPAM. And I, personally, love the idea.
Right. I don't care if it's SA or dspam or bogofilter. Hey, I was just evaluating dspam because SA is killing me with its overhead, so I'll be coding for dspam.
So, let me see if I fully understand how you want it to go:
1) Mail hits your MTA, which hands it off to dspam.
2) dspam munges it, decides if it's SPAM or HAM, and delivers it to the user, either in INBOX or SPAM.
3) If the USER moves a message into SPAM, dspam is notified it missed a message, and retrains on it.
4) If the USER moves a message out of SPAM, dspam is notified it got a false positive, and retrains on it.
Precisely. Add a step to clean up the SPAM folder once in a while.
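To make steps 3 and 4 concrete, here is a rough Python sketch of the decision a plugin would have to make when a message is moved. This is only an illustration, not dovecot or dspam code: the folder name "SPAM", the `Retrain` enum and the `retrain_action` function are all made-up names for this example.

```python
from enum import Enum
from typing import Optional

# Assumption: the spam folder is literally called "SPAM"; a real setup
# would read this from configuration.
SPAM_FOLDER = "SPAM"


class Retrain(Enum):
    """Hypothetical labels for the two retraining cases in steps 3 and 4."""
    MISSED_SPAM = "missed spam"        # user moved a message INTO the spam folder
    FALSE_POSITIVE = "false positive"  # user moved a message OUT OF the spam folder


def retrain_action(src: str, dst: str) -> Optional[Retrain]:
    """Decide whether a move from folder src to folder dst needs retraining.

    Returns None when the move does not cross the spam/ham boundary,
    so the filter is only involved for explicit user corrections.
    """
    if src == dst:
        return None
    if dst == SPAM_FOLDER:
        return Retrain.MISSED_SPAM
    if src == SPAM_FOLDER:
        return Retrain.FALSE_POSITIVE
    return None
```

Moves between two ordinary folders return None, which is where the load saving comes from: nothing is rescanned unless the user explicitly changes a message's spam/ham status.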
Seems to me it will possibly lower the overall load, since you will only rescan/retrain messages _explicitly_ changed from/to SPAM/HAM.
Now, if only you could get some resident form of dspam, so you didn't have to keep spawning it.... or did I miss something in the docs?
Then again, there's libdspam...
Yeah, though both of these options kinda suck. Spawning dspam gives you all the benefits of the command line client (it reads config files etc.), while using libdspam makes it in-process. I'm looking at making another dspam library that encapsulates more of the dspam client's functionality (i.e. the config file reading etc.) and using that in-process with dovecot. I'll kick that idea around the dspam-dev list. Also, I'd link that library into my MTA (exim). The rationale for that idea is to centralize dspam's configuration while still using it from within multiple processes. There's one catch: this system will require that dspam stores the signature in the header (that way I can use dovecot's API to extract it and pass it to libdspam without retrieving the whole message).
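Just to show why the signature-in-header requirement matters, here is a small Python sketch that pulls the signature out of the header block alone, without touching the body. It is not the dovecot API; I'm assuming the header is named X-DSPAM-Signature, and `extract_signature` is a made-up helper name.

```python
from email.parser import BytesHeaderParser
from typing import Optional

# Assumption: dspam is configured to write its signature into this header.
SIGNATURE_HEADER = "X-DSPAM-Signature"


def extract_signature(raw_headers: bytes) -> Optional[str]:
    """Return the dspam signature from a raw header block, or None if absent.

    Only the headers are parsed, so the caller never needs to fetch
    the message body.
    """
    headers = BytesHeaderParser().parsebytes(raw_headers)
    value = headers.get(SIGNATURE_HEADER)
    if value is None:
        return None
    return str(value).strip()
```

The returned string is what would then be handed to the filter for retraining; a message without the header simply yields None and would have to be skipped or handled some other way.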
Also, dspam appears to store the messages in its database, so I was thinking of making a dspam-database dovecot storage plugin as well (or integrating that with the dspam plugin I need to write anyway). That way, those emails are only stored once. I haven't figured out what it stores though, whether all messages, up to a certain limit, only spam, or ..... Needs some thinking, probably, and for a start, I'll just deliver the spam messages to another maildir and make a namespace for it. Oh, and I'll have to prohibit APPENDing to the spam box. If some braindead IMAP client moves messages by fetch/append/delete then that's its problem, but APPEND is kinda hard to manage I think (as per Timo's message about append).
johannes