[Dovecot] deploying dspam

Thu Dec 16 14:05:23 EET 2004

Curtis Maloney wrote:

> I was about to chime in and say "I think people missed your point."

Thanks. At least someone... ;-)

> It never came across to me that you were wanting something specific 
> with dspam... more that you wanted an explicit trigger for when a user 
> decided something was/wasn't SPAM.  And I, personally, love the idea.

Right. I don't care if it's SA or dspam or bogofilter. I was just 
evaluating dspam because SA is killing me with its overhead, so I'll be 
coding for dspam.

> So, let me see if I fully understand how you want it to go:
>     1) mail hits your MTA, which hands it off to dspam
>     2) dspam munges it, decides if it's SPAM or HAM, and delivers it 
> to the user, either in INBOX or SPAM
>     3) if the USER moves a message into SPAM, dspam is notified it 
> missed a message, and retrains on it.
>     4) if the USER moves a message out of SPAM, dspam is notified it 
> got a false-positive, and retrains on it.

Precisely. Add a step to clean up the SPAM folder once in a while.
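
To make steps 3 and 4 concrete, here is a minimal Python sketch of the
decision the plugin would have to make when a user moves a message. The
folder name "SPAM", the function name and the "spam"/"innocent" labels
are assumptions for illustration only; nothing here is dovecot's or
dspam's actual API.

```python
from typing import Optional

# Assumed folder name, taken from the numbered steps above.
SPAM_FOLDER = "SPAM"


def retrain_action(src: str, dst: str,
                   spam_folder: str = SPAM_FOLDER) -> Optional[str]:
    """Return the class to retrain as, or None if no retraining is needed.

    Hypothetical helper: "spam" and "innocent" are illustrative labels.
    """
    if src == dst:
        return None          # not really a move
    if dst == spam_folder:
        return "spam"        # step 3: the filter missed this message
    if src == spam_folder:
        return "innocent"    # step 4: the filter produced a false positive
    return None              # moves between ordinary folders are ignored


if __name__ == "__main__":
    for move in [("INBOX", "SPAM"), ("SPAM", "INBOX"), ("INBOX", "Work")]:
        print(move, "->", retrain_action(*move))
```

Only moves that cross the SPAM folder boundary trigger anything, which
is where the load reduction would come from.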

> Seems to me it will possibly lower the overall load, since you will 
> only rescan/retrain messages _explicitly_ changed from/to SPAM/HAM.  
> Now, if only you could get some resident form of dspam, so you didn't 
> have to keep spawning it.... or did I miss something in the docs?  
> Then again, there's libdspam...

Yeah, though both of these options kinda suck. Spawning dspam gives you 
all the benefits of the command-line client (it reads the config files 
etc.), while using libdspam keeps it in-process. I'm looking at making 
another dspam library that encapsulates more of the dspam client's 
functionality (i.e. the config file reading etc.) and using that 
in-process with dovecot. I'll kick that idea around on the dspam-dev 
list. I'd also link that library into my MTA (exim). The rationale is to 
centralize dspam's configuration while still using it from within 
multiple processes. There's one catch: this system will require that 
dspam stores the signature in the header (that way I can use dovecot's 
API to extract it and pass it to libdspam without retrieving the whole 
message).
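
As a rough illustration of why the signature needs to be in the header, 
the Python sketch below pulls it out of the header block alone, without 
ever touching the body. The header name "X-DSPAM-Signature" and the 
sample signature value are assumptions here (check what your dspam 
build actually emits), and the real plugin would of course use 
dovecot's own API rather than Python.

```python
from email.parser import HeaderParser
from typing import Optional

# Assumed header name; verify against your dspam configuration.
SIGNATURE_HEADER = "X-DSPAM-Signature"


def extract_signature(raw_headers: str) -> Optional[str]:
    """Return the signature from a raw header block, or None if absent."""
    # HeaderParser stops at the end of the headers and never parses a body.
    msg = HeaderParser().parsestr(raw_headers)
    value = msg.get(SIGNATURE_HEADER)
    return value.strip() if value else None


if __name__ == "__main__":
    sample = (
        "From: someone@example.com\n"
        "X-DSPAM-Signature: 41c1a2b3f00d\n"   # made-up value
        "Subject: hello\n"
        "\n"
    )
    print(extract_signature(sample))
```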

Also, dspam appears to store the messages in its database, so I was 
thinking of making a dspam-database dovecot storage plugin as well (or 
integrating that with the dspam plugin I need to write anyway). That way, 
those emails are only stored once. I haven't figured out what it stores 
though: all messages, messages up to a certain limit, only spam, or 
something else. That probably needs some thinking, so for a start I'll 
just deliver the spam messages to another maildir and make a namespace 
for it.

Oh, and I'll have to prohibit APPENDing to the spam box. If some 
braindead IMAP clients move messages by fetch/append/delete, then that's 
their problem, but APPEND is kinda hard to manage, I think (as per 
Timo's message about append).

johannes