Re: [Dovecot] plugin problem
Let's try to keep this on the list, shall we.
Fábio M. Catunda wrote:
Lars Stavholm wrote:
Question: How can I retrieve the full unix path for a specific mail?
I am trying the same thing, but I have a different idea.
When a user moves a message, I want the transaction logged to a file; then I can write external programs that read this file and control dspam.
Have a look at Johannes Berg's reply above: the Maildir format would have a file for each mail message, other formats would not.
My idea is based on the following: Q: Scalability: what if a user moves 400 messages to the Spam folder at once? A: Well, it's not a problem to write 400 lines into a file; the external program will then control how many resources dspam can use to classify all those messages.
Sounds OK, not really a problem.
Q: Why not a FIFO? A: In case of a crash I still need to classify all messages, so a FIFO is not a good idea here (I think).
Didn't quite understand the reasoning there, but never mind, it's just me:)
Q: And if the user moves a message from the Spam folder to the trash folder, will it be considered innocent? A: That's a big problem. When a message is deleted by a MUA it's usually copied from one folder to another and then deleted, but there is no default trash folder in IMAP, so you have to be able to configure a lot of "possible trash folders" to ignore; that's why I prefer to have an external program controlling dspam.
I don't see this as a problem at all (why create one when there's none to be found:):
- Move message into Spam: it's a spam that should be reclassified.
- Move message out of Spam: it's a ham that should be reclassified.
- Don't really care where the mail comes from or where it is moved. This is the beauty of it all, count the key or mouse clicks, can't be less than this:)
- Using the expire plugin, the Spam folder will be emptied automatically in due time (typically 30 days maybe) without user intervention.
- All close to zero maintenance for sysadmin as well as end-user.
Well, so far I don't have much, but I would really like to know how to find the filename of a message being copied from one folder to another.
No such luck (I think): to my understanding (with the help of Johannes in previous reply in this thread), you'd have to create a temporary file with the mail message using tmpnam() + mail_get_stream() or similar, and then do your thing.
I'm aiming towards that exact functionality: I want to be able to do training using "pristine" source (so I'll need the whole message), and keep the previous functionality using signatures. We'll see how it goes.
Good Luck and thanks for your input /Lars
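A minimal sketch of the temporary-file approach mentioned above, assuming the message bytes are already in memory. In a real plugin they would be read from the stream returned by mail_get_stream(), which is not shown here. The helper name spool_message_to_tempfile is hypothetical, and mkstemp() is swapped in for tmpnam() because it creates and opens the file in one step.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write a whole message to a fresh temporary file.
 * path_template must be a writable buffer ending in "XXXXXX"; on success
 * it holds the generated path and 0 is returned. Returns -1 on failure. */
int spool_message_to_tempfile(const char *msg, size_t len, char *path_template)
{
    int fd = mkstemp(path_template);    /* create and open atomically */
    if (fd < 0)
        return -1;

    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, msg + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the write */
            close(fd);
            unlink(path_template);      /* don't leave a partial file behind */
            return -1;
        }
        off += (size_t)n;
    }
    if (close(fd) < 0) {
        unlink(path_template);
        return -1;
    }
    return 0;
}
```

The caller owns the file afterwards and should unlink() it once dspam has read it.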
Lars Stavholm wrote:
Let's try to keep this on the list, shall we.
Sorry, my TB doesn't like the Reply-To header.
Have a look at Johannes Berg's reply above: the Maildir format would have a file for each mail message, other formats would not.
Well, some plugins only work with Maildir; mine will be another one of those.
My idea is based on the following: Q: Scalability: what if a user moves 400 messages to the Spam folder at once? A: Well, it's not a problem to write 400 lines into a file; the external program will then control how many resources dspam can use to classify all those messages.
Sounds OK, not really a problem.
That's the question: this is not a problem if you write it to a file, but if you decide to classify the message instantly it becomes a problem. The basic idea of the dspam plugin (at least from what I have read on the wiki page) is to classify messages at night with a cron job, but that might be a problem (my mail server is really busy at night, spammers love to work late).
Q: Why not a FIFO? A: In case of a crash I still need to classify all messages, so a FIFO is not a good idea here (I think).
Didn't quite understand the reasoning there, but never mind, it's just me:)
If you create a FIFO for communication and you have a crash, all messages that are queued get lost. With a file that wouldn't happen!
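A rough sketch of the file-based logging idea, under stated assumptions: the helper name log_move and the one-record-per-line FROM/FILE/TO layout are illustrative, not taken from any existing plugin. Each record is appended with O_APPEND and flushed with fsync(), so a record that was written successfully should still be in the file after a crash.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append one "FROM=... FILE=... TO=..." record to a
 * spool file. Returns 0 on success, -1 on failure. */
int log_move(const char *log_path, const char *from,
             const char *file, const char *to)
{
    char line[1024];
    int len = snprintf(line, sizeof(line), "FROM=%s FILE=%s TO=%s\n",
                       from, file, to);
    if (len < 0 || (size_t)len >= sizeof(line))
        return -1;                      /* record would be truncated */

    /* O_APPEND positions every write at the current end of the file. */
    int fd = open(log_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    int ok = write(fd, line, (size_t)len) == (ssize_t)len;
    /* fsync() asks the kernel to push the record to stable storage. */
    if (ok && fsync(fd) < 0)
        ok = 0;
    if (close(fd) < 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

An external program can then read the file line by line at its own pace, which is how the resource control described earlier would be applied.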
- Move message out of Spam: it's a ham that should be reclassified.
And if the user just wants to delete all spam messages without shift+del? Those messages will be moved to trash and will be reclassified as ham? Maybe this is a problem!
Anyway, I will try to do some tricks, unfortunately my C skills are close to zero! lol
Thankz!
FMC!
!DSPAM:46fe746d122631113417023!
On Sat, 2007-09-29 at 12:52 -0300, "Fábio M. Catunda" wrote:
Well, some plugins only work with Maildir; mine will be another one of those.
That's pretty sad for a general mail plugin but hey :)
That's the question: this is not a problem if you write it to a file, but if you decide to classify the message instantly it becomes a problem. The basic idea of the dspam plugin (at least from what I have read on the wiki page) is to classify messages at night with a cron job, but that might be a problem (my mail server is really busy at night, spammers love to work late).
No, you didn't read my wiki page properly, it has this under the "scaling better" section that suggests future ideas :) But in any case, somebody else suggested to have a "classification crawler" that continually classifies messages if there are any.
And if the user just wants to delete all spam messages without shift+del? Those messages will be moved to trash and will be reclassified as ham? Maybe this is a problem!
See my other reply.
!DSPAM:46fe746d122631113417023!
You really should fix your dspam setup to not add headers and signature lines to outgoing mail.
johannes
Fábio M. Catunda wrote:
Lars Stavholm wrote:
Let's try to keep this on the list, shall we.
Sorry, my TB doesn't like the Reply-To header.
Have a look at Johannes Berg's reply above: the Maildir format would have a file for each mail message, other formats would not.
Well, some plugins only work with Maildir; mine will be another one of those.
My idea is based on the following: Q: Scalability: what if a user moves 400 messages to the Spam folder at once? A: Well, it's not a problem to write 400 lines into a file; the external program will then control how many resources dspam can use to classify all those messages.
Sounds OK, not really a problem.
That's the question: this is not a problem if you write it to a file, but if you decide to classify the message instantly it becomes a problem. The basic idea of the dspam plugin (at least from what I have read on the wiki page) is to classify messages at night with a cron job, but that might be a problem (my mail server is really busy at night, spammers love to work late).
Q: Why not a FIFO? A: In case of a crash I still need to classify all messages, so a FIFO is not a good idea here (I think).
Didn't quite understand the reasoning there, but never mind, it's just me:)
If you create a FIFO for communication and you have a crash, all messages that are queued get lost. With a file that wouldn't happen!
Well, that would depend on how you implement it. If you implement persistent storage of the queue, it's not a problem, except possibly for the message being processed, and again, that could be implemented in a way so that it won't get lost in a process crash.
- Move message out of Spam: it's a ham that should be reclassified.
And if the user just wants to delete all spam messages without shift+del?
Shift+Del has got to be client-specific; it has nothing to do with an IMAP server as such.
In any case, a bulk move to Trash could be regarded as a bulk move to any folder, whereas a bulk delete, which in Thunderbird actually is the same thing as a bulk move to Trash, should be regarded as a non-reclassification situation. As you said, a bit troublesome maybe.
Those messages will be moved to trash and will be reclassified as ham? Maybe this is a problem!
Anyway, I will try to do some tricks, unfortunately my C skills are close to zero! lol
Well, I'm close to that myself, but we'll see how it goes.
Good Luck /L
On Sat, 2007-09-29 at 17:23 +0200, Lars Stavholm wrote:
Q: And if the user moves a message from the Spam folder to the trash folder, will it be considered innocent? A: That's a big problem. When a message is deleted by a MUA it's usually copied from one folder to another and then deleted, but there is no default trash folder in IMAP, so you have to be able to configure a lot of "possible trash folders" to ignore; that's why I prefer to have an external program controlling dspam.
I don't see this as a problem at all (why create one when there's none to be found:):
Well, if the user insists on deleting his spam folder *and* has a client that insists on using a trash folder (I tell people using my setup to turn off trash folders) then there may be a problem. But my plugin has an option to ignore trash folders.
- Using the expire plugin, the Spam folder will be emptied automatically in due time (typically 30 days maybe) without user intervention.
FWIW, I just put this up a few hours ago: http://johannes.sipsolutions.net/Projects/dovecot-dspam-integration#addition...
Well, so far I don't have much, but I would really like to know how to find the filename of a message being copied from one folder to another.
No such luck (I think): to my understanding (with the help of Johannes in previous reply in this thread), you'd have to create a temporary file with the mail message using tmpnam() + mail_get_stream() or similar, and then do your thing.
I'm aiming towards that exact functionality: I want to be able to do training using "pristine" source (so I'll need the whole message), and keep the previous functionality using signatures. We'll see how it goes.
You could simply log the signature and do signature-based training. Actually, somebody has done exactly that with logging to mysql. Search the list archives to find the variation of the dspam plugin doing that.
Sounds almost like the dspam plugin needs to grow a plugin system itself to support all the backend configuration and various dspam installations like with/without signature in uid etc. You guys want to pool and buy me to do that? ;)
johannes
Johannes Berg wrote:
On Sat, 2007-09-29 at 17:23 +0200, Lars Stavholm wrote:
Q: And if the user moves a message from the Spam folder to the trash folder, will it be considered innocent? A: That's a big problem. When a message is deleted by a MUA it's usually copied from one folder to another and then deleted, but there is no default trash folder in IMAP, so you have to be able to configure a lot of "possible trash folders" to ignore; that's why I prefer to have an external program controlling dspam. I don't see this as a problem at all (why create one when there's none to be found:):
Well, if the user insists on deleting his spam folder *and* has a client that insists on using a trash folder (I tell people using my setup to turn off trash folders) then there may be a problem. But my plugin has an option to ignore trash folders.
- Using the expire plugin, the Spam folder will be emptied automatically in due time (typically 30 days maybe) without user intervention.
FWIW, I just put this up a few hours ago: http://johannes.sipsolutions.net/Projects/dovecot-dspam-integration#addition...
Good stuff! Timo hinted the other day that the expire plugin might not work as expected:)
Well, so far I don't have much, but I would really like to know how to find the filename of a message being copied from one folder to another. No such luck (I think): to my understanding (with the help of Johannes in previous reply in this thread), you'd have to create a temporary file with the mail message using tmpnam() + mail_get_stream() or similar, and then do your thing.
I'm aiming towards that exact functionality: I want to be able to do training using "pristine" source (so I'll need the whole message), and keep the previous functionality using signatures. We'll see how it goes.
You could simply log the signature and do signature-based training. Actually, somebody has done exactly that with logging to mysql. Search the list archives to find the variation of the dspam plugin doing that.
I'll see if I can get the signature based training working reliably with the dspam hash driver, in which case I would possibly not implement the trainPristine option in the dspam plugin. Instead, the dspam plugin would merely be upgraded to fit dovecot-1.1.
Sounds almost like the dspam plugin needs to grow a plugin system itself to support all the backend configuration and various dspam installations like with/without signature in uid etc. You guys want to pool and buy me to do that? ;)
No such luck, I'm too poor:)
Cheers /Lars
Johannes Berg wrote:
Well, if the user insists on deleting his spam folder *and* has a client that insists on using a trash folder (I tell people using my setup to turn off trash folders) then there may be a problem. But my plugin has an option to ignore trash folders.
Maybe I'm missing something, but is there a way in your plugin to ignore more than one folder? Different clients create different folders for trash; this is another problem for my setup.
I know that I want everything to work "perfectly" in all situations, but I believe that's important. Users get angry about spam, and they call me ugly names when a ham is considered spam; that's why I'm trying to do a good job!
You could simply log the signature and do signature-based training. Actually, somebody has done exactly that with logging to mysql. Search the list archives to find the variation of the dspam plugin doing that.
Sounds almost like the dspam plugin needs to grow a plugin system itself to support all the backend configuration and various dspam installations like with/without signature in uid etc. You guys want to pool and buy me to do that? ;)
I will talk to my boss! :-)
Thankz,
FMC!
!DSPAM:46fe7d0a131011046663393!
On Sat, 2007-09-29 at 13:29 -0300, "Fábio M. Catunda" wrote:
Maybe I'm missing something, but is there a way in your plugin to ignore more than one folder? Different clients create different folders for trash; this is another problem for my setup.
Not right now, but it's trivial to add. You just need to make sure that nobody actually then starts using a trash folder from one client for non-trash in another client or something but I guess that's unlikely.
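As a sketch of what ignoring several trash folders could look like, the function below combines the "into Spam / out of Spam" rule from earlier in the thread with a list of folders to ignore. Everything here is an assumption made for illustration: the folder names, the classify_move helper and the retrain enum are hypothetical and do not come from the actual plugin.

```c
#include <stddef.h>
#include <string.h>

enum retrain { RETRAIN_NONE, RETRAIN_SPAM, RETRAIN_HAM };

/* Hypothetical list; a real plugin would read this from its configuration. */
static const char *const trash_folders[] = { "Trash", "Deleted Items", NULL };

static int is_trash(const char *folder)
{
    for (size_t i = 0; trash_folders[i] != NULL; i++)
        if (strcmp(folder, trash_folders[i]) == 0)
            return 1;
    return 0;
}

/* Decide what a move from src to dst means for training. */
enum retrain classify_move(const char *src, const char *dst,
                           const char *spam_folder)
{
    int from_spam = strcmp(src, spam_folder) == 0;
    int to_spam = strcmp(dst, spam_folder) == 0;

    if (from_spam == to_spam)
        return RETRAIN_NONE;    /* spam status did not change */
    if (to_spam)
        return RETRAIN_SPAM;    /* moved into Spam */
    if (is_trash(dst))
        return RETRAIN_NONE;    /* deleting spam is not a ham verdict */
    return RETRAIN_HAM;         /* moved out of Spam into a normal folder */
}
```

With this rule, deleting from Spam through a client-side trash folder leaves the classification alone, while moving a message from Spam to INBOX still retrains it as ham.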
!DSPAM:46fe7d0a131011046663393!
I keep thinking my dspam is misconfigured to add this. Why do you have it both in header and the email (header only is sufficient) and do it on outgoing mail? :)
johannes
Johannes Berg wrote:
On Sat, 2007-09-29 at 13:29 -0300, "Fábio M. Catunda" wrote: Not right now, but it's trivial to add. You just need to make sure that nobody actually then starts using a trash folder from one client for non-trash in another client or something but I guess that's unlikely.
For now I'm worried about "trash" and "deleted items". I will try to find out what OS X clients use for deleted items too, and post it next week.
I keep thinking my dspam is misconfigured to add this. Why do you have it both in header and the email (header only is sufficient) and do it on outgoing mail? :)
johannes
I'm not the main sysadmin of this server yet, so for now I will keep sending this, sorry. I don't like it either, but for now I have no options.
About the filename, I found a function called mail_get_physical_size. This function finds the filename and the path by calling another function: ... fname = maildir_uidlist_lookup(mbox->uidlist, _mail->uid, &flags); ...
Do you think I'm getting somewhere with this?
Thankz!
FMC!
!DSPAM:46febcbc216361603523369!
I'm almost there, look, from my log: FROM=Spam FILE=1191103168.P24773Q0M499718.fabio TO=INBOX
Looks perfect, but it's not. The real filename is 1191105530.P22847Q0M112390.fabio:2,Sa and not 1191103168.P24773Q0M499718.fabio
I'm using this function to get filename: filename = mail_get_special(mail, MAIL_FETCH_UIDL_FILE_NAME);
Does anybody know why the filename is returned incomplete?
Thankz!
!DSPAM:46fed4c1257217110611695!
On Sat, 2007-09-29 at 19:43 -0300, "Fábio M. Catunda" wrote:
I'm almost there, look, from my log: FROM=Spam FILE=1191103168.P24773Q0M499718.fabio TO=INBOX
Looks perfect, but it's not. The real filename is 1191105530.P22847Q0M112390.fabio:2,Sa and not 1191103168.P24773Q0M499718.fabio
I'm using this function to get filename: filename = mail_get_special(mail, MAIL_FETCH_UIDL_FILE_NAME);
Does anybody know why the filename is returned incomplete?
I guess it gives you the base filename since the stuff after that is some flags etc. in maildir with special meaning which also changes when you mark the message read/unread/....
johannes
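To illustrate the reply above: in a Maildir filename, the part after the colon (":2,Sa" in the example) carries flags and changes when they change, while the part before it identifies the message. The sketch below uses plain string handling only and no Dovecot API. The helper names strip_maildir_info and same_maildir_message are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy the part of a Maildir filename before the ":" info suffix into base.
 * Returns 0 on success, -1 if base is too small. */
int strip_maildir_info(const char *fname, char *base, size_t base_size)
{
    const char *info = strchr(fname, ':');
    size_t len = info != NULL ? (size_t)(info - fname) : strlen(fname);

    if (len >= base_size)
        return -1;
    memcpy(base, fname, len);
    base[len] = '\0';
    return 0;
}

/* Nonzero if a directory entry name is the message with this base name,
 * whatever flags it currently carries. */
int same_maildir_message(const char *entry, const char *base)
{
    size_t len = strlen(base);

    return strncmp(entry, base, len) == 0 &&
           (entry[len] == '\0' || entry[len] == ':');
}
```

A program holding only the base name could scan the folder's cur/ and new/ directories with same_maildir_message() to find the current file, though the name may still change between the scan and the open.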
On Sat, 2007-09-29 at 19:43 -0300, "Fábio M. Catunda" wrote:
I'm almost there, look, from my log: FROM=Spam FILE=1191103168.P24773Q0M499718.fabio TO=INBOX
Looks perfect, but it's not. The real filename is 1191105530.P22847Q0M112390.fabio:2,Sa and not 1191103168.P24773Q0M499718.fabio
I'm using this function to get filename: filename = mail_get_special(mail, MAIL_FETCH_UIDL_FILE_NAME);
Does anybody know why the filename is returned incomplete?
Maildir filenames aren't stable. I suppose you could get it working most of the time by getting the current filename, but nothing guarantees that it hasn't changed already by the time you're trying to open it. You could get the current filename by using maildir-specific functions (like maildir_file_do() where the callback function would stat() the file, like maildir_mail_stat() works in maildir-mail.c).
You already mentioned something about using FIFOs. I'm not exactly sure how the dspam calling works, but I think FIFOs would be the best way to do this, and in a mailbox-format-independent way too. If you do the spam training before the COPY command finishes (i.e. in transaction_commit()) and something crashes in the middle, you don't lose anything, because the entire COPY operation fails and the user tries it again later.
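The ordering described above can be sketched as follows. This is a simulation only: struct copy_txn and the three static helpers are hypothetical stand-ins for the real training call and the real transaction commit and rollback, which are not shown.

```c
/* Hypothetical state for one COPY; the flags only record what happened. */
struct copy_txn {
    int train_fails;    /* set to simulate a dspam failure */
    int trained;
    int committed;
    int rolled_back;
};

/* Stand-in for handing the copied messages to dspam. */
static int train_messages(struct copy_txn *t)
{
    if (t->train_fails)
        return -1;
    t->trained = 1;
    return 0;
}

static void rollback(struct copy_txn *t) { t->rolled_back = 1; }

static int commit(struct copy_txn *t)
{
    t->committed = 1;
    return 0;
}

/* Train first and commit only if training succeeded, so a failure makes
 * the whole COPY fail and the user can simply retry it. */
int copy_commit(struct copy_txn *t)
{
    if (train_messages(t) < 0) {
        rollback(t);
        return -1;
    }
    return commit(t);
}
```

The sketch does not cover the opposite case, where training succeeds and the commit itself then fails.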
On Sun, 2007-09-30 at 13:56 +0300, Timo Sirainen wrote:
You already mentioned something about using FIFOs. I'm not exactly sure how the dspam calling works, but I think FIFOs would be the best way to do this, and in a mailbox-format-independent way too.
The FIFO he wanted was pushing just the signatures to the external process w/o checking any validity or such.
If you do the spam training before COPY command finishes (i.e. in transaction_commit()), if something crashes in the middle you don't lose anything, because the entire COPY operation fails, and user tries it again later.
That's what I currently do.
I have a few quick questions.
The copy code has this:
keywords_list = mail_get_keywords(mail);
keywords = strarray_length(keywords_list) == 0 ? NULL :
        mailbox_keywords_create(t, keywords_list);
if (mailbox_copy(t, mail, mail_get_flags(mail),
                 keywords, NULL) < 0)
        ret = mail->expunged ? 0 : -1;
mailbox_keywords_free(t, &keywords);
I take it all the keywords handling is part of the mail and hence part of the transaction?
Also, you have src_trans = mailbox_transaction_begin(client->mailbox, 0);
is there no need to roll back that transaction if something fails? You don't seem to do so when e.g. mailbox_search_deinit fails but that still makes the COPY command return an error.
johannes
On Sun, 2007-09-30 at 14:10 +0200, Johannes Berg wrote:
The copy code has this:
keywords_list = mail_get_keywords(mail);
keywords = strarray_length(keywords_list) == 0 ? NULL :
        mailbox_keywords_create(t, keywords_list);
if (mailbox_copy(t, mail, mail_get_flags(mail),
                 keywords, NULL) < 0)
        ret = mail->expunged ? 0 : -1;
mailbox_keywords_free(t, &keywords);
I take it all the keywords handling is part of the mail and hence part of the transaction?
Right.
Also, you have src_trans = mailbox_transaction_begin(client->mailbox, 0);
is there no need to roll back that transaction if something fails? You don't seem to do so when e.g. mailbox_search_deinit fails but that still makes the COPY command return an error.
src_trans is used only for reading the mailbox. The only thing committing it does is possibly update the dovecot.index.cache file, which is always a good thing to do.
If mailbox_search_deinit() fails it could still mean that some mails were read and cache file could be updated for them.
On Sun, 2007-09-30 at 15:39 +0300, Timo Sirainen wrote:
src_trans is used only for reading the mailbox. The only thing committing it does is possibly update the dovecot.index.cache file, which is always a good thing to do.
If mailbox_search_deinit() fails it could still mean that some mails were read and cache file could be updated for them.
Right, so I can just safely commit it anyway since I'm just reading it, ok. Changing code. I'll publish it in a bit.
johannes
I don't see this as a problem at all (why create one when there's none to be found:):
- Move message into Spam: it's a spam that should be reclassified.
- Move message out of Spam: it's a ham that should be reclassified.
- Don't really care where the mail comes from or where it is moved. This is the beauty of it all, count the key or mouse clicks, can't be less than this:)
- Using the expire plugin, the Spam folder will be emptied automatically in due time (typically 30 days maybe) without user intervention.
- All close to zero maintenance for sysadmin as well as end-user.
This is the part that's made me really excited about the plugin too. I've needed a new spam solution -- my old spamassassin setup was too hard to control for my users. This sounds truly wonderful. (haven't gotten the plugin to work yet, and haven't made time to)
Aria
participants (5)
- "Fábio M. Catunda"
- Aria Stewart
- Johannes Berg
- Lars Stavholm
- Timo Sirainen