[Dovecot] "pipe" plugin - is anyone interested (or using it)?
Hi,
A few months ago, I sent a message to this list asking if people would be interested in a "pipe" plugin (see http://dovecot.org/pipermail/dovecot/2007-May/023005.html ). I received a few answers from people who said they were interested.
A few months later, I sent a new message announcing the availability of my work (see http://dovecot.org/pipermail/dovecot/2007-August/024805.html ).
Since then, I have received no feedback...
So, is anyone interested in this plugin? Is anyone using it?
Timo, on my first message, you seemed to be interested to merge such a feature (see http://dovecot.org/pipermail/dovecot/2007-May/023055.html ). Are you still interested?
All comments are welcome. I think the feature (or a more generic one) would be a worthy addition for dovecot, be it for ham/spam learning, to implement an "outbox" or anything else I can't think about.
All comments about my work are welcome. (Please note that I have not (yet) tried to port it to dovecot 1.1...)
Cheers,
Nicolas Boullis Ecole Centrale Paris
On Dec 5, 2007 7:29 AM, Nicolas Boullis nicolas.boullis@ecp.fr wrote:
A few months ago, I sent a message to this list asking if people would be interested in a "pipe" plugin (see http://dovecot.org/pipermail/dovecot/2007-May/023005.html ). I received a few answers from people who said they were interested.
A few months later, I sent a new message announcing the availability of my work (see http://dovecot.org/pipermail/dovecot/2007-August/024805.html ).
Since then, I have received no feedback...
So, is anyone interested in this plugin? Is anyone using it?
Hi Nicolas-
I don't remember seeing either of your earlier messages (although I admit to not always being able to closely follow the list despite being a subscriber), but I do like your plugin. I haven't had a chance to install and try it, but I'm adding it to my list of things to do. I've struggled to come up with an easy-to-use solution for manual classification of messages for end users, and ultimately sort of pushed it aside, but this seems like an elegant and easy-to-use solution.
Could you maybe explain how the configuration works in a little more detail? For some reason I'm having a bit of trouble wrapping my head around how this works:
(...)
namespace private {
  separator = .
  location = maildir:/var/mail/%u
  inbox = yes
  hidden = no
}
namespace public {
  separator = .
  prefix = learn.
  location = maildir:/var/learn/%u
  inbox = no
  hidden = no
}
(...)
plugin {
  (...)
  pipe = /var/learn/%u/.spam:spamc -d some.host -L spam
  pipe2 = /var/learn/%u/.ham:spamc -d some.host -L ham
  (...)
}
Thanks, Ben
Hi,
Ben Schumacher wrote:
Could you maybe explain how the configuration works in a little more detail? For some reason I'm having a bit of trouble wrapping my head around how this works:
What is the part you don't understand?
namespace private {
  separator = .
  location = maildir:/var/mail/%u
  inbox = yes
  hidden = no
}
namespace public {
  separator = .
  prefix = learn.
  location = maildir:/var/learn/%u
  inbox = no
  hidden = no
}
This is a simple namespace definition, nothing specific to my plugin. The learn.* part of the namespace is declared public and stored in a specific location. I did this to avoid counting the learnt messages against the quota.
plugin {
  (...)
  pipe = /var/learn/%u/.spam:spamc -d some.host -L spam
  pipe2 = /var/learn/%u/.ham:spamc -d some.host -L ham
  (...)
And here I define that any message stored to /var/learn/%u/.spam where %u is the username (that is learn.spam in the user's IMAP namespace) has to be piped to the "spamc -d some.host -L spam" command. And the same for ham.
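As an illustration of the mapping described above, here is a minimal sketch in C of how a setting such as "pipe = /var/learn/%u/.spam:spamc -d some.host -L spam" could be split into a mailbox path and a command. The function names expand_user and parse_pipe_setting are hypothetical and are not taken from the plugin's source. The sketch assumes that the first colon separates path from command and that %u is expanded only in the path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `in` to `out`, replacing every "%u" with `user`.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int expand_user(const char *in, const char *user, char *out, size_t outlen)
{
    size_t o = 0;
    size_t ulen = strlen(user);

    assert(outlen > 0);
    while (*in != '\0') {
        if (in[0] == '%' && in[1] == 'u') {
            if (o + ulen >= outlen)
                return -1;
            memcpy(out + o, user, ulen);
            o += ulen;
            in += 2;
        } else {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return 0;
}

/* Hypothetical helper: split "path:command" at the first ':' (assumption),
 * expand %u in the path, and point *command at the text after the colon.
 * Returns 0 on success, -1 on malformed or oversized input. */
int parse_pipe_setting(const char *setting, const char *user,
                       char *path, size_t pathlen, const char **command)
{
    const char *colon;
    char raw[256];
    size_t n;

    assert(setting != NULL && user != NULL && command != NULL);
    colon = strchr(setting, ':');
    if (colon == NULL)
        return -1;
    n = (size_t)(colon - setting);
    if (n >= sizeof(raw))
        return -1;
    memcpy(raw, setting, n);
    raw[n] = '\0';
    if (expand_user(raw, user, path, pathlen) < 0)
        return -1;
    *command = colon + 1;
    return 0;
}
```

For a user named alice, the first setting above would yield the path /var/learn/alice/.spam and the command "spamc -d some.host -L spam".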
Hope this helps,
Nicolas
plugin {
  (...)
  pipe = /var/learn/%u/.spam:spamc -d some.host -L spam
  pipe2 = /var/learn/%u/.ham:spamc -d some.host -L ham
  (...)
And here I define that any message stored to /var/learn/%u/.spam where %u is the username (that is learn.spam in the user's IMAP namespace) has to be piped to the "spamc -d some.host -L spam" command. And the same for ham.
The above is why I like my antispam plugin much better: you can push into *any* folder *from* spam to train as ham.
I do think, however, that the two plugins could possibly converge. That would entail being able to specify pipes with two endpoints, right now you just have the destination folder. In order to do that, the plugin would have to be configurable like this, with the pipes tried in the order they are numbered:
  # learn into spam as spam
  pipe.1 = *>SPAM |/learn/as --spam
  pipe.2 = UNSURE>SPAM |/learn/as --spam
  # forbid into unsure
  pipe.3 = *>UNSURE -
  # learn from unsure or spam folder as ham
  pipe.4 = UNSURE>* |/learn/as --ham
  pipe.5 = SPAM>* |/learn/as --ham
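The "tried in the order they are numbered" behaviour proposed here can be sketched in C as a first-match lookup over (source, target) pairs. This is only an illustration of the proposal, not code from either plugin: the struct pipe_rule type, the find_rule function and the example_rules table are hypothetical names, and "*" is assumed to match any folder name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One rule of the proposed form "SOURCE>TARGET action" (hypothetical type). */
struct pipe_rule {
    const char *source;  /* folder name, or "*" for any folder */
    const char *target;  /* folder name, or "*" for any folder */
    const char *action;  /* "-" forbids the operation, "|cmd" runs a program */
};

/* A pattern matches if it is the wildcard or equals the folder name. */
static int folder_matches(const char *pattern, const char *folder)
{
    return strcmp(pattern, "*") == 0 || strcmp(pattern, folder) == 0;
}

/* Return the first rule (lowest number) matching the pair, or NULL. */
const struct pipe_rule *find_rule(const struct pipe_rule *rules, size_t count,
                                  const char *source, const char *target)
{
    size_t i;

    assert(source != NULL && target != NULL);
    for (i = 0; i < count; i++) {
        if (folder_matches(rules[i].source, source) &&
            folder_matches(rules[i].target, target))
            return &rules[i];
    }
    return NULL;
}

/* The five rules from the message, in numbered order. */
const struct pipe_rule example_rules[] = {
    { "*",      "SPAM",   "|/learn/as --spam" },
    { "UNSURE", "SPAM",   "|/learn/as --spam" },
    { "*",      "UNSURE", "-" },
    { "UNSURE", "*",      "|/learn/as --ham" },
    { "SPAM",   "*",      "|/learn/as --ham" },
};
```

With these rules, a copy from INBOX to SPAM hits pipe.1, a copy from INBOX to UNSURE hits the forbidding rule pipe.3, and a copy between two unrelated folders matches nothing. Note that under first-match semantics pipe.2 can never be reached, because pipe.1 already covers every source.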
Maybe for the configuration some "command aliases" could be defined:
  pipe.cmd.learnas = /learn/as --user %u
  pipe.1 = *>SPAM !learnas --spam
causes an internal expansion to:
  pipe.1 = *>SPAM |/learn/as --user %u --spam
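A minimal sketch of that alias expansion in C follows. It is an illustration only: struct cmd_alias and expand_action are hypothetical names, and the sketch assumes the alias name ends at the first space and that everything after it is appended unchanged to the aliased command.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "pipe.cmd.NAME = COMMAND" definition (hypothetical type). */
struct cmd_alias {
    const char *name;
    const char *command;
};

/* Expand "!alias args" to "|command args". Actions that do not start with
 * '!' are copied unchanged. Returns 0 on success, -1 if the alias is
 * unknown or the result does not fit in `out`. */
int expand_action(const struct cmd_alias *aliases, size_t count,
                  const char *action, char *out, size_t outlen)
{
    size_t i, namelen;
    const char *rest;
    int n;

    assert(action != NULL && out != NULL);
    if (action[0] != '!') {
        n = snprintf(out, outlen, "%s", action);
        return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
    }
    /* The alias name runs up to the first space or the end of the string. */
    namelen = strcspn(action + 1, " ");
    rest = action + 1 + namelen;  /* " args", or "" if there are none */
    for (i = 0; i < count; i++) {
        if (strlen(aliases[i].name) == namelen &&
            strncmp(aliases[i].name, action + 1, namelen) == 0) {
            n = snprintf(out, outlen, "|%s%s", aliases[i].command, rest);
            return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
        }
    }
    return -1;  /* unknown alias */
}
```

The %u placeholder is left untouched here, on the assumption that user expansion would happen in a later step.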
As you can see, I have prefixed the "command" with a pipe symbol. This indicates that a program shall be invoked. I have prefixed the alias with an exclamation mark. In order to have the libdspam (or similar) training some people wanted, we could allow a bare specification for ld-open style:
  pipe.1 = *>SPAM builtin_learn_as
which would essentially do:
  fn = dlsym(RTLD_DEFAULT, "builtin_learn_as");
  fn(source folder, dest folder, mail struct[, ...?])
This would allow writing a simple plugin, having dovecot load it (before the pipe plugin) and having a function in that plugin called, splitting the work the antispam plugin would do in that case into two plugins.
I guess that such a plugin is actually more generic than "pipe" then.
Do you have the pipe plugin code in git anywhere I can pull from and see if I can do such modifications?
johannes
Quoting Johannes Berg johannes@sipsolutions.net:
plugin {
  (...)
  pipe = /var/learn/%u/.spam:spamc -d some.host -L spam
  pipe2 = /var/learn/%u/.ham:spamc -d some.host -L ham
  (...)
And here I define that any message stored to /var/learn/%u/.spam where %u is the username (that is learn.spam in the user's IMAP namespace) has to be piped to the "spamc -d some.host -L spam" command. And the same for ham.
The above is why I like my antispam plugin much better: you can push into *any* folder *from* spam to train as ham.
It certainly is a matter of taste, but that also was the reason why I
did not like your plugin... ;-)
Firstly, if a message is correctly identified as spam, but I want to
keep a copy of it for some reason, copying it out of the spam folder means
learning it as ham, so I have to copy it again into the spam folder to
re-learn it as spam.
Secondly, I don't know how dspam works, but spamassassin can identify
a message as HAM or SPAM without learning it. If I want to learn a
message correctly identified as spam but not auto-learnt, I again have
to move it out of the spam folder and then back in. The same applies if a
message is correctly identified as ham but not auto-learnt: I have to
move it to the spam folder and then out of it.
Lastly, I'd prefer my user to make some explicit action to learn a
message rather than have a system that somewhat "guesses" for them...
I do think, however, that the two plugins could possibly converge.
I don't think they can converge, they work in very different ways,
since yours works at the IMAP level while mine works at the storage
level.
Hence, mine can also be triggered by a delivery with dovecot's
deliver, but when a message is copied or appended, it does not know
where it comes from (but as far as I am concerned, since I don't like
triggering when moving a message out of a folder, it does not matter).
On the other hand, yours only works with IMAP, overriding the COPY
command.
That would entail being able to specify pipes with two endpoints, right now you just have the destination folder. In order to do that, the plugin would have to be configurable like this, with the pipes tried in the order they are numbered:
  # learn into spam as spam
  pipe.1 = *>SPAM |/learn/as --spam
  pipe.2 = UNSURE>SPAM |/learn/as --spam
  # forbid into unsure
  pipe.3 = *>UNSURE -
  # learn from unsure or spam folder as ham
  pipe.4 = UNSURE>* |/learn/as --ham
  pipe.5 = SPAM>* |/learn/as --ham
Those only work if you consider messages that are copied from a folder
to a different one. But how do you deal with messages that are
appended from another source (such as messages being delivered or
appended by an IMAP client)?
This would allow writing a simple plugin, having dovecot load it (before the pipe plugin) and having a function in that plugin called, splitting the work the antispam plugin would do in that case into two plugins.
I guess that such a plugin is actually more generic than "pipe" then.
It looks to me that you only consider spam/ham learning as a use for
such a plugin. I think other uses are possible. For example, I
remember someone who was willing to implement an outbox.
Do you have the pipe plugin code in git anywhere I can pull from and see if I can do such modifications?
No, I have never used git. But you have the tarball.
Cheers,
Nicolas
It certainly is a matter of taste, but that also was the reason why I
did not like your plugin... ;-)
Aha, but with what I proposed that becomes a matter of configuration, hence my SPAM>* rule in there.
Lastly, I'd prefer my user to make some explicit action to learn a
message rather than have a system that somewhat "guesses" for them...
You can still have that, I was just expressing how my plugin works in terms of some "rules".
I do think, however, that the two plugins could possibly converge.
I don't think they can converge, they work in very different ways,
since yours works at the IMAP level while mine works at the storage
level.
Actually, I work at storage level too now, I modified it.
Hence, mine can also be triggered by a delivery with dovecot's
deliver, but when a message is copied or appended, it does not know
where it comes from (but as far as I am concerned, since I don't like
triggering when moving a message out of a folder, it does not matter).
Yeah, if you load it into deliver which I'd consider somewhat stupid to do with my plugin.
Those only work if you consider messages that are copied from a folder
to a different one. But how do you deal with messages that are
appended from another source (such as messages being delivered or
appended by an IMAP client)?
Good point. You could treat that as the "empty" source, like
pipe.3 = >SPAM do something
and have a wildcard that matches all (like .* RE) and one that matches "all not empty" (like .+ RE)
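The difference between the two wildcards can be illustrated with POSIX extended regular expressions, treating a delivered or appended message as having the empty string as its source. This is a sketch of the idea only: source_matches is a hypothetical name, and using regcomp() for rule patterns is an assumption, not something either plugin is known to do.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the whole of `text` matches the POSIX extended regex
 * `pattern`, 0 if it does not, and -1 if the pattern fails to compile. */
int source_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 1, &m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;
    /* Accept only a match that covers the string from start to end. */
    return m.rm_so == 0 && text[m.rm_eo] == '\0';
}
```

Under this scheme ".*" accepts both a real source folder and the empty source, while ".+" accepts only messages that were copied from some folder.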
It looks to me that you only consider spam/ham learning as a use for
such a plugin. I think other uses are possible. For example, I
remember someone who was willing to implement an outbox.
No, but I'm using spam as an example of how your plugin could be generalised to work the same as mine with a different configuration. It seems to me that you only consider use cases that depend on the target folder while my generalisation was to make it depend on the pair (source, target).
johannes
Quoting Johannes Berg johannes@sipsolutions.net:
I do think, however, that the two plugins could possibly converge.
I don't think they can converge, they work in very different ways, since yours works at the IMAP level while mine works at the storage level.
Actually, I work at storage level too now, I modified it.
Oh. I missed that.
I had a look at your plugin, quite some time ago, and did not think
you would have modified it in such a way.
I should definitely give it a look; how do you figure out where the
mail comes from?
No, but I'm using spam as an example of how your plugin could be generalised to work the same as mine with a different configuration. It seems to me that you only consider use cases that depend on the target folder while my generalisation was to make it depend on the pair (source, target).
I did not think it would be possible to use the (source,target) pair
at the storage level. Now that I understand your proposal a little
better, I think it looks like a good idea.
Cheers,
Nicolas
Oh. I missed that. I had a look at your plugin, quite some time ago, and did not think
you would have modified it in such a way.
http://johannes.sipsolutions.net/Projects/dovecot-antispam
I should definitely give it a look; how do you figure out where the
mail comes from?
TBH, I forgot :) The code can be browsed here: http://git.sipsolutions.net/?p=dovecot-antispam.git;a=tree
No, but I'm using spam as an example of how your plugin could be generalised to work the same as mine with a different configuration. It seems to me that you only consider use cases that depend on the target folder while my generalisation was to make it depend on the pair (source, target).
I did not think it would be possible to use the (source,target) pair
at the storage level. Now that I understand your proposal a little
better, I think it looks like a good idea.
:)
johannes
On 05.12.2007 17:29, Nicolas Boullis wrote:
Hi,
A few months ago, I sent a message to this list asking if people would be interested in a "pipe" plugin (see http://dovecot.org/pipermail/dovecot/2007-May/023005.html ). I received a few answers from people who said they were interested.
A few months later, I sent a new message announcing the availability of my work (see http://dovecot.org/pipermail/dovecot/2007-August/024805.html ).
Since then, I have received no feedback...
So, is anyone interested in this plugin? Is anyone using it?
Timo, on my first message, you seemed to be interested to merge such a feature (see http://dovecot.org/pipermail/dovecot/2007-May/023055.html ). Are you still interested?
All comments are welcome. I think the feature (or a more generic one) would be a worthy addition for dovecot, be it for ham/spam learning, to implement an "outbox" or anything else I can't think about.
All comments about my work are welcome. (Please note that I have not (yet) tried to port it to dovecot 1.1...)
Cheers,
Nicolas Boullis Ecole Centrale Paris
Hi Nicolas,
Very promising for me too, not tried yet, but will add to my todo!
On Wed, 2007-12-05 at 15:29 +0100, Nicolas Boullis wrote:
Timo, on my first message, you seemed to be interested to merge such a feature (see http://dovecot.org/pipermail/dovecot/2007-May/023055.html ). Are you still interested?
I downloaded it when you first mentioned it, but looks like I never got around to actually looking at it. Yes, something like this would be nice..
One thing at least that should be changed is to configure it using virtual mailbox names instead of full mailbox paths. This isn't really possible with v1.0, but v1.1 should make it pretty easy.
A few things about the code:
save_dest_mail = mail_alloc(ctx->transaction,
MAIL_FETCH_PHYSICAL_SIZE, NULL);
Quota plugin wanted to know the message's physical size, but your plugin doesn't need it. So you could just use 0 instead of MAIL_FETCH_PHYSICAL_SIZE.
I think write() can return partially written data if the other side isn't reading it fast enough. Using write_full() instead would anyway be better/safer.
Do you really need to wait for the executed process to finish? Since this is the only plugin currently creating child processes, I'd setup a SIGCHLD handler and waitpid() there to get rid of the zombie: lib_signals_set_handler(SIGCHLD, TRUE, chld_handler, NULL);
Multiple commands are now processed sequentially. I don't know if there's real need for multiple commands, but it would be faster to read the message input just once and send it to all pipes in parallel.
"return 0 * WEXITSTATUS(status);" returns always 0 :)
I'd also leave stderr as-is for the child process so it could log errors, and for handling syscall failures use: i_fatal("dup2() failed: %m");
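Several of the review points above (partial writes, leaving stderr alone, returning the real exit status) can be shown together in a small C sketch. It uses plain POSIX calls instead of dovecot's own helpers such as write_full() or i_fatal(), and it is not the plugin's code: write_all and pipe_message are hypothetical names, running the command through /bin/sh -c is an assumption, and the sketch waits for the child synchronously instead of installing a SIGCHLD handler.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write all `len` bytes to `fd`, retrying after partial writes and EINTR.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Feed `msg` to `command` on its stdin and wait for it to finish.
 * Returns the command's exit status, or -1 on any failure. */
int pipe_message(const char *command, const char *msg, size_t len)
{
    int fds[2];
    int status;
    int write_failed = 0;
    pid_t pid;

    assert(command != NULL && msg != NULL);
    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: read the message from the pipe. stderr is left as-is
         * so that the command can log its own errors. */
        if (dup2(fds[0], STDIN_FILENO) < 0)
            _exit(127);
        close(fds[0]);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);
    }
    close(fds[0]);
    /* Process-wide side effect: ignore SIGPIPE so that a command which
     * stops reading early makes write() fail with EPIPE instead of
     * killing the caller. EPIPE alone is not treated as a failure. */
    signal(SIGPIPE, SIG_IGN);
    if (write_all(fds[1], msg, len) < 0 && errno != EPIPE)
        write_failed = 1;
    close(fds[1]);
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (write_failed || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);  /* not multiplied by zero */
}
```

Waiting for the child like this is what allows a save or copy to fail when the command fails; a SIGCHLD handler that only reaps the child would give up that information.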
Timo Sirainen wrote:
I downloaded it when you first mentioned it, but looks like I never got around to actually looking at it. Yes, something like this would be nice..
One thing at least that should be changed is to configure it using virtual mailbox names instead of full mailbox paths. This isn't really possible with v1.0, but v1.1 should make it pretty easy.
You already told me about this last time, but I haven't yet gotten around to reading the source code of dovecot 1.1...
A few things about the code:
save_dest_mail = mail_alloc(ctx->transaction, MAIL_FETCH_PHYSICAL_SIZE, NULL);
Quota plugin wanted to know the message's physical size, but your plugin doesn't need it. So you could just use 0 instead of MAIL_FETCH_PHYSICAL_SIZE.
Thanks for pointing this out.
I think write() can return partially written data if the other side isn't reading it fast enough. Using write_full() instead would anyway be better/safer.
I just reread my code, and I think my use of write looks safe, since only the amount that was correctly written is skipped with i_stream_skip. Do you think I'm missing something?
Do you really need to wait for the executed process to finish? Since this is the only plugin currently creating child processes, I'd setup a SIGCHLD handler and waitpid() there to get rid of the zombie: lib_signals_set_handler(SIGCHLD, TRUE, chld_handler, NULL);
Waiting is required if we want the append to fail if the command fails. I guess it should better be a configurable option, don't you think so?
Multiple commands are now processed sequentially. I don't know if there's real need for multiple commands, but it would be faster to read the message input just once and send it to all pipes in parallel.
I don't think there is real need for multiple commands, but I think it was easier for me to program things like this... ;-)
"return 0 * WEXITSTATUS(status);" returns always 0 :)
Hummm... I guess I should avoid drinking alcohol before I program... ;-) It probably should be "return WEXITSTATUS(status);"; I have no idea why I changed this in such a strange way...
I'd also leave stderr as-is for the child process so it could log errors, and for handling syscall failures use: i_fatal("dup2() failed: %m");
OK, point taken.
One question still: would you consider merging my plugin in dovecot if I ported it to 1.1?
Cheers,
Nicolas
On Thu, 2007-12-06 at 15:29 +0100, Nicolas Boullis wrote:
I think write() can return partially written data if the other side isn't reading it fast enough. Using write_full() instead would anyway be better/safer.
I just reread my code, and I think my use of write looks safe, since only the amount that was correctly written is skipped with i_stream_skip. Do you think I'm missing something?
Oh, you're right. It's probably better that way.
Do you really need to wait for the executed process to finish? Since this is the only plugin currently creating child processes, I'd setup a SIGCHLD handler and waitpid() there to get rid of the zombie: lib_signals_set_handler(SIGCHLD, TRUE, chld_handler, NULL);
Waiting is required if we want the append to fail if the command fails. I guess it should better be a configurable option, don't you think so?
Hmm. Maybe it's good the way it is now.
One question still: would you consider merging my plugin in dovecot if I ported it to 1.1?
Yes. Although I had been thinking about also some kind of a generic "event" plugin, which would allow executing commands for different kinds of events, not just for save and copy (like flag changes). But since I'm not planning on writing that myself anytime soon, I guess save/copy would be useful enough for a lot of people.
On Wed, December 5, 2007 3:29 pm, Nicolas Boullis wrote: ...
A few months ago, I sent a message to this list asking if people would be interested by a "pipe" plugin ... Since then, I have received no feedback... ... So, is anyone using this plugin? Is anyone using it?
Haven't tried it so I can add no useful comments on its use.
But a pipe plugin is in my opinion a very important feature.
On my postfix/dovecot system I'm currently forwarding mails that need piping to a qmail server, since that's the only way to do it easily on a per-user basis. It seems irrational that dovecot itself doesn't support this, considering that dovecot is far ahead of qmail in every other aspect.
By the way, I use it for much the same tasks as you do: for spam reporting, but also for certain e-mails that need to be put into another database besides the mailbox.
Regards, Mikkel
On Wed, 5 Dec 2007, Nicolas Boullis wrote:
It's on my todo list in order to support user-site SPAM training.
BTW: Could you please add the example from your first message to the plugin tgz? That way the usage info is with the package.
Bye,
Steffen Kaiser
participants (7): Ben Schumacher, Johannes Berg, mikkel@euro123.dk, Nicolas Boullis, Nikolay Shopik, Steffen Kaiser, Timo Sirainen