[Dovecot] sieve filtering setup - dovecot - dovecot.org

[Dovecot] sieve filtering setup

Jonathan Siegle

30 Jul 2009

4:39 p.m.

I'm looking at implementing sieve in my environment. Software is:

dovecot-1.2-sieve revision 1022:3c9a22c28156
dovecot-1.2 revision 9269:a303bb82c1c9
AIX 5.3 with sendmail MTA using the prescribed deliver LDA

I have a few questions. I'll have 110k sieve files (1 for each user).
Does sieve read the file each time a new message is accepted by
sendmail? Are there any measurements on CPU load for sieve filters?

Thanks, Jonathan


Stephan Bosch

30 Jul

6:12 p.m.

Jonathan Siegle wrote:

> I'm looking at implementing sieve in my environment. Software is: dovecot-1.2-sieve revision 1022:3c9a22c28156, dovecot-1.2 revision 9269:a303bb82c1c9, AIX 5.3 with sendmail MTA using the prescribed deliver LDA.
>
> I have a few questions. I'll have 110k sieve files (1 for each user). Does sieve read the file each time a new message is accepted by sendmail? Are there any measurements on CPU load for sieve filters?

If all is configured correctly, the Sieve scripts are compiled once each time they are changed or created. After that, the deliver LDA only reads the compiled binary from disk for each message. Since the LDA is called separately for each message, there is no way to keep the Sieve binary in memory between messages.
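To make the "compiled once each time they are changed" behaviour concrete, here is a minimal Python sketch of one way such a check could work: compare the modification times of the script and its compiled binary. This is an illustration only, not the plugin's actual code; the function name and the file names used with it are assumptions.

```python
import os


def needs_recompile(script_path: str, binary_path: str) -> bool:
    """Return True if the compiled binary is missing or older than the script.

    Hypothetical helper: the real Sieve plugin may decide this differently.
    """
    try:
        binary_mtime = os.stat(binary_path).st_mtime_ns
    except FileNotFoundError:
        # No compiled binary yet, so the script has to be compiled.
        return True
    # Recompile only when the script changed after the binary was written.
    return os.stat(script_path).st_mtime_ns > binary_mtime
```

Under this scheme a delivery with an unchanged script costs two stat() calls plus one read of the binary; the script source itself is never opened.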

I've not seen any benchmarks thus far.

Regards,

-- Stephan Bosch stephan@rename-it.nl


Jonathan Siegle

6:30 p.m.

On Jul 30, 2009, at 11:12 AM, Stephan Bosch wrote:

> Jonathan Siegle wrote:
>
>> I'm looking at implementing sieve in my environment. Software is: dovecot-1.2-sieve revision 1022:3c9a22c28156, dovecot-1.2 revision 9269:a303bb82c1c9, AIX 5.3 with sendmail MTA using the prescribed deliver LDA.
>>
>> I have a few questions. I'll have 110k sieve files (1 for each user). Does sieve read the file each time a new message is accepted by sendmail? Are there any measurements on CPU load for sieve filters?
>
> If all is configured correctly, the Sieve scripts are compiled once each time they are changed or created. After that, the deliver LDA only reads the compiled binary from disk for each message.

I recently found out about something called memcached. The goal of
memcached (server)[1] and libmemcached (client library)[2] is to store
key/value maps in memory. So my key would be "jsiegle_sieve" and my
data would be my sieve file. So instead of the 10-20 million reads to
disk, we would just pull from memory. The logic looks like this:

On update of a sieve file:
1. Validate the file.
2. Compile the file.
3. Delete the key if it exists and add the new one.

On new mail, the sieve plugin would call memcached_get() and get the
value. I could be very wrong, but I think this is a big win. memcached
is designed for small items (<1MB). So if each of my 110k users has a
2KB file, that would only be about 220MB of memory for usage.
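The update and lookup flow described above can be sketched as follows. This is a hedged illustration in Python, not libmemcached code: `FakeCache` is a dict-backed stand-in for a memcached client, and `on_script_update`, `on_new_mail`, `cache_key` and `estimated_cache_bytes` are hypothetical names. The validate and compile steps are passed in as callables because the thread does not specify them.

```python
from typing import Callable, Dict, Optional


class FakeCache:
    """Dict-backed stand-in for a memcached client (hypothetical API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def cache_key(user: str) -> str:
    # Key layout taken from the example in the mail: "jsiegle_sieve".
    return f"{user}_sieve"


def on_script_update(cache: FakeCache, user: str, source: str,
                     validate: Callable[[str], bool],
                     compile_script: Callable[[str], bytes]) -> bool:
    """Validate, compile, then replace the cached entry. False if invalid."""
    if not validate(source):
        return False  # keep whatever was cached before
    compiled = compile_script(source)
    key = cache_key(user)
    cache.delete(key)  # delete the key if it exists ...
    cache.set(key, compiled)  # ... and add the new value
    return True


def on_new_mail(cache: FakeCache, user: str) -> Optional[bytes]:
    """Fetch the cached entry for a delivery; None means a cache miss."""
    return cache.get(cache_key(user))


def estimated_cache_bytes(users: int, bytes_per_user: int) -> int:
    """Payload size only; ignores any per-item overhead in the cache."""
    return users * bytes_per_user
```

For 110,000 users with 2 KB each, `estimated_cache_bytes(110_000, 2048)` gives 225,280,000 bytes, i.e. roughly 215 MiB of payload before any per-item overhead.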

[1] http://www.danga.com/memcached/
[2] http://tangent.org/552/libmemcached.html

-Jonathan


Timo Sirainen

8:29 p.m.

On Thu, 2009-07-30 at 11:30 -0400, Jonathan Siegle wrote:

> I recently found out about something called memcached. The goal of
> memcached(server)[1] and libmemcached(client library)[2] is to store
> maps in memory of tokens. So my key would be "jsiegle_sieve" and my
> data would be my sieve file. So instead of the 10-20 million reads to
> disk, we would just pull from memory.

Then again, if you have enough memory your OS could be doing that automatically already. Or maybe if the Sieve plugin supports giving a separate path to downloaded scripts, the destination could be in a ramdisk or if you're using Linux: http://memcachefs.sourceforge.net/
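One rough way to check whether the OS file cache is already absorbing these reads is to time repeated reads of a small file. The sketch below is only a starting point under stated assumptions: the helper name, the 2 KB file size and the repeat count are made up for illustration, and it measures warm-cache reads only. A cold-cache comparison would need OS-specific steps that are not shown here.

```python
import os
import tempfile
import time


def time_repeated_reads(path: str, repeats: int) -> float:
    """Return the average seconds per open + read + close of `path`."""
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as f:
            f.read()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Create a 2 KB file standing in for one user's compiled script.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 2048)
    try:
        avg = time_repeated_reads(tmp.name, 10_000)
        print(f"average per read: {avg * 1e6:.1f} microseconds")
    finally:
        os.unlink(tmp.name)
```

If the average stays in the range of a few microseconds, the reads are being served from memory and an extra caching layer is unlikely to gain much.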

Also I'm hopefully going to abstract out the filesystem access code in Dovecot's index files and dbox code. Sieve could use this same FS API, and you could implement whatever backend to actually perform the FS access, like memcached.
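The idea of a shared FS API with interchangeable backends can be sketched like this. It is a Python illustration of the general pattern only; `ScriptStore`, `DiskStore`, `MemoryStore` and `load_binary` are hypothetical names and do not describe Dovecot's eventual API. The in-memory backend stands in for something like memcached.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class ScriptStore(ABC):
    """Minimal storage interface that callers code against."""

    @abstractmethod
    def read(self, name: str) -> bytes: ...

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...


class DiskStore(ScriptStore):
    """Backend that keeps each item as a file under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def read(self, name: str) -> bytes:
        return (self._root / name).read_bytes()

    def write(self, name: str, data: bytes) -> None:
        (self._root / name).write_bytes(data)


class MemoryStore(ScriptStore):
    """Backend that keeps items in a dict (stand-in for memcached)."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self._items[name]  # raises KeyError if the item is missing

    def write(self, name: str, data: bytes) -> None:
        self._items[name] = data


def load_binary(store: ScriptStore, user: str) -> bytes:
    """Caller code is identical whichever backend is plugged in."""
    return store.read(f"{user}.bin")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for store in (DiskStore(Path(d)), MemoryStore()):
            store.write("jsiegle.bin", b"compiled")
            print(type(store).__name__, load_binary(store, "jsiegle"))
```

The point of the design is that the delivery code never needs to know where the bytes come from, so a memcached backend could be added without touching it.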


Jonathan Siegle

9:16 p.m.

On Jul 30, 2009, at 1:29 PM, Timo Sirainen wrote:

> On Thu, 2009-07-30 at 11:30 -0400, Jonathan Siegle wrote:
>
>> I recently found out about something called memcached. The goal of memcached(server)[1] and libmemcached(client library)[2] is to store maps in memory of tokens. So my key would be "jsiegle_sieve" and my data would be my sieve file. So instead of the 10-20 million reads to disk, we would just pull from memory.
>
> Then again, if you have enough memory your OS could be doing that automatically already.

Good point. AIX does love to use memory to cache filesystems. I think
I'll try to do some formal testing. The hooks appear to be fairly easy
to put into the sieve code.

> Or maybe if the Sieve plugin supports giving a separate path to downloaded scripts, the destination could be in a ramdisk or if you're using Linux: http://memcachefs.sourceforge.net/
>
> Also I'm hopefully going to abstract out the filesystem access code in Dovecot's index files and dbox code. Sieve could use this same FS API, and you could implement whatever backend to actually perform the FS access, like memcached.

Are you saying that it isn't possible to put the memcached hooks in
now until this work is done? It looks like I just put hooks into
sieve_script_init() to read from memcached before disk.

-Jonathan


Timo Sirainen

9:18 p.m.

On Thu, 2009-07-30 at 14:16 -0400, Jonathan Siegle wrote:

>> Or maybe if the Sieve plugin supports giving a separate path to downloaded scripts, the destination could be in a ramdisk or if you're using Linux: http://memcachefs.sourceforge.net/
>>
>> Also I'm hopefully going to abstract out the filesystem access code in Dovecot's index files and dbox code. Sieve could use this same FS API, and you could implement whatever backend to actually perform the FS access, like memcached.
>
> Are you saying that it isn't possible to put the memcached hooks in
> now until this work is done? It looks like I just put hooks into
> sieve_script_init() to read from memcached before disk.

I'm just saying what I think is a good long term solution. It's up to Stephan if he wants to add memcached-specific code. And of course you can always just do your own local modifications.


Stephan Bosch

2 Aug

1:24 p.m.

Jonathan Siegle wrote:

> On Jul 30, 2009, at 1:29 PM, Timo Sirainen wrote:
>
>> On Thu, 2009-07-30 at 11:30 -0400, Jonathan Siegle wrote:
>>
>>> I recently found out about something called memcached. The goal of memcached(server)[1] and libmemcached(client library)[2] is to store maps in memory of tokens. So my key would be "jsiegle_sieve" and my data would be my sieve file. So instead of the 10-20 million reads to disk, we would just pull from memory.
>>
>> Then again, if you have enough memory your OS could be doing that automatically already.
>
> Good point. AIX does love to use memory to cache filesystems. I think I'll try to do some formal testing. The hooks appear to be fairly easy to put into the sieve code.
>
>> Or maybe if the Sieve plugin supports giving a separate path to downloaded scripts, the destination could be in a ramdisk or if you're using Linux: http://memcachefs.sourceforge.net/

It doesn't do that now, but when this becomes important, I'll be able to add this quickly.

>> Also I'm hopefully going to abstract out the filesystem access code in Dovecot's index files and dbox code. Sieve could use this same FS API, and you could implement whatever backend to actually perform the FS access, like memcached.

Ok, that would be nice.

> Are you saying that it isn't possible to put the memcached hooks in now until this work is done? It looks like I just put hooks into sieve_script_init() to read from memcached before disk.

That would miss your goal, simply because the sieve_script object refers to the actual script file, which is only used during compilation. If stat() calls are also a concern, you will need to patch this too, but your main target should be the sieve_binary, since that is used to access the compiled binary each time a message is delivered.
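In other words, the cache lookup belongs on the per-delivery path that loads the compiled binary, with the disk read as a fallback. A minimal read-through sketch in Python follows; `load_binary_cached` and the key suffix are hypothetical, and the plain dict stands in for an external cache server, which is needed in practice because the LDA is a separate process for each message.

```python
from typing import Callable, Dict


def load_binary_cached(user: str,
                       cache: Dict[str, bytes],
                       read_from_disk: Callable[[str], bytes]) -> bytes:
    """Return the user's compiled binary, reading disk only on a cache miss."""
    key = f"{user}_sieve_binary"  # hypothetical key layout
    data = cache.get(key)
    if data is None:
        # Cache miss: fall back to the on-disk binary and remember it.
        data = read_from_disk(user)
        cache[key] = data
    return data
```

A cached entry has to be deleted or replaced whenever the script is recompiled, otherwise deliveries would keep running the stale binary.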

My priorities are not such that I am prepared to implement this myself any time soon, but I am willing to accept a patch if you want to build this yourself.

Regards,

-- Stephan Bosch stephan@rename-it.nl
