[Dovecot] Trash expire plugin
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
It also supports using Dovecot's lib-dict to keep track of the mailboxes and their oldest mail's timestamp, so that the nightly cronjob can quickly go through only those mailboxes that have those old mails. Berkeley DB support has been added to lib-dict for this purpose, but you could use SQL as well.
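To make the bookkeeping concrete, here is a minimal sketch in Python rather than Dovecot's C. A plain dict stands in for lib-dict, and the "user/mailbox" key layout and both function names are assumptions made for illustration, not the plugin's actual format.

```python
DAY = 24 * 60 * 60

# Stand-in for lib-dict: "user/mailbox" -> timestamp of the oldest mail.
# The key layout is a hypothetical choice for this sketch.
oldest_mail = {}

def note_saved(user, mailbox, saved_at):
    """Record a saved mail; keep only the oldest timestamp per mailbox."""
    key = f"{user}/{mailbox}"
    if key not in oldest_mail or saved_at < oldest_mail[key]:
        oldest_mail[key] = saved_at

def mailboxes_due(expire_days, now):
    """Return the keys whose oldest mail is past its mailbox's expire time."""
    due = []
    for key, oldest in oldest_mail.items():
        mailbox = key.split("/", 1)[1]
        days = expire_days.get(mailbox)
        if days is not None and now - oldest >= days * DAY:
            due.append(key)
    return sorted(due)
```

The nightly job would then open only the mailboxes returned by mailboxes_due() instead of walking every user's Trash and Spam.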
There's however one problem with it that I could use opinions on. Currently there's no way to know how long a mail has been in a mailbox. There are pretty much 3 ways to handle this:
1) When copying to those Trash/Spam mailboxes the message's INTERNALDATE could be replaced with the current time. That way the INTERNALDATE tells how long the message has been in there. The obvious downside of this is that if the user wants to move the message back to another mailbox, the original INTERNALDATE is lost (INTERNALDATE means the time when the message was received and is often the timestamp used to sort messages).
2) Preserve the INTERNALDATE while copying, but still use it anyway. Often the messages are moved to Trash (and especially to Spam) soon after they have been received. If not, then the user has time until the next nightly cronjob to move the messages back before they're lost.
3) Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
I think either 1) or 3) would be the best solution. Probably 3), but if index files don't exist nearly permanently, it's not very reliable.
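The practical difference between the three options can be shown with a small Python sketch. The function names are hypothetical and the code only models which timestamp the expire clock counts from. Options 1 and 3 give the same expiry result and differ only in whether the original INTERNALDATE survives.

```python
from datetime import datetime, timedelta

def expiry_start(option, internaldate, copied_at):
    """Return the timestamp the expire clock counts from, per option."""
    if option == 1:
        # INTERNALDATE is overwritten with the copy time.
        return copied_at
    if option == 2:
        # The original INTERNALDATE is kept and used as-is.
        return internaldate
    if option == 3:
        # Separate copy-time field in the index; INTERNALDATE untouched.
        return copied_at
    raise ValueError(f"unknown option: {option}")

def is_expired(option, internaldate, copied_at, now, days):
    """True if the message is due for expunging at time `now`."""
    start = expiry_start(option, internaldate, copied_at)
    return now - start >= timedelta(days=days)

# A message received on June 1 and moved to Trash on June 20,
# checked by the nightly run on June 21 with a 7 day limit.
received = datetime(2006, 6, 1)
copied = datetime(2006, 6, 20)
night = datetime(2006, 6, 21)
for option in (1, 2, 3):
    print(option, is_expired(option, received, copied, night, 7))
```

Only option 2 treats this message as expired on the first night, which is the case where the user has until the next cronjob to move it back.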
Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied.
This is the best solution, to my mind, as it has the extreme benefit of doing what makes the most sense to the enduser. In other words, I receive a message today and move it to Trash immediately. My sysadmin has helpfully told me that messages older than 7 days will be deleted from Trash. The act of copying it to Trash starts the clock running. Moving the file out of Trash and back in the next day restarts the clock. Anything that messes with INTERNALDATE will inevitably be wrong for the indecisive user.
I was going to argue for option #2, but then I realized that if I sit on a message for weeks and then Trash it, it would make more sense that it would be deleted 7 days hence, not that night. Sites running without indexes shouldn't use this feature (or possibly this feature should be disabled if the appropriate index doesn't exist).
John
-- John Peacock Director of Information Research and Technology Rowman & Littlefield Publishing Group 4501 Forbes Boulevard Suite H Lanham, MD 20706 301-459-3366 x.5010 fax 301-429-5748
On Thu, 8 Jun 2006, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Hmm, I'd prefer variant 3) with:
if there _should_ be an index (aka it was destroyed and recreated): restart the timer; if there is NO index configured at all, use the INTERNALDATE.
However, I'd store all information only found in the indexes into the Maildir filenames as well, so the indexes can be recreated from the filenames, incl. keywords and such.
Bye,
-- Steffen Kaiser
On Thu, 2006-06-08 at 17:33 +0300, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Actually for maildir the time is in file's ctime, so that would be a good fallback. For dbox I could add a new field if people think it is useful (might be useful for other purposes also?).
mbox users are out of luck though. For them the fallback could be the current date, so as long as the indexes usually stay for longer than the expire days, this works well enough.
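The fallback order described here could look roughly like the following Python sketch. The function name, its parameters and the "maildir"/"mbox" labels are assumptions for illustration and are not taken from Dovecot's code.

```python
import os
import time

def expire_clock_start(index_copy_time, storage_format, path=None, now=None):
    """Pick the timestamp the expire clock counts from.

    index_copy_time: copy time from the index, or None if the index is missing.
    storage_format: "maildir" or "mbox" (hypothetical labels for this sketch).
    path: the message file, used only for the maildir fallback.
    """
    if index_copy_time is not None:
        return index_copy_time
    if storage_format == "maildir" and path is not None:
        # The file's ctime serves as the maildir fallback.
        return os.stat(path).st_ctime
    # mbox: nothing better is available, so the clock starts now.
    return time.time() if now is None else now
```

With the mbox fallback a message is never expired early; it just gets a fresh timer whenever the index is lost.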
Hi Timo.
We've finally managed to get all of the storage issues resolved... using flock with GFS on a SAN and things seem stable and fast.
I noticed Dovecot doesn't track usage when IMAP quotas are unlimited. We'd really like to be able to see utilization for accounts that have an unlimited quota (this is also how CommuniGate works). What would be involved in adding this?
Thanks,
Steve
On Thu, 2006-06-08 at 12:13 -0400, Apps Lists wrote:
I noticed Dovecot doesn't track usage when IMAP quotas are unlimited. We'd really like to be able to see utilization for accounts that have an unlimited quota (this is also how CommuniGate works). What would be involved in adding this?
Perhaps I could change it so that if storage=0 is given, it tracks the quota but doesn't enforce it. If the storage=n is omitted completely, it doesn't even track it.
I'm not sure what to do if maildirsize file contains 0S. Should Dovecot still bother tracking it, or only if storage=0 parameter is given to the quota plugin?
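The proposed meaning of the storage parameter can be summarized in a short Python sketch. The function names are hypothetical, and the unit of the storage value is left open because the thread does not state it.

```python
def quota_policy(storage=None):
    """Map the quota plugin's storage parameter to (track, limit).

    storage=None  -> parameter omitted: usage is not tracked at all
    storage=0     -> usage is tracked, but no limit is enforced
    storage=n > 0 -> usage is tracked and limited to n
    """
    if storage is None:
        return (False, None)
    if storage == 0:
        return (True, None)
    return (True, storage)

def may_save(usage, size, storage=None):
    """Return True if a message of the given size may be saved."""
    _, limit = quota_policy(storage)
    return limit is None or usage + size <= limit
```

The open question about a maildirsize file containing 0S is whether that case should behave like storage=0 or like an omitted parameter.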
Hi Timo.
It seems that maildrop continues tracking storage used even if the quota is set to zero. I am currently stuffing ULLONG_MAX-1 in there now so that quotas are tracked. This is working perfectly at the moment within Dovecot, but breaks Maildrop. The maildir++ specification says bytes are int, and Maildrop sees ULLONG from Dovecot and refuses to deliver! Why they chose to implement a quota that maxes out at 2 gigabytes is beyond me. I much prefer the storage type in Dovecot, and will probably end up having to modify Maildrop to use it.
It might be advantageous to make the "track quota with 0 quota" a compile-time option.
Steve
On Thu, 2006-06-08 at 18:32 +0300, Timo Sirainen wrote:
On Thu, 2006-06-08 at 17:33 +0300, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Actually for maildir the time is in file's ctime, so that would be a good fallback. For dbox I could add a new field if people think it is useful (might be useful for other purposes also?).
Ctime will change if you rename the file - don't some operations do that?
mbox users are out of luck though. For them the fallback could be the current date, so as long as the indexes usually stay for longer than the expire days, this works well enough.
The recent sql discussion made me have an off-the-wall idea... Have you considered adapting the cyrus file structure as a back-end option for dovecot? Normally wanting mbox or maildir formats would be part of the reason for choosing dovecot instead of cyrus, but if you are going to add alternatives, why not take one where someone else has already fixed the bugs, the delivery agent already exists, and there are probably some file management tools already developed?
-- Les Mikesell lesmikesell@gmail.com
On Thu, 2006-06-08 at 11:45 -0500, Les Mikesell wrote:
On Thu, 2006-06-08 at 18:32 +0300, Timo Sirainen wrote:
On Thu, 2006-06-08 at 17:33 +0300, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Actually for maildir the time is in file's ctime, so that would be a good fallback. For dbox I could add a new field if people think it is useful (might be useful for other purposes also?).
Ctime will change if you rename the file - don't some operations do that?
Yes. But if you're doing operations (flag changes mostly) for the mail, I think it might be even good that the timer gets reset.
mbox users are out of luck though. For them the fallback could be the current date, so as long as the indexes usually stay for longer than the expire days, this works well enough.
The recent sql discussion made me have an off-the-wall idea... Have you considered adapting the cyrus file structure as a back-end option for dovecot?
Well, the Cyrus format's index files are somewhat similar to Dovecot's index files. There's no point in duplicating those, so either with Cyrus backend the Dovecot indexes wouldn't be used or the Cyrus indexes wouldn't be used.
Not using Cyrus indexes could be done in two ways: 1) Keep them updated, just don't really use them. Not exactly optimal since disk I/O is still wasted on them. 2) Don't keep them updated, which breaks all other Cyrus backend users.
Not using Dovecot indexes would be possible, but to make it work fast it would require rewriting a lot of Dovecot code to work differently. Really not worth the trouble.
On Thu, Jun 08, 2006 at 06:32:35PM +0300, Timo Sirainen wrote:
On Thu, 2006-06-08 at 17:33 +0300, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Actually for maildir the time is in file's ctime, so that would be a good fallback. For dbox I could add a new field if people think it is useful (might be useful for other purposes also?).
Are you saying you are using ctime for the INTERNALDATE (or as its fallback)? I would think mtime would be better for that, since it shouldn't change with a folder move or other non-data-altering manipulations. And that would leave ctime free for the fallback for the expiry clock: since it *does* change with a folder move, flag change, etc, it would match what people would expect. Unless I'm wrong :)
No opinions on non-maildir, you'd have to make something up.
mm
On Thu, 2006-06-08 at 15:11 -0400, Mark E. Mallett wrote:
On Thu, Jun 08, 2006 at 06:32:35PM +0300, Timo Sirainen wrote:
On Thu, 2006-06-08 at 17:33 +0300, Timo Sirainen wrote:
- Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
Actually for maildir the time is in file's ctime, so that would be a good fallback. For dbox I could add a new field if people think it is useful (might be useful for other purposes also?).
Are you saying you are using ctime for the INTERNALDATE (or as its fallback)? I would think mtime would be better for that, since it shouldn't change with a folder move or other non-data-altering manipulations.
mtime is already used for that.
And that would leave ctime free for the fallback for the expiry clock: since it *does* change with a folder move, flag change, etc, it would match what people would expect. Unless I'm wrong :)
Exactly what I meant. :)
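The mtime/ctime behaviour discussed above is easy to check with a small Python script. It is only a sketch: the file names are made up, the rename stands in for a maildir flag change, and the ctime semantics shown are those of Unix-like systems (on Windows st_ctime means creation time).

```python
import os
import tempfile
import time

def stamps(path):
    """Return (mtime, ctime) for a file."""
    st = os.stat(path)
    return st.st_mtime, st.st_ctime

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "msg")
    with open(old, "w") as f:
        f.write("test\n")
    # Pretend the message was delivered 30 days ago.
    delivered = time.time() - 30 * 86400
    os.utime(old, (delivered, delivered))
    mtime_before, _ = stamps(old)

    # A maildir flag change renames the file; a plain new name stands in here.
    new = os.path.join(d, "msg.renamed")
    os.rename(old, new)
    mtime_after, ctime_after = stamps(new)

    print("mtime unchanged by rename:", mtime_before == mtime_after)
    print("ctime newer than mtime:", ctime_after > mtime_after)
```

This is why mtime suits INTERNALDATE while ctime suits the expire clock.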
On 6/8/06, Timo Sirainen tss@iki.fi wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
Just wanted to let you know how grateful I'll be when this comes in!
Thanks a million Timo for making such a great package, and bringing useful functions in that most people want!
Tim
/me goes to work out how to add a message above the Spam and Trash folders in SM, to reflect this new plugin
Linux Counter user #273956
Timo Sirainen wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
It also supports using Dovecot's lib-dict to keep track of the mailboxes and their oldest mail's timestamp, so that the nightly cronjob can quickly go through only those mailboxes that have those old mails. Berkeley DB support has been added to lib-dict for this purpose, but you could use SQL as well.
There's however one problem with it that I could use opinions on. Currently there's no way to know how long a mail has been in a mailbox. There are pretty much 3 ways to handle this:
1) When copying to those Trash/Spam mailboxes the message's INTERNALDATE could be replaced with the current time. That way the INTERNALDATE tells how long the message has been in there. The obvious downside of this is that if the user wants to move the message back to another mailbox, the original INTERNALDATE is lost (INTERNALDATE means the time when the message was received and is often the timestamp used to sort messages).
2) Preserve the INTERNALDATE while copying, but still use it anyway. Often the messages are moved to Trash (and especially to Spam) soon after they have been received. If not, then the user has time until the next nightly cronjob to move the messages back before they're lost.
3) Add a new field to index file which contains the date when the message was copied. This could be done only for those Trash/Spam boxes so they don't waste space for other mailboxes. The problem with this is that it won't work if there are no index files. To overcome that problem the time could be stored in yet another place, such as maildir filename and some new mbox header. I don't see supporting those as worth the trouble though, so the fallback could be either the date index file is created or the messages' INTERNALDATE.
I think either 1) or 3) would be the best solution. Probably 3), but if index files don't exist nearly permanently, it's not very reliable.
Timo,
I think #3 is the closest to a good solution. I read through the thread but decided to reply to the original post.
In the case of the mbox backend, when a message is deleted, the actual contents of the message have to be altered to change the status: flag, right? (It's been a long time since I've used mbox, so I could be off base on this.) What if you (optionally) added an X-Dovecot-Reserved: header to the email with the last relevant date information, to use as a backup if the indexes don't exist?
I think the maildir format issues have been pretty well hashed out and using ctime (mtime?) as the fallback would probably be reasonable.
I really like this idea as it could help me help my users manage their quotas by forcing the trashes to be deleted on a regular basis.
my 2yen worth
Alan
Timo Sirainen wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
I'm looking forward to this. In the meantime, for messages in Maildir format, is it safe to delete message files using a simple cron script based on "find" with a specified mtime?
For mbox style I've used the common scripts "archivemail" and "expiremail.pl", and I'm looking for something similar to use with Maildir.
Thanks, Mark
On Fri, 2006-06-09 at 14:55 -0700, Mark Nienberg wrote:
Timo Sirainen wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
I'm looking forward to this. In the meantime, for messages in Maildir format, is it safe to delete message files using a simple cron script based on "find" with a specified mtime?
I'd use ctime instead. See the mtime vs ctime discussion in this thread. :)
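As a stopgap along those lines, a ctime-based scan could be sketched in Python as below. The function name is hypothetical, the cur/ and new/ layout is the standard Maildir one, and the sketch only lists candidates. Whether removing them behind the server's back is acceptable on a given setup is not something this code decides.

```python
import os
import time

def old_maildir_files(maildir, days, now=None):
    """List message files in cur/ and new/ whose ctime is older than `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    found = []
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for entry in os.scandir(folder):
            # ctime rather than mtime, so the clock follows the last move.
            if entry.is_file() and entry.stat().st_ctime < cutoff:
                found.append(entry.path)
    return sorted(found)
```

A cron job would call this once per Trash or Spam maildir and act on the returned paths.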
On Thu, 8 Jun 2006, Timo Sirainen wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
It would be good to have user-specific settings for this. E.g. our users use plenty of MUAs, each with its own idea of where to store SPAM, Junk, etc.
(I also would like to configure some users to expire _all_ folders or at least the INBOX :-)
Bye,
-- Steffen Kaiser
On Mon, 2006-06-12 at 11:07 +0200, Steffen Kaiser wrote:
On Thu, 8 Jun 2006, Timo Sirainen wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
It would be good to have user-specific settings for this. E.g. our users use plenty of MUAs, each with its own idea of where to store SPAM, Junk, etc.
Well, one possibility would be to just list them all in there. If the mailbox doesn't exist it's just ignored.
Alternatively you really can make that setting user-specific simply by returning the "expire" from userdb. Then it overrides the plugin-setting.
(I also would like to configure some users to expire _all_ folders or at least the INBOX :-)
Well, I guess expiring "all" would be possible somewhat easily, but I don't know if I should bother adding that. :) INBOX can of course be listed just as well as other boxes. Hmm. Adding support for "*" wouldn't be too difficult though.
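How such an expire value might be interpreted, including the "*" idea, is sketched below in Python. Both function names are hypothetical, and the parser assumes mailbox names without spaces, which the thread does not address.

```python
def parse_expire(setting):
    """Parse "Trash 7 Spam 30" into {"Trash": 7, "Spam": 30}."""
    words = setting.split()
    if len(words) % 2 != 0:
        raise ValueError(f"expected mailbox/days pairs: {setting!r}")
    return {words[i]: int(words[i + 1]) for i in range(0, len(words), 2)}

def expire_days(rules, mailbox):
    """Days for a mailbox, falling back to a "*" rule if one is present."""
    return rules.get(mailbox, rules.get("*"))
```

Listing every MUA's junk folder name works with this as well, since a rule for a mailbox that doesn't exist simply never matches.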
On Mon, 12 Jun 2006, Timo Sirainen wrote:
Alternatively you really can make that setting user-specific simply by returning the "expire" from userdb. Then it overrides the plugin-setting.
Hey, this is exactly what I meant. The announcement read to me as if it would have global settings only.
Bye,
-- Steffen Kaiser
On Mon, 2006-06-12 at 12:12 +0200, Steffen Kaiser wrote:
On Mon, 12 Jun 2006, Timo Sirainen wrote:
Alternatively you really can make that setting user-specific simply by returning the "expire" from userdb. Then it overrides the plugin-setting.
Hey, this is exactly what I meant. The announcement read to me as if it would have global settings only.
All the settings in dovecot.conf can actually be overridden from userdb. The settings are normally passed in the environment as key=value. After they're set in the environment, the extra-args returned by userdb are processed and put into the environment, so they can override the default settings.
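The override order can be pictured with a few lines of Python. This is only a model of the ordering described above. The function name and the lowercase "expire" key are assumptions, not Dovecot's real environment variable names.

```python
def build_environment(conf_settings, userdb_extra):
    """Apply dovecot.conf defaults first, then userdb extra fields on top."""
    env = {}
    for key, value in conf_settings.items():
        env[key] = value
    for key, value in userdb_extra.items():
        # Processed second, so a userdb value replaces the default.
        env[key] = value
    return env

defaults = {"expire": "Trash 7 Spam 30"}
print(build_environment(defaults, {"expire": "Junk 14"}))
print(build_environment(defaults, {}))
```

A user whose userdb entry returns no expire field keeps the global setting.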
On 6/8/06, Timo Sirainen tss@iki.fi wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
Timo, how is progress on this? Is there anything we can do to help, cause I'm really interested in this.
Thanks
Tim
Linux Counter user #273956
Timothy White wrote:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
Timo, how is progress on this? Is there anything we can do to help, cause I'm really interested in this.
If you can't wait then take a look at imapfilter: http://imapfilter.hellug.gr/
Example ~/.imapfilter/config.lua:
#v+
mbox = {
    server = '127.0.0.1',
    username = 'blahblah',
    password = '**********',
}

-- Messages that have been read and were sent more than a month ago
seen_one_month_ago = {
    'seen',
    'sentbefore ' .. date_before(32),
}

check(mbox, 'Trash')
results = match(mbox, 'Trash', seen_one_month_ago)
delete(mbox, 'Trash', results)
#v-
Cheers,
-- Klaus Alexander Seistrup Copenhagen · Denmark http://surdej.dk/
On 6/24/06, Klaus Alexander Seistrup kseistrup@gmail.com wrote:
If you can't wait then take a look at imapfilter: http://imapfilter.hellug.gr/
Nah, I need something server side that I can run for all the users... If worst comes to worst, I'll just implement a cron job that uses ctime (I think that was decided as best) and runs through all the Trash and Spam folders.
Tim
Linux Counter user #273956
On Sat, 2006-06-24 at 10:00 +0800, Timothy White wrote:
On 6/8/06, Timo Sirainen tss@iki.fi wrote:
Dovecot will soon have a plugin which allows running a nightly cronjob to expunge mails from configured mailboxes which have been in there for a configurable amount of time. For example the configuration could be:
plugin {
  # Trash 7d, Spam 30d
  expire = Trash 7 Spam 30
}
Timo, how is progress on this? Is there anything we can do to help, cause I'm really interested in this.
Do you want it to expunge when user logs in or in some nightly runs? Currently it supports only doing it via nightly runs, which also means keeping a database of users' mailboxes and their expire times.
The plugin part of it is basically done, but the nightly run binary needs a bit of work. The plugin hasn't been tested beyond "it compiles", though.
On 6/25/06, Timo Sirainen tss@iki.fi wrote:
Timo, how is progress on this? Is there anything we can do to help, cause I'm really interested in this.
Do you want it to expunge when user logs in or in some nightly runs? Currently it supports only doing it via nightly runs, which also means keeping a database of users' mailboxes and their expire times.
Nightly run is fine. What kind of database are you looking at? I saw something about lib-dict, but can we choose the back end for that?
The plugin part of it is basically done, but the nightly run binary needs a bit of work. The plugin hasn't been tested beyond "it compiles", though.
I'll test the plugin for you if you want. It seems your plugin is in a similar state to mine, having the plugin side done, and not the extra binary! (Although I have tested my plugin, and it works).
Tim
Linux Counter user #273956
On Sun, 2006-06-25 at 08:25 +0800, Timothy White wrote:
On 6/25/06, Timo Sirainen tss@iki.fi wrote:
Timo, how is progress on this? Is there anything we can do to help, cause I'm really interested in this.
Do you want it to expunge when user logs in or in some nightly runs? Currently it supports only doing it via nightly runs, which also means keeping a database of users' mailboxes and their expire times.
Nightly run is fine. What kind of database are you looking at? I saw something about lib-dict, but can we choose the back end for that?
lib-dict, yes. You can use either Berkeley DB or MySQL as the backend (although the MySQL code still needs some fixing to work).
The plugin part of it is basically done, but the nightly run binary needs a bit of work. The plugin hasn't been tested beyond "it compiles", though.
I'll test the plugin for you if you want. It seems your plugin is in a similar state to mine, having the plugin side done, and not the extra binary! (Although I have tested my plugin, and it works).
Well, I'll try to get the plugin into CVS soon.
participants (10)
- alan premselaar
- Apps Lists
- John Peacock
- Klaus Alexander Seistrup
- Les Mikesell
- Mark E. Mallett
- Mark Nienberg
- Steffen Kaiser
- Timo Sirainen
- Timothy White