[Dovecot] Expire plugin / expire-tool redesign for doveadm
Expire plugin / expire-tool seems annoyingly inflexible currently, so I was thinking about a more generic redesign:
The expire plugin keeps track of the oldest message in configured mailboxes. Its only configuration is the list of tracked mailboxes. There's no configuration like what the expire timeouts are or anything.
There will be a new doveadm command:
doveadm expunge [-u user | -A] <mailbox> <search query>
So when you want to expunge all mails from Trash older than 1 week for all users, you say:
doveadm expunge -A Trash savedbefore 1w
This command works even without expire plugin. To optimize it to avoid looking into all users' Trash mailbox, there's a new expire doveadm plugin, which can use the expire dict to filter out users who don't have anything to expunge.
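The filtering idea can be sketched roughly as follows. This is only an illustration, not Dovecot code: the `expire_dict` layout (a mapping from `(user, mailbox)` to the save time of the oldest message) and the function name `users_to_visit` are assumptions made for the example.

```python
import time

def users_to_visit(expire_dict, mailbox, max_age_seconds, now=None):
    """Return the users whose oldest message in `mailbox` is past the cutoff.

    expire_dict is a hypothetical mapping:
        (user, mailbox) -> save time of the oldest message (Unix seconds)
    Users that are missing from the dict, or whose oldest message is still
    too new, are skipped without opening their mailbox.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        user
        for (user, mbox), oldest in expire_dict.items()
        if mbox == mailbox and oldest < cutoff
    )

DAY = 24 * 3600
now = 1_000_000_000
example = {
    ("alice", "Trash"): now - 8 * DAY,   # older than a week: must be visited
    ("bob", "Trash"): now - 1 * DAY,     # nothing to expunge yet
    ("carol", "INBOX"): now - 30 * DAY,  # different mailbox
}
visit = users_to_visit(example, "Trash", 7 * DAY, now)
```

With such a filter, only the users in `visit` would need their Trash opened; everyone else is skipped on the strength of one dict lookup.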
This also supports another feature that the plugin can optimize:
doveadm move -A INBOX Archive/2009/INBOX since 2009-01-01 before 2010-01-01
And since people have had problems waiting for expire dict to fill, there could be also a new parameter that does the filling immediately.
TODO:
- dbox altmove feature will be removed from expire plugin. It may need another plugin, or some other way to configure expire plugin for it. It's a special case anyway.
- When using multiple mailboxes it would be more optimal to handle all mailboxes for a user at once, rather than using separate doveadm commands. Maybe the command syntax needs some more thinking to support this. Different mailboxes could have different rules though..
On Tue, 13 Apr 2010, Timo Sirainen wrote:
The expire plugin keeps track of the oldest message in configured mailboxes. Its only configuration is the list of tracked mailboxes. There's no configuration like what the expire timeouts are or anything.
My first idea, which hit me after reading through the whole post, was to keep track of all mailboxes of all users and issue the commands later. Does the plugin save just one value per physical mailbox, when a) the first message arrives in an empty mailbox or b) messages are expunged? Or each time a message is received? What would be the impact of monitoring all mailboxes?
There will be a new doveadm command:
doveadm expunge [-u user | -A] <mailbox> <search query>
What about shared mailboxes and namespaces? Can "<mailbox>" be a physical path rather than an IMAP mailbox name?
So when you want to expunge all mails from Trash older than 1 week for all users, you say:
doveadm expunge -A Trash savedbefore 1w
TODO:
- When using multiple mailboxes it would be more optimal to handle all mailboxes for a user at once, rather than using separate doveadm commands. Maybe the command syntax needs some more thinking to support this. Different mailboxes could have different rules though..
Hmm, does doveadm support reading commands from stdin or a file? If I imagine such a scenario with some hundreds of users having configured expire for some folders, that leads to plenty of process creations etc.
doveadm <<EOT
expunge -A Trash savedbefore 1w
expunge -u user1 INBOX savedbefore 1y
expunge -u user2 INBOX savedbefore 1m
EOT
Regards,
Steffen Kaiser
On 14.4.2010, at 11.04, Steffen Kaiser wrote:
On Tue, 13 Apr 2010, Timo Sirainen wrote:
The expire plugin keeps track of the oldest message in configured mailboxes. Its only configuration is the list of tracked mailboxes. There's no configuration like what the expire timeouts are or anything.
My first idea, which hit me after reading through the whole post, was to keep track of all mailboxes of all users and issue the commands later. Does the plugin save just one value per physical mailbox, when a) the first message arrives in an empty mailbox or b) messages are expunged? Or each time a message is received? What would be the impact of monitoring all mailboxes?
Looks like the current code does a dict lookup every time messages are saved. I guess this is mainly to fix the situation when expire plugin is enabled after messages already exist in mailbox. But this check could be moved to doveadm I think.. So the dict would only be updated when:
- message arrives on empty mailbox
- first message is expunged
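A minimal sketch of that update rule, assuming a hypothetical in-memory `ExpireTracker` in place of the real expire dict (the class and method names are invented for illustration and are not the plugin's API):

```python
class ExpireTracker:
    """Hypothetical stand-in for the expire dict, for one user."""

    def __init__(self):
        self.oldest = {}  # mailbox name -> save time of the oldest message
        self.writes = 0   # how many times the dict was touched

    def on_save(self, mailbox, mailbox_was_empty, save_time):
        # Only a message arriving in an empty mailbox changes the oldest time.
        if mailbox_was_empty:
            self.oldest[mailbox] = save_time
            self.writes += 1

    def on_expunge(self, mailbox, expunged_was_first, new_oldest_time):
        # Expunging anything but the first (oldest) message changes nothing.
        if not expunged_was_first:
            return
        if new_oldest_time is None:
            # Mailbox became empty: nothing left to expire.
            self.oldest.pop(mailbox, None)
        else:
            self.oldest[mailbox] = new_oldest_time
        self.writes += 1

tracker = ExpireTracker()
tracker.on_save("Trash", mailbox_was_empty=True, save_time=100)
tracker.on_save("Trash", mailbox_was_empty=False, save_time=200)  # no dict write
tracker.on_expunge("Trash", expunged_was_first=True, new_oldest_time=200)
```

In this sketch, saving into a non-empty mailbox costs no dict access at all, which is the point of moving the "messages already existed" check elsewhere.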
But I don't think it's useful to do this for all mailboxes. Even if it doesn't impact normal performance, it wastes disk space on the database for no reason. But expire plugin already supports wildcards, so just telling it to monitor "*" mailboxes would get what you want.
There will be a new doveadm command:
doveadm expunge [-u user | -A] <mailbox> <search query>
What about shared mailboxes and namespaces? Can "<mailbox>" be a physical path rather than an IMAP mailbox name?
No. By shared mailboxes I guess you mean public (i.e. not another user's mailboxes)? You can expunge them using some user that has rights to do that, and using e.g. public/blah as the mailbox name.
- When using multiple mailboxes it would be more optimal to handle all mailboxes for a user at once, rather than using separate doveadm commands. Maybe the command syntax needs some more thinking to support this. Different mailboxes could have different rules though..
Hmm, does doveadm support reading commands from stdin or a file?
I was actually thinking about that myself today too.
If I imagine such a scenario with some hundreds of users having configured expire for some folders, that leads to plenty of process creations etc.
doveadm <<EOT
expunge -A Trash savedbefore 1w
expunge -u user1 INBOX savedbefore 1y
expunge -u user2 INBOX savedbefore 1m
EOT
I don't think the process creation is much of an issue. But something like this could work:
doveadm expunge -A <<EOT
Trash savedbefore 1w
Trash/* savedbefore 1w
Spam savedbefore 2mon
EOT
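To make the proposed stdin format concrete, here is a rough sketch of how such rule lines could be read. It is an assumption-laden illustration: `parse_rules` is a made-up name, and Python's `shlex` stands in for whatever parser doveadm would really use.

```python
import shlex

def parse_rules(text):
    """Split each non-empty line into (mailbox, search query words).

    The first word is taken as the mailbox name (or wildcard pattern),
    everything after it as the search query for that mailbox.
    """
    rules = []
    for line in text.splitlines():
        words = shlex.split(line)
        if not words:
            continue  # skip blank lines
        rules.append((words[0], words[1:]))
    return rules

rules = parse_rules(
    "Trash savedbefore 1w\n"
    "Trash/* savedbefore 1w\n"
    "Spam savedbefore 2mon\n"
)
```

All rules for one user could then be applied in a single pass over that user's mailboxes, rather than one doveadm invocation per rule.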
Another problem I had been thinking about was how to do the search query parsing. For example with fetch:
doveadm fetch (unseen or unanswered) subject "hello world"
That won't work, because the shell complains about the () characters. But they could be escaped with \( and \). The next problem is the quotes. If I know that a string is supposed to follow after subject, it's not really a problem. But I'd rather not build separate parsers for command line parsing and stdin parsing (because in stdin I need to handle "" internally). But without knowing the context, I don't know if:
doveadm fetch subject "(foo)"
should be translated to string "(foo)" or if there should be a list with foo as its only element. So maybe the whole thing should be instead:
doveadm fetch '(unseen or unanswered) subject "hello world"'
but that makes it more difficult to add variables, because if you do:
doveadm fetch 'subject "$foo"'
you need to make sure $foo escapes " and \ characters.
.. difficult..
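For the single-quoted variant, the escaping of $foo could look something like this sketch. The helper name `imap_quote` is hypothetical; it follows the IMAP quoted-string rule that only `"` and `\` need a backslash, and rejects characters that a quoted string cannot carry.

```python
def imap_quote(value):
    """Return value as an IMAP-style quoted string.

    Backslash and double quote are escaped with a backslash. CR, LF and NUL
    cannot be represented in a quoted string, so they are rejected here.
    """
    if any(c in value for c in "\r\n\0"):
        raise ValueError("CR, LF and NUL cannot appear in a quoted string")
    # Escape backslashes first so the ones added for quotes are not doubled.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

foo = 'say "hello" \\ world'
query = "subject " + imap_quote(foo)
```

The order of the two `replace` calls matters: escaping quotes first would cause the backslashes just inserted to be escaped again.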
On Wed, 2010-04-14 at 12:04 +0300, Timo Sirainen wrote:
I don't think the process creation is much of an issue. But something like this could work:
doveadm expunge -A <<EOT
Trash savedbefore 1w
Trash/* savedbefore 1w
Spam savedbefore 2mon
EOT
The above syntax would probably have to use IMAP parser.
doveadm fetch subject "(foo)"
This works nowadays though. It's possible to use:
doveadm fetch INBOX \( subject "(foo)" seen \) or unseen
This works because there are now "IMAP parser" and "command line parser". The command line parser knows that after subject there must come a string, so it's not confused by the () characters. Then it also knows that when "(" or ")" comes in a separate parameter, it means a list. Actually it would have been possible to support \(subject without space after \(, but this won't work with the ending \):
\(subject "(foo)"\)
vs.
\(subject "(foo))"
would look identical to doveadm. So I thought it's better to always require the space.
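The ambiguity can be reproduced with a small sketch. Python's `shlex.split` is used here as an approximation of POSIX shell word splitting, and `parse_args` with its `STRING_KEYS` set is a hypothetical toy, not doveadm's actual command line parser.

```python
import shlex

# Hypothetical set of search keys that are always followed by one string.
STRING_KEYS = {"subject", "from", "to"}

def parse_args(argv):
    """Build a nested list from already-split command line arguments.

    An argument that is exactly "(" or ")" opens or closes a sub-list.
    The argument after a string key is taken literally, so its own
    parentheses are never interpreted.
    """
    stack = [[]]
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.lower() in STRING_KEYS and i + 1 < len(argv):
            stack[-1].extend([arg, argv[i + 1]])
            i += 2
            continue
        if arg == "(":
            stack.append([])
        elif arg == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced )")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(arg)
        i += 1
    if len(stack) != 1:
        raise ValueError("unbalanced (")
    return stack[0]

# With spaces around \( and \), the structure is recoverable.
spaced = shlex.split(r'\( subject "(foo)" seen \) or unseen')
tree = parse_args(spaced)

# Without the space before \), both spellings give the same arguments.
a = shlex.split(r'\(subject "(foo)"\)')
b = shlex.split(r'\(subject "(foo))"')
```

Here `a` and `b` are the same argument list, because the quoting is gone by the time the program sees its arguments, which is why requiring the space is the simpler rule.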
but that makes it more difficult to add variables, because if you do:
doveadm fetch 'subject "$foo"'
$variables also work nicely now without having to escape them.
Now, the next problem is how to select what to fetch and what the output format should look like. I'm thinking about:
doveadm fetch INBOX "flags uid hdr.received hdr.from body" all
would look like:
===sep
flags: \seen \draft $Label1
uid: 1234
hdr:
Received: stuff
Received: more stuff
From: tss@iki.fi

body:
message body
..
===sep
flags: ..next message..
===sep
The ===sep is a randomly generated separator string that begins always with "===", optionally it could be given as parameter. I was wondering about how to return hdr.* fields, if they should be returned separately or all in one "hdr". Otherwise separate fields would be nice, but if the header exists multiple times, it's not so clear anymore how it should be written. So if there's a single hdr then it's at least easy to understand that it ends with an empty line.
Besides the example parameters above there could be "hdr" = full header and "text" = alias for "hdr body".
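One way the separator idea could work is sketched below. The function names are invented for the example and the exact layout is an assumption based on the sample output above: the separator is regenerated until it does not collide with any line of the messages, and a reader learns it from the first line of the output.

```python
import secrets

def make_separator(messages):
    """Return a random line starting with "===" that no message contains as a line."""
    while True:
        sep = "===" + secrets.token_hex(8)
        if all(sep not in m.splitlines() for m in messages):
            return sep

def join_records(messages, sep):
    """Write the separator before, between and after the records."""
    out = [sep]
    for m in messages:
        out.append(m)
        out.append(sep)
    return "\n".join(out) + "\n"

def split_records(text):
    """Read the separator from the first line, then split on its later occurrences."""
    lines = text.splitlines()
    sep, records, current = lines[0], [], []
    for line in lines[1:]:
        if line == sep:
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records

messages = [
    "flags: \\seen\nuid: 1234\nbody:\nmessage body",
    "flags:\nuid: 1235\nbody:\n===this line only looks like a separator",
]
sep = make_separator(messages)
output = join_records(messages, sep)
```

Because the reader takes the separator from the first line, a body line that merely starts with "===" does not end the record.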
It would be nice also to support something like:
doveadm search INBOX from tss@iki.fi
doveadm next|less
doveadm next|less
..etc..
So the "next" would return the next matching message based on the previous search. I'm not really sure where the state could be kept though. Would be nice if it was terminal-specific, and would be nice if it didn't write any temporary files.
On Tue, 20 Apr 2010 18:49:10 +0300 Timo Sirainen wrote:
It would be nice also to support something like:
doveadm search INBOX from tss@iki.fi
doveadm next|less
doveadm next|less
..etc..
So the "next" would return the next matching message based on the previous search. I'm not really sure where the state could be kept though. Would be nice if it was terminal-specific, and would be nice if it didn't write any temporary files.
Maybe you should take exim's queue style for such operations. Here are a few examples:
#exim -bp //return queue in format:
4d 99K 1O36Eu-0001Py-Ts <mykolenko.s@domain.com> is_uzh@domain.ltd
#doveadm search INBOX from tss@iki.fi
<...id-1>
<...id-2>
........
<...id-n>
Will return just unique ids for the current mailbox. Maybe also a few headers/flags etc.
#doveadm fetch <...id-2> all
Print the full message for the id obtained from doveadm search.
#doveadm fetch <...id-2> hdr.from hdr.subject
Print the corresponding header values.
Exim also has the -Mvl option, which prints the log lines for the given id. It would be nice if:
#doveadm log <id>
found all service records for the message and printed them to the terminal.
With the above scheme you don't need to construct a separator, <id...> will be enough, and doveadm next won't be necessary.
On Tue, 20 Apr 2010, Timo Sirainen wrote:
doveadm fetch INBOX "flags uid hdr.received hdr.from body" all
would look like:
===sep
flags: \seen \draft $Label1
uid: 1234
hdr:
Received: stuff
Received: more stuff
From: tss@iki.fi

body:
message body
..
===sep
flags: ..next message..
===sep
The ===sep is a randomly generated separator string that begins always with "===", optionally it could be given as parameter. I was wondering about how to return hdr.* fields, if they should be returned separately or all in one "hdr". Otherwise separate fields would be nice, but if the header exists multiple times, it's not so clear anymore how it should be written. So if there's a single hdr then it's at least easy to understand that it ends with an empty line.
I would output the headers the same way they are in the message, maybe with an option to unfold lines.
How about LDIF-like syntax:
<tag>: <text>
<tag>: <text>
 continued line
So:
flags: \seen \draft $Label1
uid: 1234
hdr: Received: stuff
hdr: Received: more stuff
hdr: From: tss@iki.fi
 continued line of From:
body: line 1 (8bit probably encoded??)
 line 2
 line 3
<<really blank line separating items>>
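Such output would be simple to consume. The sketch below assumes, as in LDIF, that a continuation line starts with a single space and that a truly empty line ends a record; `parse_records` is a name made up for this illustration.

```python
def parse_records(text):
    """Parse "<tag>: <text>" lines into a list of records.

    Each record is a list of (tag, value) pairs in their original order, so a
    repeated tag such as "hdr" is preserved. A line starting with a space
    continues the previous value; an empty line ends the record.
    """
    records, current = [], []
    for line in text.splitlines():
        if line == "":
            if current:
                records.append(current)
                current = []
        elif line.startswith(" ") and current:
            tag, value = current[-1]
            current[-1] = (tag, value + "\n" + line[1:])
        else:
            tag, _, rest = line.partition(":")
            # Drop the single space that follows the colon, if present.
            value = rest[1:] if rest.startswith(" ") else rest
            current.append((tag, value))
    if current:
        records.append(current)
    return records

sample = (
    "flags: \\seen \\draft $Label1\n"
    "uid: 1234\n"
    "hdr: Received: stuff\n"
    "hdr: From: tss@iki.fi\n"
    " continued line of From:\n"
    "body: line 1\n"
    " line 2\n"
    "\n"
    "flags: \n"
    "uid: 1235\n"
)
records = parse_records(sample)
```

An empty line inside a body would have to be written as a line holding only the single leading space, so that it is not mistaken for the end of the record.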
It would be nice also to support something like:
doveadm search INBOX from tss@iki.fi
doveadm next|less
doveadm next|less
I would agree with Nikita to not use this scheme, but something you can parse easily and put into a loop, say
doveadm search INBOX from me@example.com | while read uid trail; do
  doveadm fetch "$uid"
done
I'm not really sure where the state could be kept though. Would be nice if it was terminal-specific, and would be nice if it didn't write any temporary files.
;-)
Regards,
Steffen Kaiser
On Wed, 2010-04-21 at 09:26 +0200, Steffen Kaiser wrote:
doveadm search INBOX from me@example.com | while read uid trail; do
  doveadm fetch "$uid"
done
This works now:
doveadm search from me@example.com | while read mailbox uid; do
  doveadm fetch text mailbox-guid $mailbox uid $uid > msg.$uid
done
also note how it supports searching/fetching from multiple mailboxes :)
participants (3)
- Nikita Koshikov
- Steffen Kaiser
- Timo Sirainen