[Dovecot] maildir maintenance?
Hi, I'm running version 1.2.15 (so no doveadm) with around 6000 maildir users, some of which are very large. For completeness, the details of the setup are as follows:
- The maildirs are stored via NFS.
- The indexes are on a volume local to the dovecot server.
- Only one IMAP server currently.
- A separate sendmail/procmail server delivers via NFS.
I recently wrote the attached script (could probably be improved) meant in theory to be run as a nightly cron job to do the following:
- Remove old messages marked as deleted. (I didn't really like the expunge/trash plugins the last time I looked at them.)
- Rebuild the indexes to account for those changes. (So that the users aren't delayed by dovecot rebuilding them.)
- Rebuild FTS indexes. (Again to avoid delay.)
- Rebuild the maildirsize file. (It seems to become slightly inaccurate over time.)
To the devs, I'm wondering if this seems sane?
To other users of dovecot, I'm wondering what sort of maintenance operations, if any, you tend to do in your setups?
Thanks, Brian
On 24.11.2010, at 2.55, Brian Kroth wrote:
> Hi, I'm running version 1.2.15 (so no doveadm)
You could build Dovecot v2.0 and only use the doveadm binary from it.
> I recently wrote the attached script (could probably be improved) meant in theory to be run as a nightly cron job to do the following: .. To the devs, I'm wondering if this seems sane?
Looks about ok. The main thing I'm worried about is what happens if a user creates mailboxes containing " or ' or ` characters.
Timo Sirainen tss@iki.fi 2010-11-24 18:01:
> On 24.11.2010, at 2.55, Brian Kroth wrote:
>> Hi, I'm running version 1.2.15 (so no doveadm)
> You could build Dovecot v2.0 and only use the doveadm binary from it.
Does it just issue the command via IMAP? No direct filesystem operations?
>> I recently wrote the attached script (could probably be improved) meant in theory to be run as a nightly cron job to do the following: .. To the devs, I'm wondering if this seems sane?
> Looks about ok. The main thing I'm worried about is what happens if a user creates mailboxes containing " or ' or ` characters.
Yeah, that was mostly me being lazy about dealing with escaping, so I just ignored them.
In what I originally wrote, I think it just won't touch them.
Or is the issue that the find command might remove them and then the indexes don't get fixed up? I suppose I could just make sure that the find ignores those dirs, but I thought (from other mailing list reading) that the next time their client SELECTs the folder it'll fix it up anyway.
I suppose another spin on this would be for me to script the preauth imap client to figure out which mailboxes have messages marked for deletion of such and such an age and then try to use EXPUNGE to wipe just them out. I'm not sure offhand if that's possible.
I suppose that's what doveadm already does.
Thanks, Brian
On 24.11.2010, at 18.59, Brian Kroth wrote:
>>> Hi, I'm running version 1.2.15 (so no doveadm)
>> You could build Dovecot v2.0 and only use the doveadm binary from it.
> Does it just issue the command via IMAP? No direct filesystem operations?
It's all direct filesystem operations, no IMAP. But v1.2.15 can read v2.0's index files just fine.
>> Looks about ok. The main thing I'm worried about is what happens if a user creates mailboxes containing " or ' or ` characters.
> Yeah, that was mostly me being lazy about dealing with escaping, so I just ignored them.
> In what I originally wrote, I think it just won't touch them.
> Or is the issue that the find command might remove them and then the indexes don't get fixed up? I suppose I could just make sure that the find ignores those dirs, but I thought (from other mailing list reading) that the next time their client SELECTs the folder it'll fix it up anyway.
I was more thinking what happens if the user creates a mailbox called
rm -rf /
or something.. Also if there are " or \ characters I think the LIST output will use literals and your parsing will break more or less badly.
> I suppose another spin on this would be for me to script the preauth imap client to figure out which mailboxes have messages marked for deletion of such and such an age and then try to use EXPUNGE to wipe just them out. I'm not sure offhand if that's possible.
That would be a bit difficult at least to do via IMAP..
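The worry about mailbox names such as "rm -rf /" applies to any script that pastes names into a shell command string. A minimal Python sketch of the usual way around it follows; it is an illustration only, not the attached script, and the helper name `child_sees` is hypothetical. The name is handed to the child process as a single argv element, so no shell ever parses it, and `shlex.quote` is shown for the case where a shell string cannot be avoided.

```python
import shlex
import subprocess
import sys


def child_sees(value):
    """Pass value to a child process as ONE argv element and return what it received.

    No shell is involved, so quotes, backticks and semicolons stay inert.
    """
    child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", value]
    return subprocess.run(child, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    hostile = 'x"; rm -rf / #`id`'
    # The child receives the name byte for byte; nothing is executed.
    print(child_sees(hostile) == hostile)
    # If a shell command string really is required, quote each argument first.
    print("ls -d " + shlex.quote(hostile))
```

The same idea carries over to Perl by calling `system` with a list of arguments instead of a single string.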
Timo Sirainen tss@iki.fi 2010-11-24 19:04:
> On 24.11.2010, at 18.59, Brian Kroth wrote:
>>>> Hi, I'm running version 1.2.15 (so no doveadm)
>>> You could build Dovecot v2.0 and only use the doveadm binary from it.
>> Does it just issue the command via IMAP? No direct filesystem operations?
> It's all direct filesystem operations, no IMAP. But v1.2.15 can read v2.0's index files just fine.
>>> Looks about ok. The main thing I'm worried about is what happens if a user creates mailboxes containing " or ' or ` characters.
>> Yeah, that was mostly me being lazy about dealing with escaping, so I just ignored them.
>> In what I originally wrote, I think it just won't touch them.
>> Or is the issue that the find command might remove them and then the indexes don't get fixed up? I suppose I could just make sure that the find ignores those dirs, but I thought (from other mailing list reading) that the next time their client SELECTs the folder it'll fix it up anyway.
> I was more thinking what happens if the user creates a mailbox called
> rm -rf /
> or something.. Also if there are " or \ characters I think the LIST output will use literals and your parsing will break more or less badly.
That's certainly true. I guess I was just hoping to skip over those mailboxes with unpleasant characters for the moment :}
More likely I'll rewrite this more carefully in Perl.
>> I suppose another spin on this would be for me to script the preauth imap client to figure out which mailboxes have messages marked for deletion of such and such an age and then try to use EXPUNGE to wipe just them out. I'm not sure offhand if that's possible.
> That would be a bit difficult at least to do via IMAP..
So I'm finding. I guess I was thinking I could find the messages in a SELECTed mailbox via some parsing of either
- UID SEARCH DELETED X-SINCE $N_days_ago (where X-SINCE searches X-SAVEDATE instead of INTERNALDATE), or
- UID FETCH 1:* (INTERNALDATE X-SAVEDATE FLAGS) as I've seen bandied about, or
- combine the two and SEARCH DELETED, then UID FETCH $initial_uid_list (X-SAVEDATE FLAGS) to refine the list.
Then use the (U?)IDs I get back from that to do
- UID EXPUNGE $uid_list
Of course I've only started researching that avenue, so maybe that's not so reasonable.
I'm starting to see why so much effort has been expended on this front.
Thanks, Brian
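The "SEARCH DELETED, then FETCH (X-SAVEDATE FLAGS) to refine" idea above can be sketched as a small filter. This is a hedged illustration in Python, not the script from the thread. It assumes the server really does answer the X-SAVEDATE fetch item, that each FETCH reply arrives as one bytes line (as it does when only UID, X-SAVEDATE and FLAGS are requested), and that month names are parsed under an English/C locale. The function name `old_deleted_uids` is made up for the example.

```python
import re
from datetime import datetime, timezone

_UID = re.compile(rb"\bUID (\d+)")
_SAVEDATE = re.compile(rb'X-SAVEDATE "([^"]+)"')
_FLAGS = re.compile(rb"FLAGS \(([^)]*)\)")


def old_deleted_uids(fetch_lines, cutoff):
    """Return UIDs that carry \\Deleted and were saved before cutoff.

    Each item is matched on its own, so the order in which the server
    returns UID, X-SAVEDATE and FLAGS does not matter.
    """
    uids = []
    for line in fetch_lines:
        uid = _UID.search(line)
        saved = _SAVEDATE.search(line)
        flags = _FLAGS.search(line)
        if not (uid and saved and flags):
            continue  # skip replies we do not fully understand
        flag_set = {flag.lower() for flag in flags.group(1).split()}
        if b"\\deleted" not in flag_set:
            continue
        stamp = saved.group(1).decode("ascii").strip()
        when = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S %z")
        if when < cutoff:
            uids.append(int(uid.group(1)))
    return uids


if __name__ == "__main__":
    sample = [
        b'1 (UID 5 X-SAVEDATE "01-Nov-2010 10:00:00 -0600" FLAGS (\\Deleted \\Seen))',
        b'2 (UID 6 X-SAVEDATE "23-Nov-2010 10:00:00 -0600" FLAGS (\\Deleted))',
    ]
    print(old_deleted_uids(sample, datetime(2010, 11, 20, tzinfo=timezone.utc)))
```

The resulting UIDs would then go into UID EXPUNGE, which is only available when the server advertises the UIDPLUS capability.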
Brian Kroth bpkroth@gmail.com 2010-11-24 13:28:
So, I redid this in Perl to only use IMAP rather than any sudo or find calls. In theory then one doesn't need to worry about the indexes being out of sync. I still skipped over the "strange characters" mailboxes for the moment. I'm wondering what you think of this second rendition?
The only thing I'm not quite sure about is if there's some sort of race between clients accessing/altering UIDs that may or may not get reused. But I think this one is at least clear of the problem you mentioned earlier.
In theory one would call it from cron like so:
30 0 * * * root /opt/cron/dovecot-maintenance.sh | logger -i -p mail.info -t dovecot-maintenance
That script loops over all the relevant users, calling dovecot-maintenance.pl on each in turn. It could probably even fork off a small number to run in parallel.
Thanks, Brian
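The "fork off some small number to run in parallel" idea can be sketched with a bounded worker pool. This is a rough Python illustration, not the wrapper from the thread. The location of dovecot-maintenance.pl and the assumption that it takes the user name as its only argument are guesses made for the example, and `run_all` is a hypothetical helper.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumed path and calling convention; adjust to the real script.
SCRIPT = "/opt/cron/dovecot-maintenance.pl"


def run_maintenance(user):
    """Run the per-user script; the user name is one argv element, no shell involved."""
    result = subprocess.run([SCRIPT, user], capture_output=True, text=True)
    return user, result.returncode


def run_all(users, workers=4, runner=run_maintenance):
    """Run runner for every user with at most `workers` running at once.

    Results come back in the same order as `users`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, users))
```

Keeping the worker count small matters here because every worker adds load on the NFS-backed maildirs.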
On Fri, 2010-11-26 at 09:03 -0600, Brian Kroth wrote:
> So, I redid this in Perl to only use IMAP rather than any sudo or find calls. In theory then one doesn't need to worry about the indexes being out of sync. I still skipped over the "strange characters" mailboxes for the moment. I'm wondering what you think of this second rendition?
Using some IMAP parser would be much more robust than doing it via regexps. Then you wouldn't have to worry about strange characters either.
> if ($line =~ qr/^\* (LIST|LSUB) \((\\Has(No)?Children)?\) "\/" "(.+)"\s*$/) {
Just ignore the flags in the middle, there might be others: \([^)]*\)
> if ($line =~ /^\* [0-9]+ FETCH \(UID ([0-9]+) X-SAVEDATE "([0-9]{1,2}-[A-Z][a-z][a-z]-[0-9]{4} [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9+-]+)" FLAGS \(([^)]+)\) ENVELOPE \((.*)\)\)\s*$/) {
The ENVELOPE reply might not be on one line either. For example, see what happens if the subject has a " character.
If the list of UIDs is really large, sending a command might fail because the command line is too long (imap_max_line_length setting).
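One way to stay under the line-length limit is to collapse the UIDs into ranges and split them over several commands. The Python sketch below shows the idea under stated assumptions: `max_chars` is a budget the caller picks so that the tag and command name still fit within the server's limit, and both function names are hypothetical.

```python
def to_ranges(uids):
    """Collapse UIDs into IMAP sequence-set tokens such as '1:3' and '7'."""
    tokens = []
    start = prev = None
    for uid in sorted(set(uids)):
        if prev is not None and uid == prev + 1:
            prev = uid  # extend the current run
            continue
        if start is not None:
            tokens.append(str(start) if start == prev else f"{start}:{prev}")
        start = prev = uid
    if start is not None:
        tokens.append(str(start) if start == prev else f"{start}:{prev}")
    return tokens


def chunk_uid_sets(uids, max_chars):
    """Yield comma-joined sequence sets, each at most max_chars long.

    A single token longer than max_chars is still yielded on its own.
    """
    current = ""
    for token in to_ranges(uids):
        candidate = token if not current else current + "," + token
        if current and len(candidate) > max_chars:
            yield current
            current = token
        else:
            current = candidate
    if current:
        yield current
```

Each yielded string can be sent as the argument of its own UID EXPUNGE or UID FETCH command.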
Timo Sirainen tss@iki.fi 2010-11-26 15:15:
> On Fri, 2010-11-26 at 09:03 -0600, Brian Kroth wrote:
>> So, I redid this in Perl to only use IMAP rather than any sudo or find calls. In theory then one doesn't need to worry about the indexes being out of sync. I still skipped over the "strange characters" mailboxes for the moment. I'm wondering what you think of this second rendition?
> Using some IMAP parser would be much more robust than doing it via regexps. Then you wouldn't have to worry about strange characters either.
*sigh*
Should have started there. It appears there's a nice module that even has an example for talking to Dovecot via preauth: http://search.cpan.org/~jettero/Net-IMAP-Simple-1.2018/Simple.pod#PREAUTH
Thanks again, Brian
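To show why a tokenizer copes where a single regexp does not, here is a small Python sketch that reads one untagged LIST/LSUB reply, accepting any flags in the parentheses and a mailbox name sent as a quoted string, a literal, or an atom. It is an illustration only, not the CPAN module mentioned above. It assumes the whole reply, including any literal data, is already in one bytes buffer, and it does not decode modified UTF-7 names.

```python
import re

_PREFIX = re.compile(rb"\* (?:LIST|LSUB) \(([^)]*)\) ")
_LITERAL = re.compile(rb"\{(\d+)\}\r\n")
_ATOM = re.compile(rb'[^\s(){"]+')


def _read_string(raw, pos):
    """Read one quoted string, literal, or atom at pos; return (value, next_pos)."""
    if raw[pos:pos + 1] == b'"':
        out = bytearray()
        pos += 1
        while True:
            if pos >= len(raw):
                raise ValueError("unterminated quoted string")
            ch = raw[pos:pos + 1]
            if ch == b'"':
                return bytes(out), pos + 1
            if ch == b"\\":  # a backslash escapes the next byte
                pos += 1
                ch = raw[pos:pos + 1]
            out += ch
            pos += 1
    literal = _LITERAL.match(raw, pos)
    if literal:
        start = literal.end()
        size = int(literal.group(1))
        return raw[start:start + size], start + size
    atom = _ATOM.match(raw, pos)
    if not atom:
        raise ValueError("expected a string")
    return atom.group(0), atom.end()


def parse_list_response(raw):
    """Return (flags, delimiter, mailbox_name) for one LIST/LSUB reply."""
    prefix = _PREFIX.match(raw)
    if not prefix:
        raise ValueError("not a LIST/LSUB response")
    flags = prefix.group(1).split()
    delimiter, pos = _read_string(raw, prefix.end())
    if raw[pos:pos + 1] != b" ":
        raise ValueError("missing space before mailbox name")
    name, _ = _read_string(raw, pos + 1)
    return flags, delimiter, name
```

A name containing " then comes back intact whether the server quotes it with a backslash or sends it as a {N} literal.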
On Tue, 2010-11-23 at 20:55 -0600, Brian Kroth wrote:
> I recently wrote the attached script (could probably be improved) meant in theory to be run as a nightly cron job to do the following:
> - Remove old messages marked as deleted. (I didn't really like the expunge/trash plugins the last time I looked at them.)
> - Rebuild the indexes to account for those changes. (So that the users aren't delayed by dovecot rebuilding them.)
> - Rebuild FTS indexes. (Again to avoid delay.)
> - Rebuild the maildirsize file. (It seems to become slightly inaccurate over time.)
> To the devs, I'm wondering if this seems sane?
> To other users of dovecot, I'm wondering what sort of maintenance operations, if any, you tend to do in your setups?
Cleaning up spam and trash folders is really all that needs doing... I wrote this years ago and it's never let us down. It could be written entirely in Perl without system calls, but that requires a lot more code. I like keeping it simple, as it leaves little to go wrong, and I believe in the "if it ain't borked, don't bork it" kinda thing :)
use File::Path qw(remove_tree);

# Delete Trash messages whose flags are exactly "ST" (seen, trashed) after 7 days,
# and Junk messages whose flags are exactly "S" (seen) after 30 days.
system("/usr/bin/find /var/vmail/*/*/*/*/*/Maildir/.Trash/ -name \"*,ST\" -mtime +7 -print | /usr/bin/perl -nle 'unlink;'");
system("/usr/bin/find /var/vmail/*/*/*/*/*/Maildir/.Junk/ -name \"*,S\" -mtime +30 -print | /usr/bin/perl -nle 'unlink;'");

# Collect Trash subfolders not modified in over 7 days. find reports
# "No such file" on stderr, so an empty list is the only case to check.
my @olddirs = `/usr/bin/find /var/vmail/*/*/*/*/*/Maildir/.Trash.* -type d -mtime +7`;
if (!@olddirs) {
    print "nothing to do\n";
} else {
    foreach my $odirdel (@olddirs) {
        chomp $odirdel;
        print "Found: $odirdel \n";
        remove_tree($odirdel);
    }
}
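For comparison, the selection step of the cleanup above can be done without shelling out to find. The Python sketch below is a loose equivalent under stated assumptions, not a drop-in replacement: it only looks in the folder's cur/ directory, it expects the standard ":2," flag separator in file names, and its age test is a plain "older than N days" rather than find's whole-day rounding. The function name `expired_messages` is hypothetical.

```python
import time
from pathlib import Path


def expired_messages(folder, required_flags, max_age_days, now=None):
    """Yield files in folder/cur whose flags equal required_flags and that are too old.

    Flags are whatever follows ':2,' in the file name, for example 'ST'.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    cur = Path(folder) / "cur"
    if not cur.is_dir():
        return
    for entry in cur.iterdir():
        _, sep, flags = entry.name.rpartition(":2,")
        if sep and flags == required_flags and entry.stat().st_mtime < cutoff:
            yield entry
```

A caller would loop over the results and call `unlink()` on each, for example with `required_flags="ST"` and `max_age_days=7` for a Trash folder.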
participants (3)
- Brian Kroth
- Noel Butler
- Timo Sirainen