OK, an update on the progress with this.
I finally settled on a Python script that does the stripping,
based on the code here:
http://code.activestate.com/recipes/302086-strip-attachments-from-an-email-message/
and a bash script that selects candidate files with 'find' and
passes them to the Python script, e.g.
find "$DIR" -type f -mtime +"$OLDERTHANDAYS" -size +"$LARGERTHAN" \
    ! -name 'dovecot*'
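
For anyone who would rather do the selection in Python too, here is
a rough sketch of the same filter. The function name and its
parameters are made up for illustration, and it only approximates
find: -mtime rounds ages to whole days, and -size without a unit
suffix counts 512-byte blocks, whereas this compares seconds and
bytes directly.

```python
import fnmatch
import os
import stat
import time


def candidate_files(root, older_than_days, larger_than_bytes, now=None):
    """Yield regular files under root that are both old and large.

    Hypothetical helper: approximates the find expression, it is not
    an exact reimplementation of find's rounding rules.
    """
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400  # seconds in a day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Leave Dovecot's own index, cache and uidlist files alone.
            if fnmatch.fnmatch(name, "dovecot*"):
                continue
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and other non-regular files
            if st.st_mtime < cutoff and st.st_size > larger_than_bytes:
                yield path
```

Something like candidate_files("/mail/path", 365, 5 * 1024 * 1024)
would then list messages over a year old and larger than 5 MB.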
After a bit of debugging to do with UTF characters etc, I seem to
have got the script working and it will process a directory or
entire account without complaining. My coding is not good, but if
anyone wants a copy, contact me off list, to spare my blushes.
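
To give an idea of the approach (this is not my actual script, just
a minimal sketch using the standard library email package), the
stripping step can look roughly like this. The function name and the
wording of the replacement note are invented for the example, and it
only handles parts explicitly marked as attachments.

```python
import email
from email import policy


def strip_attachments(raw_bytes):
    """Replace each attachment part with a short text note.

    Returns (new_message_bytes, list_of_removed_filenames).
    Illustrative only: inline parts and attached message/rfc822
    containers are left untouched.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    removed = []
    # Take a snapshot of the parts before modifying any of them.
    for part in list(msg.walk()):
        if part.is_multipart():
            continue  # containers hold no data of their own
        if part.get_content_disposition() != "attachment":
            continue
        name = part.get_filename() or "unnamed"
        removed.append(name)
        # set_content() clears the old Content-* headers and payload
        # and turns the part into a small text/plain placeholder.
        part.set_content("[attachment removed: %s]\n" % name)
    return msg.as_bytes(), removed
```

The result would be written to a temporary file and moved over the
original, which is exactly what leaves the S= size in the file name
and Dovecot's cached size out of date.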
I'm now seeing a problem when I check the stripped emails in
Thunderbird over IMAP. The mails were already cached in
Thunderbird, and indexed by dovecot on the server. I've been
trying to figure out the minimum I need to do to get Thunderbird
to pick up the changes.
The errors in the logs were:
Apr 05 12:15:33 imap(user@domain.com) Error: Corrupted record in
index cache file /mail/path/dovecot.index.cache: UID 1298: Broken
physical size in mailbox INBOX:
read(/mail/path/cur/1615880838.M742750P25731.mail.domain.com,S=12893560,W=13061037:2,Se)
failed: Cached message size larger than expected (12893560 >
2937, box=INBOX, UID=1298)
Apr 05 12:15:33 imap(user@domain.com): Info: FETCH read() failed
in=10718 out=7471947 deleted=0 expunged=0 trashed=0 hdr_count=1647
hdr_bytes=645910 body_count=448 body_bytes=6371591
Apr 05 12:15:36 imap(user@domain.com): Error: Corrupted record in
index cache file /mail/path/dovecot.index.cache: UID 1298: Broken
physical size in mailbox INBOX:
read(/mail/path/cur/1615880838.M742750P25731.mail.domain.com,S=12893560,W=13061037:2,Se)
failed: Cached message size larger than expected (12893560 >
2937, box=INBOX, UID=1298)
It seems the only way to get both sides back in sync is to
disconnect, delete all dovecot.* files on the server, delete all
Thunderbird cache files on the PC, and then reconnect and wait
for both to rebuild their indexes. Does that seem correct?
Finally, and relatedly, the maildir file names on the server
include a size field, e.g. S=12893560. Is it possible to rename
the files so that this field reflects the new, correct size?
If I leave the old values in place, will it affect anything?
P.