OK, an update on the progress with this.

I finally settled on a Python script that does the stripping, based on the code here:
   http://code.activestate.com/recipes/302086-strip-attachments-from-an-email-message/
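
For anyone who just wants the gist, here is a minimal sketch of the stripping idea. It uses Python's standard email package rather than the recipe's exact code, and the function name strip_attachments and the placeholder wording are illustrative assumptions. It only touches parts marked "Content-Disposition: attachment", which may be narrower than what the recipe does.

```python
import email
from email import policy


def strip_attachments(raw: bytes) -> bytes:
    """Return the message with each attachment part replaced by a short text note."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    # Materialise the walk first, because parts are rewritten in place.
    for part in list(msg.walk()):
        # Containers (multipart/*, message/rfc822) are left alone in this sketch.
        if part.is_multipart() or not part.is_attachment():
            continue
        name = part.get_filename() or "unnamed"
        ctype = part.get_content_type()
        size = len(part.get_payload(decode=True) or b"")
        # set_content() drops the old Content-* headers and payload before
        # writing the placeholder as a text/plain part.
        part.set_content(
            f"[attachment removed: {name} ({ctype}, {size} bytes)]\n"
        )
    return msg.as_bytes()
```

The result is a new byte string; writing it back over the maildir file changes the file size, which is what leads to the index problems described further down.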

Then a bash script uses 'find' to select candidate files and pass them to the Python script, e.g.

    find "$DIR" -type f -mtime +"$OLDERTHANDAYS" -size +"$LARGERTHAN" ! -name 'dovecot*'
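
The same selection can be sketched in Python, which may be handy if the filtering ever needs to move into the stripping script itself. This is only an approximation of the find command above: the function name candidate_files is made up, and the size threshold is taken in plain bytes, whereas find's -size accepts unit suffixes.

```python
import os
import stat
import time


def candidate_files(top, older_than_days, larger_than_bytes):
    """Yield regular files under top that are old enough, big enough, and not dovecot* files."""
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.startswith("dovecot"):
                continue  # roughly: ! -name 'dovecot*'
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # roughly: -type f
            # find -mtime +N compares whole days, ignoring the fraction.
            age_days = int((now - st.st_mtime) // 86400)
            if age_days > older_than_days and st.st_size > larger_than_bytes:
                yield path
```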

After a bit of debugging to do with UTF characters etc., I seem to have got the script working, and it will now process a directory or an entire account without complaining. My coding is not good, but if anyone wants a copy, contact me off-list, to spare my blushes.

I'm now hitting an issue when I check the stripped emails in Thunderbird over IMAP. The mails were already cached in Thunderbird and indexed by Dovecot on the server, so the cached copies no longer match the files on disk. I've been trying to figure out the minimum I need to do to get Thunderbird to pick up the changes.

The errors in the logs were:

Apr 05 12:15:33 imap(user@domain.com) Error: Corrupted record in index cache file /mail/path/dovecot.index.cache: UID 1298: Broken physical size in mailbox INBOX: read(/mail/path/cur/1615880838.M742750P25731.mail.domain.com,S=12893560,W=13061037:2,Se) failed: Cached message size larger than expected (12893560 > 2937, box=INBOX, UID=1298)
Apr 05 12:15:33 imap(user@domain.com): Info: FETCH read() failed in=10718 out=7471947 deleted=0 expunged=0 trashed=0 hdr_count=1647 hdr_bytes=645910 body_count=448 body_bytes=6371591
Apr 05 12:15:36 imap(user@domain.com): Error: Corrupted record in index cache file /mail/path/dovecot.index.cache: UID 1298: Broken physical size in mailbox INBOX: read(/mail/path/cur/1615880838.M742750P25731.mail.domain.com,S=12893560,W=13061037:2,Se) failed: Cached message size larger than expected (12893560 > 2937, box=INBOX, UID=1298)

It seems the only way to get both sides back in sync is to disconnect, delete all dovecot.* files on the server, delete all Thunderbird cache files on the PC, and then reconnect and wait for both to rebuild. Does that seem correct?
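
For the server side, a cautious way to see what "delete all dovecot.* files" would actually touch is to list the matches first. The sketch below only lists and deletes nothing; the function name dovecot_index_files is illustrative. Note that the glob dovecot.* does not match dovecot-uidlist, since that name has a hyphen rather than a dot after "dovecot".

```python
import fnmatch
import os


def dovecot_index_files(maildir):
    """Return a sorted list of files under maildir whose names match dovecot.*"""
    found = []
    for dirpath, _dirnames, filenames in os.walk(maildir):
        for name in filenames:
            if fnmatch.fnmatch(name, "dovecot.*"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Printing the returned list and checking it by eye before removing anything seems safer than a blind delete.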


Finally, and relatedly, the maildir filenames on the server carry a size field, e.g. S=12893560. Is it possible to regenerate the filenames with the new, correct sizes?
If I leave them alone, will it affect anything?
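
To make the question concrete, here is a rough sketch of how a corrected filename could be computed. It assumes that S= is the file size in bytes and W= is the size the message would have with CRLF line endings; that is my understanding of the fields, not something confirmed here. The function name corrected_name is made up, and the sketch only computes the new name without renaming anything, because I don't know whether Dovecot would treat a renamed file as the same message or as a new one.

```python
import os
import re


def corrected_name(path):
    """Return the basename of path with S= and W= rewritten to match the file's contents."""
    with open(path, "rb") as fh:
        data = fh.read()
    size = len(data)
    # Each bare LF (not preceded by CR) would gain one byte as CRLF.
    bare_lf = len(re.findall(rb"(?<!\r)\n", data))
    vsize = size + bare_lf
    name = os.path.basename(path)
    # Keep the flags after ':' (e.g. "2,Se") untouched.
    base, sep, flags = name.partition(":")
    base = re.sub(r",S=\d+", f",S={size}", base)
    base = re.sub(r",W=\d+", f",W={vsize}", base)
    return base + sep + flags
```

Comparing corrected_name(path) with the existing basename would at least show which files carry stale size fields.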

P.