For example: store the attachments individually when they first come in, then every night at 3:00am do a precise comparison of all the attachments that came in that day and run delete_duplicate->add_link on every duplicate found.
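To make the nightly pass concrete, here is a rough Python sketch of the idea, not anything taken from an actual implementation. It assumes the day's attachments sit as plain files in one directory on a filesystem that supports hard links; the function names and the ".dedup-tmp" suffix are made up for illustration. Files are grouped by size and SHA-256 first, then compared byte for byte (the "precise comparison") before a duplicate is swapped for a hard link.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_attachments(directory):
    """Replace byte-identical files in `directory` with hard links.

    Hypothetical helper: returns how many files were replaced by links.
    """
    # Group candidate duplicates by (size, digest).
    by_key = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not os.path.islink(path):
            by_key[(os.path.getsize(path), file_digest(path))].append(path)

    linked = 0
    for paths in by_key.values():
        keeper = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already a link to the same file
            if not filecmp.cmp(keeper, dup, shallow=False):
                continue  # digest matched but bytes differ; keep both
            # Create the link under a temporary name, then rename it over
            # the duplicate so the attachment path never goes missing.
            tmp = dup + ".dedup-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, dup)
            linked += 1
    return linked
```

Run from cron at 3:00am against the directory holding that day's attachments, this would leave every path in place while identical files end up sharing one copy on disk. Pointing it at an existing mailstore's attachment directory is the same operation, just over more files.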
That could be a possibility too. Although that way delivery would use more disk I/O than really needed for the shared attachments.
As long as it wasn't permanent, I really don't see that as an issue, but of course, *ideally* if it could happen at delivery time that would obviously be best.
But I still really like the idea of being able to process an existing mailstore (since we have a huge one), rather than only processing new messages.
What are the chances of providing for both? Meaning 'process on delivery', and/or 'delayed process' (i.e. nightly) / 'process existing' (process an existing mailstore)?
Maybe even complicate it and provide a way of testing how busy the server is: if it is too busy when a message comes in, delay processing, but if it isn't very busy, process immediately?
;)
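A rough Python sketch of that "process now or defer" decision, purely as an illustration. It assumes the 1-minute load average per CPU is a good enough measure of "busy", and the 0.7 threshold and both function names are invented here. os.getloadavg() is only available on Unix-like systems.

```python
import os


def should_process_now(max_load_per_cpu=0.7, load_1min=None, cpu_count=None):
    """Return True if the server looks idle enough to process at delivery.

    The load and CPU count can be passed in explicitly (handy for testing);
    otherwise they are read from the operating system.
    """
    if load_1min is None:
        load_1min = os.getloadavg()[0]  # 1-minute load average, Unix only
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return load_1min / cpu_count <= max_load_per_cpu


def handle_attachment(path, process, deferred, **load_kwargs):
    """Process `path` immediately, or queue it for the delayed (nightly) run."""
    if should_process_now(**load_kwargs):
        process(path)
        return "processed"
    deferred.append(path)
    return "deferred"
```

Anything that lands in the deferred list would simply be picked up by the nightly job, so the two modes would share the same processing code.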
--
Best regards,
Charles