On 08 May 2020, at 12:54, asai@globalchangemusic.org wrote:
It depends on what you consider reasonable.
The processing time of any file operation that iterates through a mailbox will generally go up in proportion to the mailbox's size. If you do a text search without an indexing system such as Solr, it will take a very long time.
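To make the scan-versus-index point concrete, here is a toy sketch in Python. It is only an illustration of the idea, not how Solr or Dovecot's full-text search actually works; the function names and the `bodies` dict (message key mapped to body text) are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def linear_search(bodies, word):
    """Scan every message body; the cost grows with total mailbox size."""
    word = word.lower()
    return sorted(key for key, text in bodies.items() if word in tokenize(text))


def build_index(bodies):
    """Read every message once and build a word -> set(message keys) map."""
    index = defaultdict(set)
    for key, text in bodies.items():
        for token in tokenize(text):
            index[token].add(key)
    return index


def indexed_search(index, word):
    """Look up one word without rereading any message."""
    return sorted(index.get(word.lower(), ()))
```

The index costs one full pass to build and has to be kept up to date as mail arrives, but after that each query is a dictionary lookup instead of a pass over the whole mailbox.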
If the mailbox is just an archive that you pile things into and forget about, except for a once-in-a-blue-moon retrieval, then it might be reasonable.
If it's an active mailbox, it will be a pain to navigate, in the same way as a single folder with 100K files or a file cabinet with huge stacks of envelopes.
I would guess that partitioning the large mailboxes into smaller ones would help with active mailboxes. Most people spend most of their time on new or recent messages, so making time-, size-, or subject-based volumes wouldn't be a bad idea.
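As a rough sketch of what time-based volumes could look like, the Python below groups messages into per-year folder names from their Date header. The `Archive.<year>` naming and both function names are assumptions for illustration; the code only builds a plan and does not move anything.

```python
from collections import defaultdict
from email.utils import parsedate_to_datetime


def volume_name(msg, prefix="Archive"):
    """Return a per-year folder name such as 'Archive.2019' for a message."""
    try:
        year = parsedate_to_datetime(msg["Date"]).year
    except (TypeError, ValueError):
        # Missing or unparseable Date header.
        return f"{prefix}.undated"
    return f"{prefix}.{year}"


def plan_volumes(messages, prefix="Archive"):
    """Group message keys by target volume without touching the store.

    `messages` is any iterable of (key, message) pairs.
    """
    plan = defaultdict(list)
    for key, msg in messages:
        plan[volume_name(msg, prefix)].append(key)
    return dict(plan)
```

With the standard library's `mailbox` module, something like `plan_volumes(mailbox.Maildir(path, create=False).iteritems())` should produce the plan for an existing Maildir, where `path` is a placeholder for the real location.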
If the bulk of the size is redundant copies of attachments, then Dovecot's *dbox formats support de-duping, which would also help.
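To get a feel for whether de-duping would pay off, one could estimate how many attachment bytes are repeats. The Python below is a hypothetical sketch that hashes attachment payloads; it is not how Dovecot implements de-duplication, and the function name is invented for the example.

```python
import hashlib


def duplicate_attachment_bytes(messages):
    """Return (total_bytes, redundant_bytes) over all attachment payloads.

    An attachment seen N times contributes N - 1 copies to redundant_bytes.
    `messages` is any iterable of email.message.Message objects.
    """
    seen = set()
    total = redundant = 0
    for msg in messages:
        for part in msg.walk():
            if part.get_content_disposition() != "attachment":
                continue
            # decode=True undoes base64/quoted-printable; multipart
            # containers return None, which is treated as empty here.
            payload = part.get_payload(decode=True) or b""
            digest = hashlib.sha256(payload).digest()
            total += len(payload)
            if digest in seen:
                redundant += len(payload)
            else:
                seen.add(digest)
    return total, redundant
```

If `redundant` is a large fraction of `total`, de-duping is likely worth looking into; if it is small, the size is coming from somewhere else.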
So, generally speaking, you don't want to have inboxes that just sync all day long because of the massive number of small files in them? That may be OK for a rarely accessed archive folder, but not for regularly accessed inboxes?
Not really, since most GUI clients keep all the folders synced, so moving files to different, smaller mailboxes doesn’t reduce the number of files accessed.
The issue is that most file systems don’t deal well with a folder that has millions of files in it.
But with mbox, each “folder” is a single file, and making a single multi-GB text file that has to be parsed is definitely an issue on any file system.
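A minimal Python sketch of why the single-file layout hurts: with no index, even counting the messages in an mbox means reading the whole file and looking for the "From " separator lines. This is a simplified reader that assumes body lines starting with "From " have been escaped as ">From ", as the common mbox variants do; the function name is made up.

```python
def count_mbox_messages(path):
    """Count messages by scanning every line for the 'From ' separator."""
    count = 0
    with open(path, "rb") as fh:
        for line in fh:
            # Each message starts with a line like 'From sender date'.
            if line.startswith(b"From "):
                count += 1
    return count
```

The work is proportional to the file size, not to the number of messages you actually care about, so a multi-GB mbox pays the full cost every time it has to be parsed from scratch.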
-- ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart chalkboard Ep. 1F07