Hi Timo
I have a few comments. Please just disregard them if I have misunderstood your design.
Regarding your storage plan, I find it very important that users can be stored in different locations because:
- Discount users could be placed on cheap storage while others are offered premium service on expensive hardware
- It's easy to scale if you just add another LUN from your SAN or mount from NAS
- To avoid huge directories, you can put users into subdirs, with each subdir containing only, say, 1000 users
All this is very easy to achieve in 1.1 because you can return individual storage dirs for indexes and data from the user db. I'm not sure from reading your post whether this will still be possible, but I believe it’s a very important thing.
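To illustrate the kind of per-user placement I mean, here is a minimal sketch. It is not how Dovecot does it; the tier names, mount points and bucket count are made up for the example, and in practice the result would simply be what the user db returns for that user. Note that hashing only spreads users evenly on average; it does not cap a subdir at exactly 1000 users.

```python
import hashlib

# Hypothetical tier roots; real mount points would come from the user db.
TIER_ROOTS = {"discount": "/mnt/cheap", "premium": "/mnt/fast"}
BUCKETS = 1024  # assumed number of subdirs per tier


def storage_dir(username: str, tier: str) -> str:
    """Return a storage dir for a user, spread across hashed subdirs."""
    digest = hashlib.sha1(username.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % BUCKETS
    return f"{TIER_ROOTS[tier]}/{bucket:04d}/{username}"
```

Adding another LUN or NAS mount then only means adding one more entry to the table of roots.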
Regarding 7: I am very much in favor of all the self healing you describe. There is nothing worse than huge complex systems that fail just because of some minor error that could easily be fixed without manual intervention. But I'm also a little worried in this regard.
Maildir is so robust that nothing can really go wrong. But here you have index files and data files located in different places. Imagine the index file being on one NFS mount whilst the data resides on another. Or imagine the administrator purposely loading a different index file or data file from a backup.
The worst case scenario is that the self healing mistakes a manual operation for a failure and breaks something. It should also be very resilient to temporarily losing access to all files during this operation (which could happen very often on NFS mounts).
I can also imagine the self-healing going into loops if it doesn't understand what’s going on. If the data changes due to manual intervention, or part of the file system can't be accessed, you could imagine the self healing process trying again and again to fix something that isn't its job to fix. In that case it would be better if it just skipped the apparent failures.
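Roughly what I have in mind is something like the sketch below. It is only an illustration of the behaviour, not a proposal for the actual code; `repair`, `StorageUnavailable` and the attempt limit are names and values I made up.

```python
class StorageUnavailable(Exception):
    """Raised when the files cannot be reached (e.g. an NFS mount is down)."""


def heal_with_limit(repair, max_attempts=3):
    """Run repair() at most max_attempts times.

    repair() is a hypothetical callable that returns True once the
    mailbox is consistent. The result is "fixed", "skipped" (storage
    unreachable, nothing touched) or "gave_up" (attempt limit reached).
    """
    for _ in range(max_attempts):
        try:
            if repair():
                return "fixed"
        except StorageUnavailable:
            # Files are temporarily inaccessible: leave everything alone.
            return "skipped"
    return "gave_up"
```

The point is that both "skipped" and "gave_up" leave the files alone and let the server carry on, instead of retrying forever.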
I'm also wondering whether it's better for each mailbox to have its own separate dovecot.index.cache file or whether there should be one cache file for the map index. I think you should consider more files as the general choice (not only regarding cache files). Imagine many dovecot servers accessing the same storage simultaneously. I figure it would be a lot easier if they weren’t all trying to read/update one essential file at the same time (with only one file, load can’t be spread across multiple mounts and everything goes down if the mount with the essential file is inaccessible). If there is serious data corruption and you have only one file, then all operations are paused while the self healing is trying to figure out what went wrong (and what happens if different servers decide to do self-healing on this one file at the same time?). With one file per maildir, only a small portion of the users are affected, the load is spread, and really bad file corruption doesn’t break everything for thousands of users.
Other than that I’m just really glad that dbox is progressing. I consider it the feature. Dbox is the email administrator’s wet dream. I’m already dreaming of completely avoiding the scalability issues of large Maildirs (which is the biggest challenge today in my opinion) and reducing the IO. Buying more IO is an order of magnitude more expensive than getting more RAM or CPU power (and dovecot barely needs any RAM and CPU anyway).
Best wishes, Mikkel