Hello everyone,
I am currently evaluating Dovecot for our new production email servers (20k+ mailboxes) and found something strange.
I'm using these settings on Dovecot 2.2.4 (x86_64 / Slackware / compiled from sources):

mdbox_rotate_size = 128M
mdbox_rotate_interval = 1d
mdbox_preallocate_space = yes

with virtual users and a location like:

mail_location = mdbox:~/mdbox
I don't think the rest of the config is relevant, but ask me if you need other parts.
After using test accounts for 2 weeks, I've found that the 128M of preallocated space is never "hole punched" (to borrow the term from "man fallocate" on Linux), even after the m.* files are rotated. From what I understand, those files will never be appended to again because of mdbox_rotate_interval. Also, doveadm purge creates new files, so the old ones will never grow again.
Here is an example of an mdbox storage using ls -ls (which shows allocated vs. used space):

total 4065176
  1884 -rw------- 1 mail mail   1926656 Jul 29 10:55 dovecot.map.index
     4 -rw------- 1 mail mail       460 Jul 29 11:26 dovecot.map.index.log
    48 -rw------- 1 mail mail     44304 Jul 29 10:55 dovecot.map.index.log.2
131072 -rw------- 1 mail mail 133165066 Jul 19 15:31 m.10
131072 -rw------- 1 mail mail 133507393 Jul 19 15:32 m.13
131072 -rw------- 1 mail mail 134155182 Jul 19 15:33 m.14
131072 -rw------- 1 mail mail 134213403 Jul 19 15:30 m.2
131072 -rw------- 1 mail mail     46464 Jul 21 04:30 m.21
131072 -rw------- 1 mail mail 134215030 Jul 19 15:30 m.3
131072 -rw------- 1 mail mail     25852 Jul 25 01:54 m.32
131072 -rw------- 1 mail mail      2360 Jul 26 00:05 m.34
131072 -rw------- 1 mail mail    169073 Jul 27 23:18 m.35
131072 -rw------- 1 mail mail     31624 Jul 27 01:55 m.36
131072 -rw------- 1 mail mail 134216982 Jul 28 04:30 m.37
131076 -rw------- 1 mail mail 134217804 Jul 28 04:30 m.38
131072 -rw------- 1 mail mail 134217341 Jul 28 04:30 m.39
131072 -rw------- 1 mail mail 134213719 Jul 19 15:30 m.4
131072 -rw------- 1 mail mail  29740970 Jul 28 04:30 m.40
131072 -rw------- 1 mail mail 129175917 Jul 28 04:30 m.41
131072 -rw------- 1 mail mail 133174937 Jul 28 04:30 m.42
131072 -rw------- 1 mail mail    633436 Jul 28 04:30 m.43
131072 -rw------- 1 mail mail   3154623 Jul 28 04:30 m.44
131072 -rw------- 1 mail mail   3676879 Jul 28 04:30 m.45
131072 -rw------- 1 mail mail    468158 Jul 28 04:30 m.46
131072 -rw------- 1 mail mail     26964 Jul 28 04:30 m.47
131072 -rw------- 1 mail mail   3574599 Jul 28 04:30 m.48
131072 -rw------- 1 mail mail   3789133 Jul 28 04:30 m.49
131072 -rw------- 1 mail mail 134215016 Jul 19 15:30 m.5
131072 -rw------- 1 mail mail   1280074 Jul 28 04:30 m.50
131076 -rw------- 1 mail mail    635459 Jul 28 22:47 m.51
131072 -rw------- 1 mail mail   1459418 Jul 29 10:55 m.52
131072 -rw------- 1 mail mail 132941013 Jul 29 11:26 m.53
131072 -rw------- 1 mail mail 134213475 Jul 19 15:30 m.7
131072 -rw------- 1 mail mail 132240074 Jul 19 15:31 m.9
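To reproduce these numbers without reading the ls output by eye, a small Python sketch like the one below could total the used and allocated bytes over the m.* files. The function name mdbox_usage is made up for this example, and it assumes a Unix system where st_blocks is reported in 512-byte units.

```python
import os


def mdbox_usage(storage_dir):
    """Sum used vs. allocated bytes over the m.* files in storage_dir.

    Returns (file_count, used_bytes, allocated_bytes).
    Hypothetical helper for illustration, not part of Dovecot.
    """
    count = used = allocated = 0
    for entry in os.scandir(storage_dir):
        # Only look at the mdbox data files (m.1, m.2, ...), not the map index.
        if not entry.name.startswith("m."):
            continue
        if not entry.is_file(follow_symlinks=False):
            continue
        st = entry.stat(follow_symlinks=False)
        count += 1
        used += st.st_size
        # st_blocks counts 512-byte units, independent of the
        # filesystem block size.
        allocated += st.st_blocks * 512
    return count, used, allocated
```

Calling mdbox_usage() on the storage directory of a test account and comparing used with allocated should give the same picture as the listing above.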
There's a lot of "lost" space since, if I read the Dovecot docs correctly, preallocated space is only reclaimed when *all* emails in an m.X file have refcount=0 and doveadm purge has been run.
For mailboxes with little incoming mail (< 100 kB / day) this wastes a lot of space. Of course I can decrease the rotate size a lot, but that would produce many more files and would certainly become similar performance-wise to sdbox/maildir/...
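To put a rough number on it, here is a back-of-the-envelope sketch in Python. It is a worst-case estimate, not a measurement: it assumes every mailbox receives a little mail every day, so that the 1d rotate interval starts one new fully preallocated m.* file per mailbox per day, and that nothing is purged in the meantime. The function name daily_slack_bytes is made up for this example.

```python
MIB = 1024 * 1024


def daily_slack_bytes(rotate_size, daily_mail, mailboxes=1):
    """Preallocated-but-unused bytes added per day.

    Assumes (worst case) one new, fully preallocated file per mailbox
    per day, holding only daily_mail bytes of actual data.
    """
    per_mailbox = max(rotate_size - daily_mail, 0)
    return per_mailbox * mailboxes


# Figures from this post: 128M rotate size, ~100 kB/day, 20k mailboxes.
slack = daily_slack_bytes(128 * MIB, 100 * 1000, mailboxes=20_000)
print(f"{slack / 1024**4:.2f} TiB of slack per day (worst case)")
```

Under those assumptions this comes to about 2.4 TiB of slack per day across 20k mailboxes. Real mailboxes will not all behave this way, so treat it as an upper bound.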
It would certainly be smart to use something similar to "FALLOC_FL_PUNCH_HOLE" on rotation (when doing close()?), so that once we're sure no more data will be appended to a file, its allocated space == used space.
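To illustrate the idea from outside Dovecot, here is a minimal Python sketch that releases the preallocated tail of a single file by calling fallocate(2) through ctypes. This is not how Dovecot does anything today; the names trailing_range and release_tail are made up, and it assumes 64-bit Linux and a filesystem that supports FALLOC_FL_PUNCH_HOLE. It should only be tried on files that will never be appended to again, and on test data first.

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def trailing_range(size, allocated):
    """Return (offset, length) of the allocated region past EOF, or None."""
    if allocated <= size:
        return None
    return size, allocated - size


def release_tail(path):
    """Punch a hole from EOF to the end of the allocated space.

    Returns the number of bytes freed according to st_blocks.
    Hypothetical helper; assumes 64-bit Linux (off_t is 64 bits).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_longlong, ctypes.c_longlong]
    libc.fallocate.restype = ctypes.c_int

    fd = os.open(path, os.O_RDWR)
    try:
        before = os.fstat(fd)
        rng = trailing_range(before.st_size, before.st_blocks * 512)
        if rng is None:
            return 0  # nothing allocated past EOF
        offset, length = rng
        mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
        if libc.fallocate(fd, mode, offset, length) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), path)
        after = os.fstat(fd)
        return (before.st_blocks - after.st_blocks) * 512
    finally:
        os.close(fd)
```

KEEP_SIZE is passed together with PUNCH_HOLE because fallocate(2) requires it, so the file size itself stays untouched and only the blocks past the last message are given back.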
I will disable space preallocation for our next tests since it wastes too much storage for us. Do you have any feedback on how much this may affect performance? I found some messages about the implementation in this ML's archives, but didn't see anyone clearly stating how much better preallocation is.
Thanks, best regards,
Stephane Berthelot.