[Dovecot] The docs are a bit weird on "Directory hashing"
In Squid we use a double layer of hashed directories on the filesystem to allow storage of millions of files. I read the "Directory hashing" section of the docs but could not understand the way it is written. I am currently using this line:
mail_location = maildir:/home/vmail/%d/%n/Maildir/
and I want to migrate to a hash-based directory scheme. While trying to understand how the hash works I stumbled on an old thread at http://www.dovecot.org/list/dovecot/2010-June/049695.html where they used:
mail_location=maildir:/buzones/us.es/%1Hu/%2.1u/%n
so I assume it should be used like this:
mail_location=maildir:/home/vmail/%H/%2.256Hn/%d_%n/Maildir/
or:
mail_location=maildir:/home/vmail/%1Mu/%2.1Mu/%d_%n/Maildir/
It's a bit hard to work this out alone, so I hope you can assist me.
Let's say I want to follow the model of Squid's cache_dir, which has:
cache_dir aufs /usr/local/squid/var/cache/squid 40000 16 256
This means a two-layer cache with at most 16 directories on the first layer and 256 directories on the second layer. That layout allows storage of millions of files and leaves the work to ext4 at the kernel level rather than doing it in user-land. Since I am not 100% sure I understood the scheme correctly, I assume the lines above will need a small correction.
Eliezer
On Thu, Aug 08, 2013 at 01:42:43AM +0300, Eliezer Croitoru wrote:
> This means a two-layer cache with at most 16 directories on the first layer and 256 directories on the second layer. That layout allows storage of millions of files and leaves the work to ext4 at the kernel level rather than doing it in user-land. Since I am not 100% sure I understood the scheme correctly, I assume the lines above will need a small correction.
I use:
mail_home = /srv/mailstore/%256LRHu/%Ld/%Ln
which gives me 256 buckets, each containing domainname/username/. The bucket is a hash of the lowercased, reversed username. To get the same layout as Squid, I would try:
mail_home = /srv/mailstore/%16LRHu/%256LRHu/%Lu
Ref: http://wiki2.dovecot.org/Variables for variables and modifiers.
BTW: I'm lowercasing everything because I once got bitten by a variable that was not lowercased in one version and then suddenly was in another. It's probably redundant here -- but it was painful to fix when it happened.
-jf
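For readers following along, here is a minimal sketch of how a two-level 16/256 layout like the one suggested above could be computed. It is only an illustration: `hashed_home` is a hypothetical helper, `zlib.crc32` is a stand-in and not Dovecot's %H hash, and the sketch assumes that each level is the same integer hash reduced modulo the width and printed as hex.

```python
import zlib


def hashed_home(user: str, base: str = "/srv/mailstore", widths=(16, 256)) -> str:
    """Build a two-level hashed path for a user (illustrative only)."""
    lowered = user.lower()          # roughly what the L modifier does
    key = lowered[::-1]             # roughly what the R modifier does
    # Stand-in hash: CRC32, NOT the hash Dovecot uses for %H.
    h = zlib.crc32(key.encode("utf-8"))
    # One directory level per width: hash modulo width, printed as hex.
    levels = [format(h % w, "x") for w in widths]
    return "/".join([base, *levels, lowered])


path = hashed_home("Alice@Example.com")
print(path)
```

One property worth noting under that assumption: because 16 divides 256, the first-level value is fully determined by the second-level value ((h % 256) % 16 equals h % 16), so a layout built this way has at most 256 distinct leaf buckets rather than 16 x 256.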
Hey,
On 08/08/2013 10:29 AM, Jan-Frode Myklebust wrote:
> I use:
> mail_home = /srv/mailstore/%256LRHu/%Ld/%Ln
What is the "R" for? I understand lowercasing the names and have seen the effect, but how would R be helpful?
Eliezer
On Fri, Aug 09, 2013 at 12:02:34AM +0300, Eliezer Croitoru wrote:
>> I use:
>> mail_home = /srv/mailstore/%256LRHu/%Ld/%Ln
> What is the "R" for? I understand lowercasing the names and have seen the effect, but how would R be helpful?
According to http://wiki2.dovecot.org/Variables:
"%H hash function is a bit bad if all the strings end with the same text, so if you're hashing usernames being in user@domain form, you probably want to reverse the username to get better hash value variety, e.g. %3RHu."
-jf
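To make the wiki's point concrete, here is a small sketch of why a shared suffix can hurt. It uses a classic PJW/ELF-style 32-bit string hash as a stand-in; that this matches Dovecot's %H exactly is an assumption, and `pjw_hash` and `bucket` are hypothetical helper names. In this hash the lowest four bits of the result come only from the last character, so with a width of 16 every address ending in the same letter shares one bucket unless the string is reversed first.

```python
def pjw_hash(s: str) -> int:
    """Classic PJW/ELF-style 32-bit string hash (stand-in, not confirmed as %H)."""
    h = 0
    for byte in s.encode("utf-8"):
        h = ((h << 4) + byte) & 0xFFFFFFFF   # emulate 32-bit unsigned arithmetic
        g = h & 0xF0000000
        if g:
            h ^= g >> 24                     # fold the top nibble into bits 4..7
            h ^= g                           # clear the top nibble
    return h


def bucket(s: str, width: int, reverse: bool = False) -> str:
    """Lowercase, optionally reverse, hash, reduce modulo width, print as hex."""
    s = s.lower()
    if reverse:
        s = s[::-1]
    return format(pjw_hash(s) % width, "x")


users = ["alice@example.com", "bob@example.com",
         "carol@example.com", "dave@example.com"]

plain = {bucket(u, 16) for u in users}
reversed_ = {bucket(u, 16, reverse=True) for u in users}

print("plain:   ", sorted(plain))      # one bucket: all strings end in "m"
print("reversed:", sorted(reversed_))  # four buckets: strings end in a, b, c, d
```

With this stand-in, the unreversed addresses all land in bucket d, while the reversed ones spread over buckets 1 to 4, which is the kind of variety the R modifier is meant to recover.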
BTW, if you're using v2.2.3+, the %N hash works better than the old %H hash. I also updated http://wiki2.dovecot.org/Variables for it.
participants (3)
- Eliezer Croitoru
- Jan-Frode Myklebust
- Timo Sirainen