[Dovecot] /var/spool/mail directory size and subdirectories
(Complete newbie to dovecot. I hope what follows isn't something I've missed in some FAQ somewhere...)
On a traditional UNIX filesystem with UW-IMAP several years ago, we encountered major performance problems when "/var/spool/mail/" got big (we would currently have ~20,000 entries). This was due to the inefficiency of the UNIX filesystem when creating and deleting the lockfiles (etc.) in a directory of that size.
We worked around that (all those years ago) with a local source-code mod to UW-IMAP ("c-client") to use subdirectories "/var/spool/mail/00/" to "/var/spool/mail/99/" based on a trivially simple "uid mod 100" algorithm. (Our uids are random; our usernames are not.) Thus we had 100 subdirs each of about 200 entries. Efficiency and performance vastly improved, and we've been running like that ever since.
Now we are considering migrating to dovecot...
Does dovecot do file creation/deletion in this (or similar shared) directory? (And so would it be liable to the same inefficiency problems?)
Is there some means in the "dovecot.conf" file to specify an INBOX pattern as "/var/spool/mail/%ZZZ%/%u" where "%ZZZ%" could be algorithmically specified as (say) "user-uid mod 100"?
--
:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           Durham University     :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham DH1 3LE        :
:  Phone: +44 191 334 2752                  U.K.                  :
On Wed, 2006-07-19 at 15:49 +0100, David Lee wrote:
(Complete newbie to dovecot. I hope what follows isn't something I've missed in some FAQ somewhere...)
On a traditional UNIX filesystem with UW-IMAP several years ago, we encountered major performance problems when "/var/spool/mail/" got big (we would currently have ~20,000 entries). This was due to the inefficiency of the UNIX filesystem when creating and deleting the lockfiles (etc.) in a directory of that size.
We worked around that (all those years ago) with a local source-code mod to UW-IMAP ("c-client") to use subdirectories "/var/spool/mail/00/" to "/var/spool/mail/99/" based on a trivially simple "uid mod 100" algorithm. (Our uids are random; our usernames are not.) Thus we had 100 subdirs each of about 200 entries. Efficiency and performance vastly improved, and we've been running like that ever since.
Now we are considering migrating to dovecot...
Does dovecot do file creation/deletion in this (or similar shared) directory? (And so would it be liable to the same inefficiency problems?)
Is there some means in the "dovecot.conf" file to specify an INBOX pattern as "/var/spool/mail/%ZZZ%/%u" where "%ZZZ%" could be algorithmically specified as (say) "user-uid mod 100"?
Look in variables.txt, which comes in the tarball.
I think %H will do some of what you're looking for, or you could use segments of %u.
-sv
On Wed, 2006-07-19 at 15:49 +0100, David Lee wrote:
Is there some means in the "dovecot.conf" file to specify an INBOX pattern as "/var/spool/mail/%ZZZ%/%u" where "%ZZZ%" could be algorithmically specified as (say) "user-uid mod 100"?
If all your uids are a fixed number of digits you could do
default_mail_env = /var/spool/mail/%3.2i/%u
(if all uids are 5 digits long)
The closest other scheme based on UIDs I can think of is %.2Ri, which gives you the last two digits of the UID (i.e. uid mod 100) but in reverse, so for uid 1234 it'd give you 43.
Depending on what other tools you use, you could also use the hash (H) modifier, but maybe your delivery agent can't do that (unless of course you plan to use dovecot-lda too)
Hope that helps, johannes
On Wed, 19 Jul 2006, Johannes Berg wrote:
On Wed, 2006-07-19 at 15:49 +0100, David Lee wrote:
Is there some means in the "dovecot.conf" file to specify an INBOX pattern as "/var/spool/mail/%ZZZ%/%u" where "%ZZZ%" could be algorithmically specified as (say) "user-uid mod 100"?
If all your uids are a fixed number of digits you could do
default_mail_env = /var/spool/mail/%3.2i/%u
(if all uids are 5 digits long)
Close. But not quite, I think. Our significant uids range from about 1,000 to about 60,000. So they are a mix of four- and five-digit, not a fixed number of digits.
Both "2345" and "12345" need to map onto the same result: "45". But I gather (although I'm new to dovecot) that the "%3.2i" notation acts on the uid as a string rather than as a number, therefore would yield different values.
Background: This exercise is to do a minimally invasive black-box replacement of UW-IMAP with Dovecot. We are trying to avoid any major reworking of our directory structure. (The longer-term strategy at our site is Exchange (don't ask!); this proposed UW->dovecot work on the existing service is a temporary "get us by for the moment" fix.)
The closest other scheme based on UIDs I can think of is %.2Ri, which gives you the last two digits of the UID (i.e. uid mod 100) but in reverse, so for uid 1234 it'd give you 43.
From "1234" I'd need "34" as the answer. And from "1204" (and "12304") I'd need "04" including its leading "0". ("1200"->"00" etc.)
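The mapping asked for here can be written down as a minimal C sketch. The helper names uid_bucket and spool_path are hypothetical, made up for illustration; they are not part of Dovecot or c-client. The sketch only shows the arithmetic: take the uid modulo 100 and print it as two digits with a leading zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Dovecot or c-client code): writes the
 * two-digit, zero-padded value of "uid mod 100" into buf. */
static void uid_bucket(unsigned int uid, char buf[3])
{
	/* uid % 100 is always 0..99, so two digits plus NUL fit in buf */
	snprintf(buf, 3, "%02u", uid % 100);
}

/* Hypothetical helper: builds "/var/spool/mail/NN/<user>" in dest.
 * Returns 0 on success, -1 if the path does not fit in dest. */
static int spool_path(unsigned int uid, const char *user,
		      char *dest, size_t size)
{
	char bucket[3];
	int n;

	uid_bucket(uid, bucket);
	n = snprintf(dest, size, "/var/spool/mail/%s/%s", bucket, user);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because the bucket is computed numerically, four-digit and five-digit uids ending in the same two digits land in the same subdirectory, which a fixed-offset substring of the uid string cannot guarantee.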
Depending on what other tools you use, you could also use the hash (H) modifier, but maybe your delivery agent can't do that (unless of course you plan to use dovecot-lda too)
With our UW set-up, everything goes through UW's c-client library (sendmail local delivery through UW's "tmail"). So "dovecot-lda" (which I've not yet investigated) would probably be the natural route for us.
Perhaps I should further investigate the "%H"-like stuff (which seems to be hex rather than base-10) to see whether I can do a generalised extension which could enable a base-10 modulo thing, including leading zeroes. Or a "final n-char substring" thing. (Naturally, I'd take advice, and would hope to feed back any patch.)
On Wed, 2006-07-19 at 17:37 +0100, David Lee wrote:
Both "2345" and "12345" need to map onto the same result: "45". But I gather (although I'm new to dovecot) that the "%3.2i" notation acts on the uid as a string rather than as a number, therefore would yield different values.
Yeah, that's right, it acts on a string and on 2345 just returns '5'.
Background: This exercise is to do a minimally invasive black-box replacement of UW-IMAP with Dovecot. We are trying to avoid any major reworking of our directory structure.
Makes sense.
Perhaps I should further investigate the "%H"-like stuff (which seems to be hex rather than base-10) to see whether I can do a generalised extension which could enable a base-10 modulo thing, including leading zeroes. Or a "final n-char substring" thing. (Naturally, I'd take advice, and would hope to feed back any patch.)
Give me a minute, I'm just testing a small patch.
johannes
On Wed, 2006-07-19 at 19:01 +0200, Johannes Berg wrote:
Give me a minute, I'm just testing a small patch.
Done. Now you can use %-2.02i and it should work for all lengths, even length 1 :)

Timo, it'd be nice if you could apply this :)

---
This patch makes it possible to use a negative offset to count from the
end of a variable.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>

--- dovecot-1.0.rc2/doc/variables.txt	2006-04-12 09:39:31.000000000 +0200
+++ dovecot-1.0.rc2.mod/doc/variables.txt	2006-07-19 19:27:15.611041184 +0200
@@ -35,7 +35,10 @@
 You can take a substring of the variable by giving optional offset followed by
 '.' and width after the '%' character. For example %2u gives first two
 characters of the username. %2.1u gives third character of the username. If
-offset points outside the value, empty string is returned.
+the offset is negative, it counts from the end, for example %-2.02i gives
+the UID mod 100 (last two characters of the UID printed in a string). If a
+positive offset points outside the value, empty string is returned, if a
+negative offset does then the string is taken from the start.
 
 For login_log_format_elements there are also these variables:
 
--- dovecot-1.0.rc2/src/lib/var-expand.c	2006-04-12 09:39:32.000000000 +0200
+++ dovecot-1.0.rc2.mod/src/lib/var-expand.c	2006-07-19 19:11:14.661041184 +0200
@@ -11,7 +11,8 @@
 #include <stdlib.h>
 
 struct var_expand_context {
-	unsigned int offset, width;
+	int offset;
+	unsigned int width;
 };
 
 struct var_expand_modifier {
@@ -108,7 +109,7 @@
 	struct var_expand_context ctx;
 	const char *(*modifier[MAX_MODIFIER_COUNT])
 		(const char *, struct var_expand_context *);
-	unsigned int i, modifier_count;
+	unsigned int i, modifier_count, sign = 1;
 	bool zero_padding = FALSE;
 
 	memset(&ctx, 0, sizeof(ctx));
@@ -120,12 +121,16 @@
 
 			/* [<offset>.]<width>[<modifiers>]<variable> */
 			ctx.width = 0;
+			if (*str == '-') {
+				sign = -1;
+				str++;
+			}
 			if (*str == '0') {
 				zero_padding = TRUE;
 				str++;
 			}
 			while (*str >= '0' && *str <= '9') {
-				ctx.width = ctx.width*10 + (*str - '0');
+				ctx.width = ctx.width*10 + sign * (*str - '0');
 				str++;
 			}
 
@@ -178,9 +183,21 @@
 			if (var != NULL) {
 				for (i = 0; i < modifier_count; i++)
 					var = modifier[i](var, &ctx);
-				while (*var != '\0' && ctx.offset > 0) {
-					ctx.offset--;
-					var++;
+				if (ctx.offset < 0) {
+					/* if offset is < 0 then we want to
+					 * start at the end */
+					size_t len = strlen(var);
+					var += len; /* point to trailing NUL byte */
+					while (ctx.offset < 0 && len > 0) {
+						ctx.offset++;
+						var--;
+						len--;
+					}
+				} else {
+					while (*var != '\0' && ctx.offset > 0) {
+						ctx.offset--;
+						var++;
+					}
 				}
 				if (ctx.width == 0)
 					str_append(dest, var);
@@ -205,7 +222,7 @@
 	const struct var_expand_modifier *m;
 
 	/* [<offset>.]<width>[<modifiers>]<variable> */
-	while (*str >= '0' && *str <= '9')
+	while ((*str >= '0' && *str <= '9') || *str == '-')
 		str++;
 
 	if (*str == '.') {
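For readers who want to see the intended substring behaviour in isolation, here is a small stand-alone C sketch. It is not Dovecot's var_expand() and does not parse format strings; substr_expand and its parameters are hypothetical names chosen for illustration. The padding rule (left-pad with '0' up to the width, without truncating) is an assumption about how the zero-padded form behaves, not something confirmed by the patch text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone helper, not Dovecot's var_expand().
 * Copies part of value into dest:
 *   offset >= 0  skips that many characters from the start
 *                (past the end gives an empty string);
 *   offset <  0  counts back from the end, stopping at the start;
 *   width  == 0  means "the rest of the string";
 *   zero_pad     left-pads with '0' up to width (assumed: no truncation),
 *                otherwise the result is cut to at most width characters.
 * Returns dest, always NUL-terminated when dest_size > 0. */
static char *substr_expand(const char *value, int offset, size_t width,
			   bool zero_pad, char *dest, size_t dest_size)
{
	size_t len = strlen(value);
	size_t start, n, i, pad = 0, out = 0;

	if (offset < 0) {
		size_t back = (size_t)(-(long long)offset);
		start = back > len ? 0 : len - back;
	} else {
		start = (size_t)offset > len ? len : (size_t)offset;
	}
	n = len - start;

	if (width != 0) {
		if (zero_pad) {
			if (n < width)
				pad = width - n;
		} else if (n > width) {
			n = width;
		}
	}

	if (dest_size == 0)
		return dest;
	while (pad > 0 && out + 1 < dest_size) {
		dest[out++] = '0';
		pad--;
	}
	for (i = 0; i < n && out + 1 < dest_size; i++)
		dest[out++] = value[start + i];
	dest[out] = '\0';
	return dest;
}
```

With offset -2, width 2 and zero padding, the uid strings "12345" and "2345" both give "45", "1204" gives "04", and a one-character uid "5" gives "05". With the positive form (offset 3, width 2, no padding), "12345" gives "45" but "2345" gives only "5", which is the mismatch discussed earlier in the thread.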
On July 19, 2006 3:49:17 PM +0100 David Lee <t.d.lee@durham.ac.uk> wrote:
to UW-IMAP ("c-client") to use subdirectories "/var/spool/mail/00/" to "/var/spool/mail/99/" based on a trivially simple "uid mod 100" algorithm. (Our uids are random; our usernames are not.) Thus we had 100 subdirs each of about 200 entries. Efficiency and performance vastly improved, and we've been running like that ever since.
Now we are considering migrating to dovecot...
Does dovecot do file creation/deletion in this (or similar shared) directory? (And so would it be liable to the same inefficiency problems?)
If you are migrating, why not change to maildir in $HOME?
-frank
On Wed, 19 Jul 2006, Frank Cusack wrote:
On July 19, 2006 3:49:17 PM +0100 David Lee <t.d.lee@durham.ac.uk> wrote:
to UW-IMAP ("c-client") to use subdirectories "/var/spool/mail/00/" to "/var/spool/mail/99/" based on a trivially simple "uid mod 100" algorithm. (Our uids are random; our usernames are not.) Thus we had 100 subdirs each of about 200 entries. Efficiency and performance vastly improved, and we've been running like that ever since.
Now we are considering migrating to dovecot...
Does dovecot do file creation/deletion in this (or similar shared) directory? (And so would it be liable to the same inefficiency problems?)
If you are migrating, why not change to maildir in $HOME?
Other local political constraints mean that this proposed UW->dovecot transition has to be a minimally invasive thing. It is viewed as a band-aid onto the existing service rather than as a major upgrade.
(Yes, I'd love to re-work the whole lot, including the folder locations. But that's not an option...)
participants (4)
- David Lee
- Frank Cusack
- Johannes Berg
- seth vidal