Re: [Dovecot] Direct groups of users to pairs of backend servers

25 Mar 2014


On 3/25/2014 8:18 AM, Murray Trainer wrote:
...
On 25/03/14 15:06, Stan Hoeppner wrote:
...
On 3/24/2014 10:02 PM, Murray Trainer wrote:
...
Hi All,
I am using dovecot in the Director setup with multiple proxy and
backend mailstores and user information stored in LDAP.  I am aware
users can be directed to a single backend server.  It would be useful
to be able to direct groups of users to pairs of backend servers to
give some fault tolerance against NFS issues and make the whole thing
more scalable.
Your description says you currently have a "shared nothing" storage
architecture.  You can't get any more scalable than that.  To enable
"groups of users" to be directed to "pairs of backend servers" you'll
need each member of the pair to mount the NFS path of the partner server.
Then you will have two different mailbox locations to deal with.  Do you
have per user mailbox paths configured in LDAP?  You will have to do
that for this "pairing" to work.
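
The pairing idea under discussion can be illustrated with a small
Python sketch.  It is not a Dovecot feature or a director setting; the
host names, the choice of hash, and both function names are assumptions
made only for illustration.  Each user is hashed to one pair of
backends, and the partner is used only when the first member is marked
down.

```python
import hashlib

# Hypothetical backend pairs; these host names are not from the thread.
BACKEND_PAIRS = [
    ("mailstore1", "mailstore2"),
    ("mailstore3", "mailstore4"),
]


def pair_for_user(username, pairs=BACKEND_PAIRS):
    """Map a username to one backend pair using a stable hash."""
    digest = hashlib.sha256(username.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(pairs)
    return pairs[index]


def backend_for_user(username, down=frozenset(), pairs=BACKEND_PAIRS):
    """Return the first member of the user's pair that is not down.

    Returns None when both members of the pair are unavailable.
    """
    for host in pair_for_user(username, pairs):
        if host not in down:
            return host
    return None
```

For the partner to be usable, both members of a pair would need to
reach the same mailbox path, which is the point made above about
per-user mailbox paths.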
...
Otherwise each backend mailstore will need all
the NFS mounts and the whole cluster will be affected if one NFS mount
has an issue.
The whole cluster will not be affected.  Only users whose mail is on the
problem mount will be affected.  This is no different than your current
setup in that regard.
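
As a rough illustration of that claim, the following Python sketch
lists the users touched by a single failed mount.  The user names and
mount paths are hypothetical, and the model assumes each mailbox lives
on exactly one mount.

```python
def affected_users(user_mounts, failed_mount):
    """Return the users whose mailbox lives on the failed mount."""
    return sorted(
        user for user, mount in user_mounts.items() if mount == failed_mount
    )


# Hypothetical layout: three users spread over two NFS mounts.
USER_MOUNTS = {
    "alice": "/mail/nfs1",
    "bob": "/mail/nfs2",
    "carol": "/mail/nfs1",
}

print(affected_users(USER_MOUNTS, "/mail/nfs1"))  # ['alice', 'carol']
```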
...
I am not sure if this is possible with the current
Dovecot implementation?  If not, it would be a great enhancement.
So, in a nutshell, you want Dovecot to be able to overcome faults in
your NFS architecture because you did not build in redundancy?  Is this
correct?
Why are you concerned about NFS mount failures?  Most folks running NFS
Dovecot clusters share a single mount with all mailboxes among all the
cluster nodes.  You seem to have multiple mounts, one for each backend
node.  If mount failures were a common occurrence, we'd see frequent
reports of that.  But we don't.  Did you home-brew your NFS servers,
and are they unreliable?
Cheers,
Hi Stan,
Sorry I didn't properly explain my setup.
...
...
...
...
The backend mailstores each have the same set of 5 NFS mounts from
EMC VNX storage where the mailboxes are located...
...
...
...
...
There is no relation between the number of NFS
mounts and backend mailstores.
Surely you see the contradiction here.
You're talking in present tense.  Have you already set this up, or is
this 5 mounts per mailbox host simply a potential architectural idea
right now?
...
We are talking about migrating a large
number of users and mailboxes - 100,000+ and 50TB+ - and don't want to
put that all on one NFS filesystem.  We want to break it down into
redundant parts so that the mailstores don't all stop functioning if
there is a problem with one NFS filesystem.
Sounds reasonable.  But you just traded horses, going from a "mount
point down" problem to an "NFS filesystem" problem.  By that do you mean
the actual EMC proprietary filesystem that is exported?  Filesystem as
in run fsck if broken?  And if so, do you simply want to mirror those
filesystems within the EMC, create a different export for each, and have
the two servers in a "pair" each mount one of these mirrored filesystems?
Never heard of such a thing...
...
Our NFS storage should be pretty
reliable but the email below on this list about a week ago made me
concerned about all our mailstores hanging if there is a problem with
one of the NFS mounts.
Mounts are client side.  Exports are server side.  If a mount hangs,
only that client host has a problem.  Are you concerned about a mount
failing or an export failing?
...
Hence the query about breaking up the NFS mounts
into groups per pair of mailstores.
You need to explain this concept in technical detail.  As stated it
makes no sense, because both NFSv3 and v4 support export failover.
Surely the EMC supports this.  Actually, in v4 mode, it -must- because
it's part of the protocol itself.
...
We will eventually set up mail servers
and redundant EMC storage across separate data centres and use pNFS,
which will make the whole thing more fault tolerant, but that won't
happen for a while.
Thanks for your response.
Murray
...
[Dovecot] NFS not responding generates authantication crash
I am facing Dovecot authentication problems caused by an unresponsive
NFS server. If there is even a short break in communication with the NFS
server holding the maildirs, Dovecot generates an avalanche of processes
(dovecot/imap and dovecot/pop3). The real number of connections was
about 50, and after the problem occurs it rises to 1000. After about
3 hours the connection limit is reached:
dovecot: master: Warning: service(auth): client_limit (1000) reached,
client connections are being dropped
and next:
imap-login: Warning: Auth process not responding, delayed sending greeting
pop3-login: Warning: Error sending handshake to auth server: Broken pipe
imap-login: Warning: Error sending handshake to auth server: Broken pipe
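
Warnings like the ones quoted above can be picked out of a log with a
short Python sketch.  The pattern is derived only from the message text
shown in this report, and it assumes each message sits on one line in
the real log (it is wrapped in the email); the function name is
hypothetical.

```python
import re

# Matches the client_limit warning as it appears in the quoted report.
CLIENT_LIMIT_RE = re.compile(
    r"service\((?P<service>[^)]+)\): client_limit \((?P<limit>\d+)\) reached"
)


def find_client_limit_hits(lines):
    """Return (service, limit) for every client_limit warning found."""
    hits = []
    for line in lines:
        match = CLIENT_LIMIT_RE.search(line)
        if match:
            hits.append((match.group("service"), int(match.group("limit"))))
    return hits
```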
NFSv4 has a 90 second failover grace period.  If the user above was
using NFSv4 clustering this breakage would not have happened, at least
not to this degree.
Cheers,
Stan