[Dovecot] Direct groups of users to pairs of backend servers

Tue Mar 25 21:51:47 UTC 2014

On 3/25/2014 8:18 AM, Murray Trainer wrote:
> On 25/03/14 15:06, Stan Hoeppner wrote:
>> On 3/24/2014 10:02 PM, Murray Trainer wrote:
>>> Hi All,
>>>
>>> I am using dovecot in the Director setup with multiple proxy and
>>> backend mailstores and user information stored in LDAP.  I am aware
>>> users can be directed to a single backend server.  It would be useful
>>> to be able to direct groups of users to pairs of backend servers to
>>> give some fault tolerance against NFS issues and make the whole thing
>>> more scalable.
>> Your description says you currently have a "shared nothing" storage
>> architecture.  You can't get any more scalable than that.  To enable
>> "groups of users" to be directed to "pairs of backend servers" you'll
>> need each member of the pair to mount the NFS path of the partner server.
>>
>> Then you will have two different mailbox locations to deal with.  Do you
>> have per user mailbox paths configured in LDAP?  You will have to do
>> that for this "pairing" to work.
>>
>>> Otherwise each backend mailstore will need all
>>> the NFS mounts and the whole cluster will be affected if one NFS mount
>>> has an issue.
>> The whole cluster will not be affected.  Only users whose mail is on the
>> problem mount will be affected.  This is no different than your current
>> setup in that regard.
>>
>>> I am not sure if this is possible with the current
>>> dovecot implementation?  If not it would be a great enhancement.
>> So, in a nutshell, you want Dovecot to be able to overcome faults in
>> your NFS architecture because you did not build in redundancy?  Is this
>> correct?
>>
>> Why are you concerned about NFS mount failures?  Most folks running NFS
>> Dovecot clusters share a single mount with all mailboxes among all the
>> cluster nodes.  You seem to have multiple mounts, one for each backend
>> node.  If mount failures were a common occurrence, we'd see frequent
>> reports of that.  But we don't.  Did you home brew your NFS servers and
>> they're not reliable?
>>
>> Cheers,
> Hi Stan,
> 
> Sorry I didn't properly explain my setup.  

>>>> The backend mailstores each have the same set of 5 NFS mounts from 
>>>> EMC VNX storage where the mailboxes are located...  

>>>> There is no relation between the number of NFS
>>>> mounts and backend mailstores.  

Surely you see the contradiction here.

You're talking in present tense.  Have you already set this up, or is
this 5 mounts per mailbox host simply a potential architectural idea
right now?

> We are talking about migrating a large
> number of users and mailboxes - 100,000+ and 50TB+ and don't want to put
> that all on one NFS filesystem.  We want to break it down into redundant
> parts so that all the mailstores don't stop functioning if there is a
> problem with one NFS filesystem.

Sounds reasonable.  But you just traded horses, going from "mount point
down" to "NFS filesystem" problem.  By that do you mean the actual EMC
proprietary filesystem that is exported?  Filesystem as in run fsck if
broken?  And if so, you're simply wanting to mirror those filesystems
within the EMC, create a different export for each, and have two servers
in a "pair" each mount one of these mirrored filesystems?

Never heard of such a thing...

> Our NFS storage should be pretty
> reliable but the email below on this list about a week ago made me
> concerned about all our mailstores hanging if there is a problem with
> one of the NFS mounts.  

Mounts are client side.  Exports are server side.  If a mount hangs only
that client host has a problem.  Are you concerned about a mount failing
or an export failing?
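
To make the distinction concrete, here is a rough Python sketch of the two failure cases.  The user names, export names and backend names are all made up for illustration, and the model is deliberately narrow: it only counts users whose mailbox sits on the broken path.  As the report quoted further down shows, a hung mount can drag down more than that on the affected host.

```python
# Hypothetical placement: user -> (export holding the mailbox,
#                                  backend currently serving the user).
# Every backend is assumed to mount every export.
USERS = {
    "alice": ("export1", "backend1"),
    "bob": ("export1", "backend2"),
    "carol": ("export2", "backend1"),
    "dave": ("export3", "backend3"),
}


def affected_by_export_failure(export):
    """Server-side failure: everyone with mail on that export is hit,
    no matter which backend they are connected to."""
    return sorted(u for u, (e, _) in USERS.items() if e == export)


def affected_by_mount_hang(backend, export):
    """Client-side failure: only users on that export who are also
    being served by the one backend whose mount hung."""
    return sorted(
        u for u, (e, b) in USERS.items() if e == export and b == backend
    )


print(affected_by_export_failure("export1"))
print(affected_by_mount_hang("backend1", "export1"))
```

In this toy layout a failed export1 hits both alice and bob, while a hung export1 mount on backend1 hits only alice.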

> Hence the query about breaking up the NFS mounts
> into groups per pair of mailstores.  

You need to explain this concept in technical detail.  As stated it
makes no sense, because both NFSv3 and v4 support export failover.
Surely the EMC supports this.  Actually, in v4 mode, it -must- because
it's part of the protocol itself.
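
For reference, one possible reading of "groups per pair of mailstores" is sketched below in Python.  This is only a guess at the intent, not a Dovecot feature or API: the group names, host names and the pick_backend() helper are all hypothetical.  Each group maps to two backends, a user is pinned to one member of the pair by a stable hash, and the partner is used only when the preferred host is marked down.

```python
import hashlib

# Hypothetical mapping of user groups to pairs of backend mailstores.
GROUP_TO_PAIR = {
    "staff": ("backend1", "backend2"),
    "students": ("backend3", "backend4"),
}


def pick_backend(user, group, down=frozenset()):
    """Pick one backend from the group's pair.

    The same user always prefers the same member of the pair (stable
    hash of the user name); the partner is tried only if the preferred
    host is listed in `down`.
    """
    pair = GROUP_TO_PAIR[group]
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    preferred = digest[0] % 2
    for host in (pair[preferred], pair[1 - preferred]):
        if host not in down:
            return host
    raise RuntimeError("both backends in the pair are down")


print(pick_backend("murray", "staff"))
```

Pinning each user to a single host at a time matters here, since keeping a user on one backend is the reason for running a director in front of NFS mailstores in the first place.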

> We will eventually set up mail servers
> and redundant EMC storage between separate data centres and use pNFS
> which will make the whole thing more fault tolerant but that won't
> happen for a while.
> 
> Thanks for your response.
>
> Murray
> 
>> [Dovecot] NFS not responding generates authantication crash
>>I am facing dovecot authentication problems caused by an unresponsive NFS
>>server. If there is even a short break in communication with the NFS server
>>keeping maildirs, dovecot generates an avalanche of processes
>>(dovecot/imap and dovecot/pop3). The real number of connections was about 50
>>and after the problem occurs it rises to 1000. After about 3 hours the
>>limit of connections is filled up:
>>dovecot: master: Warning: service(auth): client_limit (1000) reached,
>>client connections are being dropped
>>and next:
>>imap-login: Warning: Auth process not responding, delayed sending greeting
>>pop3-login: Warning: Error sending handshake to auth server: Broken pipe
>>imap-login: Warning: Error sending handshake to auth server: Broken pipe

NFSv4 has a 90 second failover grace period.  If the user above was
using NFSv4 clustering this breakage would not have happened, at least
not to this degree.

Cheers,

Stan