Hi all,
I'm managing a large Dovecot cluster in a director + NFS backend architecture, and we are moving from NFSv3 to NFSv4. Our NFS server is a NetApp in clustered mode. From the technical specs of the NFSv4 delegation feature, it seems that enabling delegation in this type of Dovecot architecture should bring great benefits: only one backend server accesses a specific mailbox at a time (even deliveries are routed by director hashing via LMTP), so the GETATTR RPC calls should decrease significantly, while the risk of conflicting delegations (which would reduce the benefit) is very low. Can anyone confirm this hypothesis?
I've heard that the Linux implementation of NFSv4 delegation is mature only for read delegation (and not yet for write delegation): is this true? And, if so, could it limit the benefits of enabling NFS delegation?
Best regards
-brd
On 11/06/2015 16:03, brd wrote:
> hi all, i'm managing a large installation of a dovecot cluster in director + NFS backend architecture and we are moving from NFSv3 to NFSv4. Our NFS server is a Netapp
Hi,
I'm running a similar configuration, except for the size (medium), with Dovecot/Director and NetApp (but without clustered mode); the mailboxes are in Maildir format.
Have you already tried running NFSv4? When we switched to NetApp and NFSv4 we had many problems (locking problems and instability) and had to go back to NFSv3 immediately. I don't know whether it was a NetApp problem or an NFS client problem (Debian with a 2.6 kernel). We now use CentOS 6 as the NFS client and should retry mounting the mailboxes over NFSv4.
Let me know if NFSv4 works fine for you. Ciao
Alessio Cecchi http://www.linkedin.com/in/alessice
Alessio Cecchi wrote: [...]
> Have you already tried running NFSv4?
It's in place on a (very) small sample of mailboxes in dbox format; no issues up to now (Debian Wheezy, mainline kernel).
> When we switched to NetApp and NFSv4 we had many problems (locking problems and instability) and had to go back to NFSv3 immediately. I don't know whether it was a NetApp problem or an NFS client problem (Debian with a 2.6 kernel). We now use CentOS 6 as the NFS client and should retry mounting the mailboxes over NFSv4.
When you had the problems, was the "delegation feature" active on the NetApp filers? (AFAIK it is disabled by default.)
Did you have a look here for known bugs? https://kb.netapp.com/support/index?id=3014338&page=content
ciao -brd
On 12/06/2015 13:02, brd wrote:
> Alessio Cecchi wrote: [...]
>> Have you already tried running NFSv4?
> It's in place on a (very) small sample of mailboxes in dbox format; no issues up to now (Debian Wheezy, mainline kernel).
Good to know; let me know how it goes as the load grows.
>> When we switched to NetApp and NFSv4 we had many problems (locking problems and instability) and had to go back to NFSv3 immediately. I don't know whether it was a NetApp problem or an NFS client problem (Debian with a 2.6 kernel). We now use CentOS 6 as the NFS client and should retry mounting the mailboxes over NFSv4.
> When you had the problems, was the "delegation feature" active on the NetApp filers? (AFAIK it is disabled by default.)
I never enabled the "delegation feature" on my NetApp.
Probably my issue was on the client side. I had to switch back to NFSv3 immediately, without being able to investigate the problem.
Ciao
Alessio Cecchi http://www.linkedin.com/in/alessice
Just a quick update:
No issues with NFSv4.0 (the load is slowly growing, currently ~7k mailboxes).
Bad news on the delegation front, though: we enabled it for a couple of days but ran into ugly issues. Processes went into "uninterruptible sleep" state, the load average became huge, and a reboot was the only escape :-(
-brd
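For anyone who wants to confirm the "uninterruptible sleep" symptom before rebooting, here is a minimal Python sketch that lists processes in state D by reading /proc/<pid>/stat on a Linux client. The function names are hypothetical, and the sketch only reports which processes are blocked; it does not tell you whether NFS is the cause.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so we split on the first '(' and the last ')'.
    """
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

Running it a few times in a row helps distinguish processes that are briefly waiting on I/O from ones that stay stuck.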
Good to know.
Please keep us updated as you move the mailboxes.
I'm also interested in knowing whether you have found any benefit in switching from NFSv3 to NFSv4 (load on storage, cache benefits or others).
Thanks
-- Alessio Cecchi http://www.linkedin.com/in/alessice
Hi,
yesterday I tried to enable "NFSv4 delegation" on my cluster (I enabled NFSv4.0 and read delegation on the NetApp and remounted the volume on the Linux CentOS 6.7 clients running Dovecot+Director).
It was a failure: after a few minutes the load on the clients was high, and in dovecot.log I found these errors (repeated continuously):
Aug 17 20:28:17 pop01eeh dovecot: imap(info@domain.com): Error: mail_index_wait_lock_fd() failed with file /home/domains/domain.com/info/Maildir/dovecot.index.log: Input/output error
Aug 17 20:28:21 pop01eeh dovecot: imap(info@domain.com): Error: fcntl(/home/domains/domain.com/info/Maildir/dovecot.index.cache, write-lock, F_SETLKW) locking failed: Input/output error
Is it really impossible to run NFSv4 with delegation with Dovecot? If it were possible, the number of NFS operations would drop considerably, with many performance benefits.
-- Alessio Cecchi http://www.linkedin.com/in/alessice
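To see how widespread the lock failures quoted above are, a small Python sketch like the following can tally them per user and per failing call. The regular expression is an assumption built only from the two log entries shown in this thread, so other Dovecot error formats will not match; the names ENTRY, SAMPLE and count_eio are hypothetical.

```python
import re
from collections import Counter

# Matches the two error shapes quoted in the thread and nothing else.
ENTRY = re.compile(
    r"dovecot: (?P<service>\w+)\((?P<user>[^)]+)\): Error: "
    r"(?P<call>mail_index_wait_lock_fd\(\)|fcntl\([^)]*\))"
    r".*?: Input/output error"
)

# The two entries from the thread, one per line.
SAMPLE = (
    "Aug 17 20:28:17 pop01eeh dovecot: imap(info@domain.com): Error: "
    "mail_index_wait_lock_fd() failed with file "
    "/home/domains/domain.com/info/Maildir/dovecot.index.log: "
    "Input/output error\n"
    "Aug 17 20:28:21 pop01eeh dovecot: imap(info@domain.com): Error: "
    "fcntl(/home/domains/domain.com/info/Maildir/dovecot.index.cache, "
    "write-lock, F_SETLKW) locking failed: Input/output error\n"
)


def count_eio(log_text):
    """Count EIO lock failures keyed by (user, failing call name)."""
    counts = Counter()
    for match in ENTRY.finditer(log_text):
        call_name = match["call"].split("(")[0]
        counts[(match["user"], call_name)] += 1
    return counts


if __name__ == "__main__":
    for (user, call), n in sorted(count_eio(SAMPLE).items()):
        print(user, call, n)
```

If the errors are spread across all users rather than a few mailboxes, that points towards the client or mount setup rather than individual index files.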
Hi,
Just out of curiosity, what is it in NFSv4 delegation that you think would benefit your configuration?
If I read back through the thread, you seem to have Dovecot configured with a director ring in front of the backends. In that case Dovecot already manages storage so that only one backend accesses each user's data at a time. So I can't see anything but problems from enabling delegations.
Sorry, but I have zero experience running Dovecot with NFSv4 delegations, since in general they are not needed.
Sami
Hi,
in this typical setup (Dovecot/Director backends that share Maildir via NFS), your NFS server sees about 90% read operations and only 10% write operations.
If you look at the detailed NFS operation statistics, 40-50% are GETATTR. This means that the NFS/Dovecot clients are caching data (mainly Dovecot index files) but have to revalidate the cache frequently, asking the NFS server (via GETATTR) whether the file has changed. The file never changes, though, because only this client opens it.
So the NFS server is wasting operations on (unnecessary) GETATTR requests.
With NFSv4 and delegation you can practically eliminate these GETATTR requests and speed up your NFS server (instead of buying SSD disks).
This is because, with delegation, when a client opens a file and is the only client that has it open (which is true with Director), the NFS server delegates management of the file to that client, which no longer needs to check (via GETATTR) whether the file has changed on the NFS share.
You can find more information (on NFSv4 and delegation) here: http://www.fsl.cs.stonybrook.edu/docs/nfs4perf/nfs4perf-login.pdf
It would be very useful if NFSv4 delegation worked fine with Dovecot.
Please talk about this with Timo.
-- Alessio Cecchi http://www.linkedin.com/in/alessice
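The GETATTR share described above can also be measured from the client side. The following Python sketch assumes a Linux NFS client that exposes per-operation counters in /proc/self/mountstats, where each line after "per-op statistics" starts with the operation name followed by its call count. The numbers in SAMPLE are invented purely for illustration, and the function names are hypothetical.

```python
def op_counts(mountstats_text):
    """Sum the per-operation call counts over all NFS mounts in the text."""
    counts = {}
    in_ops = False
    for line in mountstats_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("device "):
            in_ops = False  # a new mount section begins
        elif stripped == "per-op statistics":
            in_ops = True
        elif in_ops and ":" in stripped:
            name, _, rest = stripped.partition(":")
            fields = rest.split()
            if fields and fields[0].isdigit():
                # the first field is the number of operations issued
                counts[name] = counts.get(name, 0) + int(fields[0])
    return counts


def getattr_share(counts):
    """Fraction of all counted operations that were GETATTR."""
    total = sum(counts.values())
    return counts.get("GETATTR", 0) / total if total else 0.0


# Invented numbers, shaped like a mountstats excerpt.
SAMPLE = """device filer:/vol/mail mounted on /home/domains with fstype nfs4 statvers=1.1
\tper-op statistics
\t        READ: 300 300 0 0 0 0 0 0
\t       WRITE: 100 100 0 0 0 0 0 0
\t     GETATTR: 450 450 0 0 0 0 0 0
\t      LOOKUP: 150 150 0 0 0 0 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/self/mountstats") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the illustrative excerpt
    print(f"GETATTR share: {getattr_share(op_counts(text)):.1%}")
```

Comparing this figure before and after a configuration change gives a client-side view to set against the statistics reported by the filer.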
On 8/18/2015 10:46 AM, Alessio Cecchi wrote:
> This is because, with delegation, when a client opens a file and is the only client that has it open (which is true with Director), the NFS server delegates management of the file to that client, which no longer needs to check (via GETATTR) whether the file has changed on the NFS share.
Enabling delegations on Maildir is going to scale very badly. The NFS client will end up requesting one on every message open, and because an NFS server can only support a limited number of active delegations, it will be forced to constantly recall them, only to issue new short-lived ones to the next message open.
While it's a decent idea for indexes, the client has no way to request delegations selectively for them. Perhaps if you used mdbox where the file count is lower, but even then, there will be many users and many mailboxes so it is important to be sure there are enough delegations available at the NFS server. That's a NetApp question in your case, and not a Dovecot one.
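The scaling argument above can be illustrated with a toy model in Python. It assumes a server that grants one delegation per newly opened file and recalls the least recently used one when a fixed-size table is full. This is not how NetApp or any real NFS server is documented to behave; the capacity, the file names and the LRU policy are all hypothetical, chosen only to show the difference between a one-file-per-message workload and a small set of frequently reopened index files.

```python
from collections import OrderedDict


def simulate(opens, capacity):
    """Replay a sequence of file opens against a bounded delegation table."""
    table = OrderedDict()
    granted = recalled = reused = 0
    for path in opens:
        if path in table:
            table.move_to_end(path)  # delegation still held, no server work
            reused += 1
            continue
        if len(table) >= capacity:
            table.popitem(last=False)  # recall the least recently used one
            recalled += 1
        table[path] = True
        granted += 1
    return {"granted": granted, "recalled": recalled, "reused": reused}


# Maildir-like workload: every open touches a different message file.
maildir_opens = [f"cur/msg{i}" for i in range(1000)]
# Index-like workload: 50 users reopening the same index file repeatedly.
index_opens = [f"user{i % 50}/dovecot.index.log" for i in range(1000)]

if __name__ == "__main__":
    print("maildir:", simulate(maildir_opens, capacity=100))
    print("index:  ", simulate(index_opens, capacity=100))
```

In this model the Maildir-like run recalls a delegation on almost every open once the table is full, while the index-like run never recalls one, which matches the intuition that delegations pay off only for files that are reopened.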
On 18/08/2015 17:25, Tom Talpey wrote:
> Enabling delegations on Maildir is going to scale very badly. The NFS client will end up requesting one on every message open, and because an NFS server can only support a limited number of active delegations, it will be forced to constantly recall them, only to issue new short-lived ones to the next message open.
Sure, but in my test there was no scaling problem: only a few users were online, yet the "lock" errors appeared in dovecot.log immediately.
> While it's a decent idea for indexes, the client has no way to request delegations selectively for them. Perhaps if you used mdbox where the file count is lower, but even then, there will be many users and many mailboxes so it is important to be sure there are enough delegations available at the NFS server. That's a NetApp question in your case, and not a Dovecot one.
One idea could be to have an NFS share for the indexes only and enable delegation on it.
Alessio Cecchi http://www.linkedin.com/in/alessice
On 8/18/2015 11:37 AM, Alessio Cecchi wrote:
> Sure, but in my test there was no scaling problem: only a few users were online, yet the "lock" errors appeared in dovecot.log immediately.
Ok, but I don't see how this is a Dovecot problem. CentOS is returning EIO to the lock request; you need to track down why that's happening.
NFSv4.0 delegations require a callback port to be open on the client. Have you verified that it's set up properly?
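One way to start checking the callback setup on a Linux client is to look at the callback port the kernel NFS client is configured to use. The Python sketch below assumes the `callback_tcpport` parameter of the `nfs` kernel module is exposed under /sys/module/nfs/parameters/ and that a value of 0 means the kernel picks a port dynamically. Both points should be verified against the documentation for your kernel, and the function name is hypothetical. The sketch only reads the setting; it does not test whether the filer can actually reach the port through any firewall.

```python
from pathlib import Path

# Assumed location of the module parameter; verify on your kernel.
PARAM = Path("/sys/module/nfs/parameters/callback_tcpport")


def describe_callback_port(raw):
    """Turn the raw parameter value into a short human-readable note."""
    port = int(raw.strip())
    if port == 0:
        return "callback port is chosen dynamically; a firewall rule cannot pin it"
    return f"callback port is fixed at TCP {port}; the filer must be able to reach it"


if __name__ == "__main__":
    if PARAM.exists():
        print(describe_callback_port(PARAM.read_text()))
    else:
        print(f"{PARAM} not found; the nfs module may not be loaded")
```

If the port is dynamic and there is a firewall between the clients and the filer, fixing the port and opening it explicitly is a reasonable first thing to rule out.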
participants (4)
- Alessio Cecchi
- brd
- Sami Ketola
- Tom Talpey