On Sat, 2011-06-11 at 00:24 -0400, Aliet Santiesteban Sifontes wrote:
Hello list, we are continuing our tests of Dovecot on a RHEL 6.1 cluster backend with GFS2. We are also using Dovecot as a director for user node persistence. Everything was fine until we started stress testing the setup with imaptest: we saw many deadlocks, cluster filesystem corruption and hangs, especially on the index filesystem. We configured the backends as for an NFS-like setup, but this does not seem to work, at least on GFS2 on RHEL 6.1.
Since you're using director, you shouldn't really need any special Dovecot config.
The specific configs for NFS or cluster filesystem we used:
mmap_disable = yes
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
fsync_disable = no
lock_method = fcntl
fsync_disable is a deprecated setting, which was replaced by mail_fsync. The mail_nfs_* settings will only slow things down, you should keep them as "no".
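As a rough sketch of what that would look like, assuming the other posted settings (mmap_disable, mail_fsync, lock_method) are simply carried over unchanged rather than being a recommendation in themselves:

```
# Sketch only: settings carried over from the posted config,
# with the two changes suggested above applied.
mmap_disable = yes        # kept as posted
mail_fsync = always       # kept as posted; this is what replaced fsync_disable
mail_nfs_storage = no     # was "yes"; not needed when director pins users to a node
mail_nfs_index = no       # was "yes"; same reasoning
lock_method = fcntl       # kept as posted
# fsync_disable is dropped entirely, since it is deprecated
```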
... If you mmap() a file on GFS2 with a read/write mapping, but only read from it, this only counts as a read. On GFS though, it counts as a write, so GFS2 is much more scalable with mmap() I/O...
But our config uses mmap_disable=yes. Do we have to use mmap_disable=no with GFS2?
There are more potential bugs with mmap_disable=no, since it uses both read()/write() and mmap(), while with mmap_disable=yes it only uses read()/write().
Also, how does Dovecot manage cache flushing on a GFS2 filesystem?
There shouldn't be any need for that with directors.
Why do the Dovecot indexes get corrupted if we are doing user node persistence?
Looks to me like GFS is still pretty buggy.
One thing you could test is whether running imaptest directly against one backend server for one user triggers this. If not, simultaneously run another imaptest against another user on another server. Maybe that triggers it? The point is to find the simplest test that can break GFS, and once you have that, try to get the Red Hat people to fix it.