https://bugzilla.redhat.com/show_bug.cgi?id=712139
Further investigating this bug, I have tested all kinds of configurations with Dovecot, and all of them end up hanging GFS2. I have reproduced the problem on a bare-metal hardware cluster, on virtualized cluster guests in VMware ESXi 4.1, and on a test cluster in VMware Workstation, so it shows up in every environment we try. We are evaluating whether Dovecot can be deployed on a Red Hat Cluster of active-active nodes with user session persistence. For my last test I simplified the scenario to a cluster running on my own laptop:
1- Used a two-node RHEL 6.1 cluster, virtualized in VMware Workstation.
2- Used two shared iSCSI devices from a NAS.
3- Used fence_scsi.
Cluster.conf
<?xml version="1.0"?>
<cluster config_version="9" name="MailCluster">
  <clusternodes>
    <clusternode name="node0.local" nodeid="1">
      <fence>
        <method name="fn_mt_scsi">
          <device name="fn_scsi"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fn_scsi"/>
      </unfence>
    </clusternode>
    <clusternode name="node1.local" nodeid="2">
      <fence>
        <method name="fn_mt_scsi">
          <device name="fn_scsi"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fn_scsi"/>
      </unfence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_scsi" logfile="/var/log/cluster/fence_scsi.log" name="fn_scsi"/>
  </fencedevices>
</cluster>
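As a sanity check before mounting GFS2 I confirm both nodes are cluster members (these are generic RHEL 6 cluster commands, not taken from my original notes):

cman_tool nodes   # both nodes should be listed with status "M" (member)
clustat           # overall cluster membership and status view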
4- Used the iSCSI devices for the LVM layer and created the GFS2 filesystems on top of it (a rough sketch of the commands follows the fstab fragment).
fstab fragment
# GFS2 filesystems
/dev/vg_indexes/lv_indexes  /var/vmail/indexes  gfs2  noatime,quota=off,errors=withdraw  0 0
/dev/vg_mailbox/lv_mailbox  /var/vmail/mailbox  gfs2  noatime,quota=off,errors=withdraw  0 0
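For reference, roughly how the clustered volumes and GFS2 filesystems above were created (the device names /dev/sdb and /dev/sdc are placeholders for the iSCSI LUNs, the lock-table names follow the mount points, and -j 2 gives one journal per node):

pvcreate /dev/sdb /dev/sdc
vgcreate -cy vg_indexes /dev/sdb
vgcreate -cy vg_mailbox /dev/sdc
lvcreate -n lv_indexes -l 100%FREE vg_indexes
lvcreate -n lv_mailbox -l 100%FREE vg_mailbox
mkfs.gfs2 -p lock_dlm -t MailCluster:indexes -j 2 /dev/vg_indexes/lv_indexes
mkfs.gfs2 -p lock_dlm -t MailCluster:mailbox -j 2 /dev/vg_mailbox/lv_mailbox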
5- Dovecot configured with users in LDAP. In this case we tested the mbox mailbox format with fcntl locking and mmap_disable=yes; we have also tested all the other mailbox formats, with indexes and mailboxes stored on the GFS2 filesystems. Here is the configuration:
[root@node0 ~]# dovecot -n
# 2.0.9: /etc/dovecot/dovecot.conf
# OS: Linux 2.6.32-131.2.1.el6.x86_64 x86_64 Red Hat Enterprise Linux Server release 6.1 (Santiago) gfs2
auth_default_realm = example.com
auth_mechanisms = plain login
auth_worker_max_count = 60
disable_plaintext_auth = no
listen = *
mail_fsync = always
mail_gid = vmail
mail_location = mbox:/var/vmail/mailbox/%d/%3n/%n:INDEX=/var/vmail/indexes/%d/%3n/%n
mail_nfs_index = yes
mail_nfs_storage = yes
mail_uid = vmail
mbox_write_locks = fcntl
mmap_disable = yes
passdb {
  args = /etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
ssl_cert =
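The LDAP lookup details live in /etc/dovecot/dovecot-ldap.conf.ext; they are not relevant to the hang, but for completeness the file is along these lines (server, base DN and filters here are placeholders, not our real values):

uris = ldap://ldap.example.com
auth_bind = yes
base = ou=People,dc=example,dc=com
user_filter = (&(objectClass=posixAccount)(uid=%n))
pass_filter = (&(objectClass=posixAccount)(uid=%n))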
6- Started the dovecot service on both nodes.
7- From another host, ran imaptest against the first node:
imaptest host=192.168.164.95 userfile=userfile port=143 mbox=mail/dovecot-crlf no_tracking logout=0 clients=20 secs=30 seed=123
8- Repeated the test many times against that node.
9- Ran the test against the second node:
imaptest host=192.168.164.96 userfile=userfile port=143 mbox=mail/dovecot-crlf no_tracking logout=0 clients=20 secs=30 seed=123
10- The first node hangs:
GFS2: fsid=MailCluster:indexes.0: fatal: filesystem consistency error
GFS2: fsid=MailCluster:indexes.0:   inode = 468 525144
GFS2: fsid=MailCluster:indexes.0:   function = gfs2_dinode_dealloc, file = fs/gfs2/inode.c, line = 352
GFS2: fsid=MailCluster:indexes.0: about to withdraw this file system
GFS2: fsid=MailCluster:indexes.0: telling LM to unmount
GFS2: fsid=MailCluster:indexes.0: withdrawn
Pid: 3808, comm: delete_workqueu Not tainted 2.6.32-131.2.1.el6.x86_64 #1
Call Trace:
 [<ffffffffa064bfd2>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
 [<ffffffffa0621209>] ? trunc_dealloc+0xa9/0x130 [gfs2]
 [<ffffffffa064c1dd>] ? gfs2_consist_inode_i+0x5d/0x60 [gfs2]
 [<ffffffffa0631584>] ? gfs2_dinode_dealloc+0x64/0x210 [gfs2]
 [<ffffffffa064a1da>] ? gfs2_delete_inode+0x1ba/0x280 [gfs2]
 [<ffffffffa064a0ad>] ? gfs2_delete_inode+0x8d/0x280 [gfs2]
 [<ffffffffa064a020>] ? gfs2_delete_inode+0x0/0x280 [gfs2]
 [<ffffffff8118cfbe>] ? generic_delete_inode+0xde/0x1d0
 [<ffffffffa062e940>] ? delete_work_func+0x0/0x80 [gfs2]
 [<ffffffff8118d115>] ? generic_drop_inode+0x65/0x80
 [<ffffffffa0648c4e>] ? gfs2_drop_inode+0x2e/0x30 [gfs2]
 [<ffffffff8118bf82>] ? iput+0x62/0x70
 [<ffffffffa062e994>] ? delete_work_func+0x54/0x80 [gfs2]
 [<ffffffff810887d0>] ? worker_thread+0x170/0x2a0
 [<ffffffff8108e100>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81088660>] ? worker_thread+0x0/0x2a0
 [<ffffffff8108dd96>] ? kthread+0x96/0xa0
 [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
 [<ffffffff8108dd00>] ? kthread+0x0/0xa0
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
  no_formal_ino = 468
  no_addr = 525144
  i_disksize = 65536
  blocks = 0
  i_goal = 525170
  i_diskflags = 0x00000000
  i_height = 1
  i_depth = 0
  i_entries = 0
  i_eattr = 0
GFS2: fsid=MailCluster:indexes.0: gfs2_delete_inode: -5
If I change to different mailbox formats they also hang; only the kernel messages are slightly different from the ones in the first post. Any ideas? Best regards
2011/6/11 Stan Hoeppner stan@hardwarefreak.com
On 6/10/2011 11:24 PM, Aliet Santiesteban Sifontes wrote:
Hello list, we continue our tests using Dovecot on a RHEL 6.1 cluster backend with GFS2; we are also using Dovecot as a director for user-node persistence. Everything was fine until we started stress testing the solution with imaptest: we got many deadlocks, cluster filesystem corruptions and hangs, especially on the index filesystem. We have configured the backend as if it were an NFS-like setup, but this does not seem to work, at least with GFS2 on RHEL 6.1.
Actual _filesystem_ corruption is typically unrelated to user space applications. You should be looking at a lower level for the cause, i.e. kernel, device driver, hardware, etc. Please post details of your shared storage hardware environment, including HBAs, SAN array brand/type, if you're using GFS2 over DRBD, etc.
We have a two-node cluster sharing two GFS2 filesystems:
- Index GFS2 filesystem to store users indexes
- Mailbox data on a GFS2 filesystem
Experience of many users has shown that neither popular cluster filesystems such as GFS2/OCFS, nor NFS, handle high metadata/IOPS workloads very well, especially those that make heavy use of locking.
The specific settings we used for the NFS/cluster-filesystem case:
mmap_disable = yes
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
fsync_disable = no
lock_method = fcntl
mail location:
mail_location = mdbox:/var/vmail/%d/%3n/%n/mdbox:INDEX=/var/indexes/%d/%3n/%n
For a Dovecot cluster using shared storage, you are probably better off using a mailbox format for which indexes are independent of mailbox files and are automatically [re]generated if absent.
Try using mbox or maildir and store indexes on local node disk/SSD instead of on the cluster filesystem. Only store the mailboxes on the cluster filesystem. If for any reason a user login gets bumped to a node lacking the index files they're automatically rebuilt.
Since dbox indexes aren't automatically generated if missing you can't do what I describe above with dbox storage. Given the limitations of cluster filesystem (and NFS) metadata IOPS and locking, you'll likely achieve best performance and stability using local disk index files and mbox format mailboxes on GFS2. Maildir format works in this setup as well, but the metadata load on the cluster filesystem is much higher, and thus peak performance will typically be lower.
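For example, something along these lines (a sketch only; the paths are illustrative and reuse your current layout, with only the mailbox tree on GFS2 and the index tree on each node's local disk):

# /var/vmail/mailbox = GFS2 (shared), /var/indexes = local disk/SSD on each node
mail_location = mbox:/var/vmail/mailbox/%d/%3n/%n:INDEX=/var/indexes/%d/%3n/%n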
-- Stan