[Dovecot] performance of maildir on ocfs2
Hi,
I would like to run my IMAP service on an active-active cluster. I wonder how well OCFS2 performs at reading and writing when millions of very small files are involved. Does anybody have any experience?
Thanks, John
On 26.04.2010 14:37, mailinglists@belfin.ch wrote:
Hi,
I would like to run my IMAP service on an active-active cluster. I wonder how well OCFS2 performs at reading and writing when millions of very small files are involved. Does anybody have any experience?
Thanks, John
I have some experience.
I have done some tests and benchmarks here for OCFS2 and NFS4. My understanding is that GFS2's limitations and performance are similar to OCFS2's, so I have not included it in the tests. My benchmarks showed that OCFS2 has an order of magnitude better performance for an IMAP server (for the benchmarks I used imaptest).
Though, there are some concerns:
For better performance you will want to pin each account to one server. If one account is checked concurrently from different nodes of the OCFS2 cluster, the nodes invalidate each other's directory cache for the Maildir, which results in a lot more IO. I use nginx for load balancing incoming POP3/IMAP connections in front of the Dovecot servers.
OCFS2 has a limit of 32k files per directory. There is directory index support in recent kernels (> 2.6.33) that will remove this limit, but the userland tools are not yet ready for production (not in the master branch of ocfs2-tools).
My understanding is that OCFS2 uses a global lock for move/rename. As you know, the Maildir format uses a lot of such operations. I think that the dbox format (Dovecot's native format) will be a better choice, because there are no file moves/renames. I am planning a migration to dbox now. If I had to start the service now, I would choose dbox for mail storage.
Filesystem quota. OCFS2 has support for it in recent kernels. It is integrated in the ocfs2-tools master branch, but there is no release of the tools yet, so I am in no hurry to push it into production.
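A minimal sketch of the nginx mail-proxy setup mentioned above (ports are the protocol defaults; the auth_http endpoint is an assumption, and that auth service is the piece that pins each account to a fixed backend):

```nginx
mail {
    # The auth_http service answers each login with the backend
    # host/port, so a given account is always routed to the same
    # Dovecot node.
    auth_http 127.0.0.1:9000/auth;

    server {
        listen 143;
        protocol imap;
    }

    server {
        listen 110;
        protocol pop3;
    }
}
```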
I hope it helps.
Best regards
-- Luben Karavelov Research and development Spectrum Net JSC
36, D-r G. M. Dimitrov Blvd. 1797 Sofia Mobile: +359 884332140 url: www.spnet.net
On 26.04.2010 14:51, karavelov wrote:
On 26.04.2010 14:37, mailinglists@belfin.ch wrote:
Hi,
I would like to run my IMAP service on an active-active cluster. I wonder how well OCFS2 performs at reading and writing when millions of very small files are involved. Does anybody have any experience?
Thanks, John
I have some experience.
I have done some tests and benchmarks here for OCFS2 and NFS4. My understanding is that GFS2's limitations and performance are similar to OCFS2's, so I have not included it in the tests. My benchmarks showed that OCFS2 has an order of magnitude better performance for an IMAP server (for the benchmarks I used imaptest).
Though, there are some concerns:
For better performance you will want to pin each account to one server. If one account is checked concurrently from different nodes of the OCFS2 cluster, the nodes invalidate each other's directory cache for the Maildir, which results in a lot more IO. I use nginx for load balancing incoming POP3/IMAP connections in front of the Dovecot servers.
OCFS2 has a limit of 32k files per directory. There is directory index support in recent kernels (> 2.6.33) that will remove this limit, but the userland tools are not yet ready for production (not in the master branch of ocfs2-tools).
My understanding is that OCFS2 uses a global lock for move/rename. As you know, the Maildir format uses a lot of such operations. I think that the dbox format (Dovecot's native format) will be a better choice, because there are no file moves/renames. I am planning a migration to dbox now. If I had to start the service now, I would choose dbox for mail storage.
Filesystem quota. OCFS2 has support for it in recent kernels. It is integrated in the ocfs2-tools master branch, but there is no release of the tools yet, so I am in no hurry to push it into production.
So the bottom line is: use a filesystem such as XFS, distribute and dedicate mailboxes to a number of backend IMAP servers, ideally with direct access to the storage, and do IMAP proxying and load balancing in front of those servers.
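The "dedicate mailboxes to backends" idea above can be sketched as a deterministic account-to-server mapping; the backend host names below are hypothetical, and a real setup would return the chosen host from the proxy's authentication service:

```python
import hashlib

# Hypothetical backend pool; in the architecture described above, the
# IMAP proxy would ask an auth service which of these to use.
BACKENDS = ["imap1.example.net", "imap2.example.net", "imap3.example.net"]

def backend_for(user: str) -> str:
    """Deterministically pin an account to one backend, so its mailbox
    is only ever accessed from a single server."""
    digest = hashlib.md5(user.lower().encode("utf-8")).hexdigest()
    return BACKENDS[int(digest, 16) % len(BACKENDS)]
```

Because the mapping depends only on the (case-folded) user name, every proxy node routes a given account to the same backend without shared state.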
I hope it helps
Thanks, it did!
Best wishes
On 26.04.2010 21:42, Philipp Snizek wrote:
So the bottom line is: use a filesystem such as XFS, distribute and dedicate mailboxes to a number of backend IMAP servers, ideally with direct access to the storage, and do IMAP proxying and load balancing in front of those servers.
Then you should also balance incoming mail and local deliveries, and this part is tricky. There should also be heartbeat/pacemaker for filesystem/IP/service failover.
Every choice is a compromise, so you have to balance many factors: complexity, administrative overhead, filesystem limits, performance, stability, etc.
Best regards luben
On Mon, 2010-04-26 at 15:51 +0300, karavelov wrote:
- My understanding is that OCFS2 uses a global lock for move/rename. As you know, the Maildir format uses a lot of such operations. I think that the dbox format (Dovecot's native format) will be a better choice, because there are no file moves/renames. I am planning a migration to dbox now. If I had to start the service now, I would choose dbox for mail storage.
Wonder what the performance difference is then between v2.0's single-dbox and multi-dbox? I'd guess mdbox is faster.
On 29.04.2010 21:02, Timo Sirainen wrote:
On Mon, 2010-04-26 at 15:51 +0300, karavelov wrote:
- My understanding is that OCFS2 uses a global lock for move/rename. As you know, the Maildir format uses a lot of such operations. I think that the dbox format (Dovecot's native format) will be a better choice, because there are no file moves/renames. I am planning a migration to dbox now. If I had to start the service now, I would choose dbox for mail storage.
Wonder what the performance difference is then between v2.0's single-dbox and multi-dbox? I'd guess mdbox is faster.
Here are some benchmarks that were done with imaptest. The commands used are:

imaptest host=rhp2 mbox=dovecot.mbox user=test@example.com pass="test" seed=123 secs=10
imaptest host=rhp2 mbox=dovecot.mbox user=test@example.com pass="test" seed=123 secs=10 logout=0
The volume is an iSCSI export (4 SATA disks in a stripe) mounted on an IMAP test server (no other processes are running). In the OCFS2 setup, the filesystem is also mounted on another node (a 2-node test cluster). The other node was also idle.
Here are my results:
                         Logi List Stat Sele Fetc Fet2 Stor Dele Expu Appe Logo
                         100%  50%  50% 100% 100% 100%  50% 100% 100% 100% 100%
                                              30%                      5%
XFS maildir,   nologout    10  139  130   10  248  350   87  196  248  248
XFS maildir,   logout     227  121  127  227  216  323   60  170  216  221  454
OCFS2 maildir, nologout    10  733  713   10 1438 2094  467 1161 1438 1438
OCFS2 maildir, logout     584  300  282  584  547  780  170  428  547  580 1168
OCFS2 dbox,    nologout    10  930  892   10 1825 2614  527 1489 1825 1825
OCFS2 dbox,    logout     570  290  298  569  564  838  226  452  564  568 1140
DISCLAIMER: The Dovecot server is tuned for best performance with OCFS2, as far as I could manage, because my current production setup is OCFS2-based. XFS is included for comparison without much tuning. The mount options are:
XFS:   noatime,nodiratime,logbufs=8,logbsize=131072
OCFS2: noatime,data=writeback,commit=30
I also tested NFS4, but the results were disappointing, so I abandoned further tests: no amount of tuning could make up a 10x difference.
My expectation is that pushing dbox into production will bring even bigger gains than my tests show, because it will reduce internal OCFS2 locking on moves and renames.
My tests and benchmarks were done using v1.2.11. Maybe I should run some benchmarks for mdbox as well, using Dovecot v2. My understanding is that dbox is forward-compatible with mdbox, so there will be no need to convert mailboxes from dbox to mdbox. Is that so, or will there be another painful migration of mailboxes from one format to another?
Best regards luben
On 1.5.2010, at 0.25, luben karavelov wrote:
My tests and benchmarks were done using v1.2.11. Maybe I should run some benchmarks for mdbox as well, using Dovecot v2. My understanding is that dbox is forward-compatible with mdbox, so there will be no need to convert mailboxes from dbox to mdbox. Is that so, or will there be another painful migration of mailboxes from one format to another?
v1.2 dbox is similar to v2.0's dbox, but not identical. v2.0 dbox is simpler and faster. Also dbox and mdbox are different, although they share some code. http://wiki.dovecot.org/MailboxFormat/dbox
Anyway, v2.0 is supposed to be able to read v1.2's dbox, but 1) I haven't tested it recently and 2) that's only if the dbox doesn't contain any maildir-migration files (so all mail files are u.* files, no maildir files). I'm kind of hoping the dbox/maildir hybrids aren't all that popular and maybe I don't need to worry about them.. :)
On 1.05.2010 00:32, Timo Sirainen wrote:
v1.2 dbox is similar to v2.0's dbox, but not identical. v2.0 dbox is simpler and faster. Also dbox and mdbox are different, although they share some code. http://wiki.dovecot.org/MailboxFormat/dbox
Anyway, v2.0 is supposed to be able to read v1.2's dbox, but 1) I haven't tested it recently and 2) that's only if the dbox doesn't contain any maildir-migration files (so all mail files are u.* files, no maildir files). I'm kind of hoping the dbox/maildir hybrids aren't all that popular and maybe I don't need to worry about them.. :)
I have done some test here.
First, there is a problem with imap + the quota plugin. The corresponding logs:
May 1 03:46:33 rho2 dovecot: imap(luben@test.dpv.bg): Panic: file index-transaction.c: line 145 (index_transaction_rollback): assertion failed: (box->transaction_count > 0 || box->view->transactions == 0)
May 1 03:46:33 rho2 dovecot: imap(luben@test.dpv.bg): Raw backtrace: /usr/lib/dovecot/libdovecot.so.0 [0x7f6e230114c2] -> /usr/lib/dovecot/libdovecot.so.0 [0x7f6e2301152a] -> /usr/lib/dovecot/libdovecot.so.0(i_error+0) [0x7f6e230118d3] -> /usr/lib/dovecot/libdovecot-storage.so.0 [0x7f6e232c0d24] -> /usr/lib/dovecot/modules/lib10_quota_plugin.so [0x7f6e21294021] -> /usr/lib/dovecot/modules/lib10_quota_plugin.so [0x7f6e21293a90] -> /usr/lib/dovecot/libdovecot-storage.so.0(sdbox_sync_begin+0x45e) [0x7f6e232c2bee] -> /usr/lib/dovecot/libdovecot-storage.so.0(sdbox_transaction_save_commit_pre+0x70) [0x7f6e232c33f0] -> /usr/lib/dovecot/libdovecot-storage.so.0 [0x7f6e232c1138] -> /usr/lib/dovecot/libdovecot-storage.so.0(mail_index_transaction_commit_full+0x97) [0x7f6e23290957] -> /usr/lib/dovecot/libdovecot-storage.so.0(index_transaction_commit+0x8b) [0x7f6e232c0dbb] -> /usr/lib/dovecot/modules/lib10_quota_plugin.so [0x7f6e212940a4] -> /usr/lib/dovecot/libdovecot-storage.so.0(mailbox_transaction_commit_get_changes+
May 1 03:46:33 rho2 dovecot: master: service(imap): child 5621 killed with signal 6 (core dumps disabled)
May 1 03:46:33 rho2 dovecot: imap(luben@test.dpv.bg): dbox: File unexpectedly lost: /var/www/149444/mail/122-dbox/mailboxes/INBOX/dbox-Mails/u.3226
...
So I disabled all IMAP plugins except autocreate and ran the dbox/mdbox tests with Dovecot v2.0b4 and with v1.2.11. Here are the results:
                         Logi List Stat Sele Fetc Fet2 Stor Dele Expu Appe Logo
1.2.11 dbox, nologout      10 1111 1036   10 2111 3071  755 1680 2111 2111
1.2.11 dbox, logout       544  272  263  542  538  753  191  426  538  540 1088
2.0b4 dbox,  nologout      10 1182 1182   10 2367 3389  808 1919 2367 2367
2.0b4 dbox,  logout       531  266  265  529  517  720   76  414  517  526 1072
2.0b4 mdbox, nologout      10 1074 1012   10 2087 3045  725 1622 2087 2087
2.0b4 mdbox, logout       504  265  242  503  491  660  135  397  491  502 1012
So, on this test setup, there is not much difference between dbox and mdbox. Maybe other setups will show different results: I have seen different comparative ratios using different servers (pre-Core 64-bit Xeons vs. Core 2 Quad), even with the same storage and filesystem.
When v2 stabilizes, I will consider migrating for the greater flexibility (altstorage, LMTP, etc.).
Best regards and thanks for the great work luben
On Sat, 2010-05-01 at 04:44 +0300, luben karavelov wrote:
First, there is a problem with imap + the quota plugin. The corresponding logs:
May 1 03:46:33 rho2 dovecot: imap(luben@test.dpv.bg): Panic: file index-transaction.c: line 145 (index_transaction_rollback): assertion failed: (box->transaction_count > 0 || box->view->transactions == 0)
..
/usr/lib/dovecot/libdovecot-storage.so.0(sdbox_sync_begin+0x45e) [0x7f6e232c2bee] ->
So this was with single-dbox. What quota configuration did you use? It seems to work for me (now, anyway).
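For context, a minimal v2.0-style quota configuration of the sort being asked about might look like the following sketch (the backend, dict path, and limit are illustrative, not taken from the thread):

```
mail_plugins = $mail_plugins quota
protocol imap {
  mail_plugins = $mail_plugins imap_quota
}
plugin {
  # Per-user quota accounting kept in a file under the home directory;
  # the 1G limit is a placeholder value.
  quota = dict:User quota::file:%h/dovecot-quota
  quota_rule = *:storage=1G
}
```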
participants (5)
- karavelov
- luben karavelov
- mailinglists@belfin.ch
- Philipp Snizek
- Timo Sirainen