[Dovecot] Dovecot write activity (mostly 1.1.x)
I'm experiencing write activity that's somewhat different from my previous qmail/courier-imap/Maildir setup. This is more pronounced in v1.1.x than in v1.0.x (I'm using Maildir).
Write activity is about half that of read activity when measuring throughput, but when measuring operations it's about 5-7 times as high (measured with zpool iostat on ZFS).
I think this might be due to the many small updates to index and cache files. Anyway, since writing is much more demanding on the disk than reading, Dovecot ends up being slower (only a little, though) than my old qmail/courier-imap/Maildir setup was. And the old setup didn't even benefit from indexes like Dovecot does. (What I mean by slower is that it can service fewer users before it hits the wall.)
Of course there are also lots of benefits to using Dovecot. I'm just wondering whether this is something that should be focused on for later versions (maybe writes could be grouped together or something like that). Dovecot is very cheap on the CPU side, so the only real limit in terms of scalability is the storage.
Regards, Mikkel
On Sun, 2007-11-04 at 13:02 +0100, mikkel@euro123.dk wrote:
Write activity is about half that of read activity when measuring throughput. But when measuring operations it’s about 5-7 times as high (measured with zpool iostat on ZFS).
Have you tried with fsync_disable=yes? ZFS's fsyncing performance apparently isn't all that great. I'm guessing this is also the reason for the I/O stalls in your other mail.
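(For readers following along: the setting Timo refers to goes in dovecot.conf. A minimal sketch of the v1.x option as the thread uses it:)

```
# dovecot.conf (v1.x) -- skip fsync()/fdatasync() calls.
# Only safe if you can tolerate losing the last few seconds of
# index/mailbox changes after a crash or power failure.
fsync_disable = yes
```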
On Sun, November 4, 2007 1:51 pm, Timo Sirainen wrote:
On Sun, 2007-11-04 at 13:02 +0100, mikkel@euro123.dk wrote:
Write activity is about half that of read activity when measuring throughput. But when measuring operations it's about 5-7 times as high (measured with zpool iostat on ZFS).
Have you tried with fsync_disable=yes? ZFS's fsyncing performance apparently isn't all that great. I'm guessing this is also the reason for the I/O stalls in your other mail.
I'm using fsync_disable=yes already. I know ZFS has issues. In my opinion it was never ready when it was released, but it has such nice features that I'm trying to cope with its peculiarities. I've also disabled "flush cache requests" in ZFS since they made performance horrible.
If I'm the only one experiencing this then I guess I'll just have to accept it as yet another ZFS curiosity :|
(Possibly this is also the answer to my other post regarding stalled/delayed I/O)
- Mikkel
On Sun, 2007-11-04 at 14:10 +0100, mikkel@euro123.dk wrote:
Write activity is about half that of read activity when measuring throughput. But when measuring operations it’s about 5-7 times as high (measured with zpool iostat on ZFS).
Have you tried with fsync_disable=yes? ZFS's fsyncing performance apparently isn't all that great. I'm guessing this is also the reason for the I/O stalls in your other mail.
I'm using fsync_disable=yes already.
Well, if you use only clients that don't really need indexes, the indexes could just slow things down. You could try disabling them to see how it works then (add :INDEX=MEMORY to mail_location).
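(For reference, the suggested change is a suffix on the mail_location setting; the path below is illustrative:)

```
# dovecot.conf -- keep mails in Maildir on disk, but hold the
# index files only in memory instead of writing them out
mail_location = maildir:~/Maildir:INDEX=MEMORY
```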
Although v1.1 should be writing less to disk than v1.0, because v1.1 no longer updates dovecot.index as often. v1.1 versions before beta6 had some problems updating the cache file, though. They could have updated it more often than necessary, and beta5 didn't really update it at all.
(Possibly this is also the answer to my other post regarding stalled/delayed I/O)
You could truss the hanging process to see what it's doing.
On Sun, November 4, 2007 2:20 pm, Timo Sirainen wrote:
Well, if you use only clients that don't really need indexes they could just slow things down. You could try disabling indexes to see how it works then (:INDEX=MEMORY to mail_location).
I tried that earlier and it did result in fewer writes. It would be a nice-to-have option to be able to individually tell deliver, imapd and popd whether they should update indexes and cache.
Although v1.1 should be writing less to disk than v1.0, because v1.1 doesn't anymore update dovecot.index as often. v1.1 versions before beta6 had some problems updating cache file though. They could have updated it more often than necessary, and beta5 didn't really even update it at all.
Okay, then I really need to wait and see if things change (it'll probably take a few days before the majority of e-mail accounts are re-indexed and cached).
By the way, writes increased noticeably when I upgraded from 1.0 to 1.1. On the other hand, reads decreased a lot as well. I guess the fail-to-update-cache bug you mentioned could have a lot to do with that.
(Possibly this is also the answer to my other post regarding stalled/delayed I/O)
You could truss the hanging process to see what it's doing.
It's not an easy task since the delay is sometimes just a few (5-10) seconds. And when there is a complete stall the client aborts before I can find the process. But I'll give it a go.
Thanks for your input.
On Sun, November 4, 2007 2:54 pm, mikkel@euro123.dk wrote:
You could truss the hanging process to see what it's doing.

It's not an easy task since the delay is sometimes just a few (5-10) seconds. And when there is a complete stall the client aborts before I can find the process. But I'll give it a go.
I tried trussing a normal running process. Here's what I see all the time:
stat64("[path]/Maildir/new", 0xFFBFF470) = 0
stat64("[path]/Maildir/cur", 0xFFBFF4E0) = 0
stat64("[path]/Maildir/new", 0xFFBFF2F0) = 0
stat64("[path]/Maildir/cur", 0xFFBFF258) = 0
stat64("[path]/Maildir/dovecot-uidlist", 0xFFBFF1D0) = 0
chown("[path]/Maildir/dovecot-uidlist", 105, -1) = 0
stat64("[path]/Maildir/dovecot-uidlist", 0xFFBFF2F0) = 0
stat64("[path]/Maildir/dovecot.index.log", 0xFFBFDAE0) = 0
chown("[path]/Maildir/dovecot.index.log", 105, -1) = 0
stat64("[path]/Maildir/dovecot.index.log", 0xFFBFDBF0) = 0
What I notice is that stat64 is very often called twice in a row on the same file. I also notice that chown() is always run before a file is accessed.
Regarding chown, it looks like Dovecot either thinks that the file doesn't have the ownership it should have, or it just calls chown anyway to be sure.
I'm not a C programmer, so I have no idea whether it's supposed to be like that. But if it isn't, then perhaps it could explain the many writes (chowning constantly)?
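(As an illustration of why the observation matters, not Dovecot's actual code: even a chown() that changes nothing is still a metadata operation that the filesystem has to process. A minimal Python sketch of the same stat/chown/stat pattern seen in the truss output, using a scratch file and a no-op chown to the file's current owner:)

```python
import os
import tempfile

# Create a scratch file to stand in for dovecot-uidlist.
fd, path = tempfile.mkstemp()
os.close(fd)

# Mirror the truss pattern: stat, chown, stat again. The chown is a
# no-op here -- we set the file to its current owner, and gid -1
# means "leave the group unchanged" (same convention as chown(2)).
before = os.stat(path)
os.chown(path, before.st_uid, -1)
after = os.stat(path)

# Ownership is unchanged, yet the chown() call still went down to the
# filesystem as a metadata operation.
assert (before.st_uid, before.st_gid) == (after.st_uid, after.st_gid)

os.unlink(path)
```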
What do you think?
On Sun, 2007-11-04 at 15:27 +0100, mikkel@euro123.dk wrote:
chown("[path]/Maildir/dovecot-uidlist", 105, -1) = 0
stat64("[path]/Maildir/dovecot-uidlist", 0xFFBFF2F0) = 0
stat64("[path]/Maildir/dovecot.index.log", 0xFFBFDAE0) = 0
chown("[path]/Maildir/dovecot.index.log", 105, -1) = 0
stat64("[path]/Maildir/dovecot.index.log", 0xFFBFDBF0) = 0
What I notice is that stat64 is very often called twice in a row on the same file. I also notice that chown() is always run before a file is accessed.
Regarding chown, it looks like Dovecot either thinks that the file doesn't have the ownership it should have, or it just calls chown anyway to be sure.
I'm not sure about the double stat(), but chown() most likely means you've set mail_nfs_index=yes. I'll see if I can do something about the second stat, but it's unlikely that it makes any difference to disk I/O.
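(For anyone following along: mail_nfs_index is a v1.1 setting, and per Timo's diagnosis it is what triggers the repeated chown() calls. If the index files are not actually on NFS, it can stay at its default:)

```
# dovecot.conf (v1.1) -- only needed when index files live on NFS;
# leaving it off avoids the extra per-access chown()/flush work
mail_nfs_index = no
```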
participants (2):
- mikkel@euro123.dk
- Timo Sirainen