[Dovecot] RFC: grouped (f)sync
Timo Sirainen
tss at iki.fi
Thu Jan 6 00:38:55 EET 2011
On 6.1.2011, at 0.27, Attila Nagy wrote:
> With this, 10 syncs would happen in every second from Dovecot LDA processes, meaning if a client connects in t=0 it will immediately got the response 250, if a client connects in t=0.05, it will get the response in 50 ms (in an ideal world, where syncing does not take time), and the committed blocks could accumulate for a maximum of 100 ms.
> In a busy system (where this setting would make sense), it means it would be possible to write more data with less IOPS needed.
I guess this could work. Although earlier I thought about delaying fsyncs so that when saving/copying multiple mails in a single transaction with maildir it would delay about 10 or so close()s and then fsync() them all at the same time at the end. This ended up being slower (but I only tested it with a single user - maybe in real world setups it might have worked better).
> I can see two problems:
> 1. there is no call for committing a lot of file descriptors in one transaction, so instead of fsync() for each of the modified FDs, a sync() would be needed. sync() writes all buffers to stable storage, which is bad if you have a mixed workload, where there are a lot of non-fsynced data, or other heavy fsync users. But modern file systems, like ZFS will write those back too, so there an fsync(fd) is -AFAIK- mostly equivalent with a sync(pool on which fd is). sync() of course is system wide, so if you have other file systems, those will be synced as well. (this setting isn't for everybody)
> 2. in a multiprocess environment this would need coordination, so instead of doing fsyncs in distinct processes, there would be one process needed, which does the sync and returns OK for the others, so they can notify the client about the commit to the stable storage.
It's possible for you to send the fds to another process via UNIX socket that does fsync() on them. I was also hoping for using lib-fs for at least some mailbox formats at some point (either some modified dbox, or a new one), and for that it would be even easier to add an fsync plugin that does this kind of fsync-transfer to another process.
More information about the dovecot
mailing list