[Dovecot] keeping indexes in tmpfs
While doing some testing with converting accounts while simulating incoming mail load (no other pop/imap processes going but 4 processes converting users), we found that we were maxing out the local disk in the server with the index activity. To find out that it was the index activity, I mounted a tmpfs for dovecot to keep indexes on, and the system load dropped from 70 to 3 :)
Anyway, Timo and others, do you have any thoughts about keeping indexes on a tmpfs partition? I realize the obvious issues with this: running out of space, and the fact that the partition is lost on reboot. However, we plan to run a cluster of these servers anyway, where we will keep domains/accounts going to a particular server but fail over to one of the others in case of an outage. Recreating the indexes seems pretty trivial when writing to tmpfs anyway, plus we could copy the indexes to disk (restore after reboot) in the case of planned outages... unplanned outages result in another server rebuilding them all anyway.
So, I'm looking for non-obvious issues we may have with this approach. I need to do more testing to see how much memory we would actually need to feasibly keep the indexes there, but otherwise is there a good reason not to do this?
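One rough way to approach the "how much memory" question is to total the index files that already exist on the local disk after a test conversion. The sketch below is only an illustration: the helper name `index_usage` is hypothetical, and it assumes all index files share the `dovecot.index` name prefix (dovecot.index, dovecot.index.log, dovecot.index.cache) somewhere below a single index root.

```python
import os


def index_usage(index_root):
    """Sum the apparent sizes of dovecot.index* files below index_root.

    Returns a dict mapping file name (e.g. "dovecot.index.log") to the
    total number of bytes used by files of that name across all users.
    """
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(index_root):
        for name in filenames:
            if not name.startswith("dovecot.index"):
                continue  # ignore anything that is not an index file
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file was rotated or deleted while scanning
            totals[name] = totals.get(name, 0) + size
    return totals
```

Calling `index_usage("/path/to/indexes")` (path is a placeholder) on the test server after converting a few domains gives a per-file-type total to extrapolate from. tmpfs allocates whole pages, so real usage would be somewhat higher than the sum of apparent sizes, especially with many small files.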
On Mon, 2007-04-16 at 09:37 -0400, Justin McAleer wrote:
While doing some testing with converting accounts while simulating incoming mail load (no other pop/imap processes going but 4 processes converting users), we found that we were maxing out the local disk in the server with the index activity. To find out that it was the index activity, I mounted a tmpfs for dovecot to keep indexes on, and the system load dropped from 70 to 3 :)
How exactly were you converting the users? I guess if the system is building new index files for tons of users that could take a while. fsync_disable=yes could help a lot with reducing the disk writes, and maybe also mmap_disable=yes.
For Dovecot v2.0 I'm planning on reducing at least dovecot.index.log file sizes. I think currently it uses way too much space when building the initial indexes.
Anyway, Timo and others, do you have any thoughts about keeping indexes on a tmpfs partition? I realize the obvious issues with this: running out of space, and the fact that the partition is lost on reboot.
I think memory would be better used for keeping mailbox data cached that's actually useful at the time.
Also if you've POP3 users who keep messages in the server and dovecot.index.cache file is lost, all the messages are read to calculate the messages' virtual sizes when the user logs in the next time. That could be expensive.
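To illustrate why this is expensive: as far as I understand it, a message's "virtual size" is its size with every line ending counted as CRLF, so it cannot be taken from the file size alone and the whole message has to be read. A minimal sketch of that calculation (not Dovecot's actual code; the function name is made up for this example):

```python
def virtual_size(raw: bytes) -> int:
    """Return the size of a message as if every line ended in CRLF.

    Lines already ending in CRLF are counted as-is; each bare LF
    costs one extra byte for the CR that would be added.
    """
    bare_lf = raw.count(b"\n") - raw.count(b"\r\n")
    return len(raw) + bare_lf
```

For a maildir on NFS this means every message file is pulled over the network once per user if the cached sizes are gone, which is the cost being pointed out above.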
Timo Sirainen wrote:
On Mon, 2007-04-16 at 09:37 -0400, Justin McAleer wrote:
While doing some testing with converting accounts while simulating incoming mail load (no other pop/imap processes going but 4 processes converting users), we found that we were maxing out the local disk in the server with the index activity. To find out that it was the index activity, I mounted a tmpfs for dovecot to keep indexes on, and the system load dropped from 70 to 3 :)
How exactly were you converting the users? I guess if the system is building new index files for tons of users that could take a while. fsync_disable=yes could help a lot with reducing the disk writes, and maybe also mmap_disable=yes.
Ok, I'll give more detail about the setup. I plan to use the convert plugin to migrate from CommuniGate to dovecot for our ~160,000 accounts. We will probably migrate one domain at a time, letting users basically migrate their mail at first login. But, we will also go through all the domain's accounts logging in to pop3 just to ensure everything gets moved.
So, for testing, I've copied a few of our domains' mail spools over to the test server and started a script that forks and goes through all their accounts and simply logs in to pop3 and waits for a message list to come back. We have the maildirs being stored on an NFS mount though, and indexes on localdisk, so the only localdisk activity was logging and indexes. I'll try setting both of those options and see how much difference it makes, although I'm not sure that losing mail upon server reset is acceptable for us (disabling fsync).
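For anyone wanting to reproduce this kind of load, a script along these lines would do roughly the same thing. It is a sketch, not the script actually used: it uses a thread pool rather than fork, the names `touch_account` and `touch_all` are made up, and the host and credentials are whatever your test setup provides.

```python
import poplib
from concurrent.futures import ThreadPoolExecutor


def touch_account(host, user, password, connect=poplib.POP3):
    """Log in over POP3, wait for the message list, return the message count."""
    conn = connect(host)
    try:
        conn.user(user)
        conn.pass_(password)
        _resp, listing, _octets = conn.list()  # blocks until the list arrives
        return len(listing)
    finally:
        try:
            conn.quit()
        except Exception:
            conn.close()  # connection already broken; just drop it


def touch_all(host, accounts, workers=4, connect=poplib.POP3):
    """Run touch_account for each (user, password) pair with `workers` threads.

    Returns {user: message_count} with the exception object stored
    instead of a count for accounts that failed.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(touch_account, host, user, pw, connect): user
            for user, pw in accounts
        }
        for future, user in futures.items():
            try:
                results[user] = future.result()
            except (poplib.error_proto, OSError) as exc:
                results[user] = exc
    return results
```

The `workers=4` default mirrors the four concurrent converting processes mentioned earlier; the `connect` parameter exists only so the function can be exercised without a real server.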
For Dovecot v2.0 I'm planning on reducing at least dovecot.index.log file sizes. I think currently it uses way too much space when building the initial indexes.
Anyway, Timo and others, do you have any thoughts about keeping indexes on a tmpfs partition? I realize the obvious issues with this: running out of space, and the fact that the partition is lost on reboot.
I think memory would be better used for keeping mailbox data cached that's actually useful at the time.
Perhaps, but one way or another, we apparently cannot use a simple localdisk for indexes. So, the options are either memory or some sort of raid setup. Since we use maildirs, isn't the disk cache aspect less of a concern?
Also if you've POP3 users who keep messages in the server and dovecot.index.cache file is lost, all the messages are read to calculate the messages' virtual sizes when the user logs in the next time. That could be expensive.
On Mon, 2007-04-16 at 10:41 -0400, Justin McAleer wrote:
Ok, I'll give more detail about the setup. I plan to use the convert plugin to migrate from CommuniGate to dovecot for our ~160,000 accounts. We will probably migrate one domain at a time, letting users basically migrate their mail at first login. But, we will also go through all the domain's accounts logging in to pop3 just to ensure everything gets moved.
OK, that also causes it to read the mailboxes and save the message sizes to cache files.
So, for testing, I've copied a few of our domains' mail spools over to the test server and started a script that forks and goes through all their accounts and simply logs in to pop3 and waits for a message list to come back. We have the maildirs being stored on an NFS mount though, and indexes on localdisk, so the only localdisk activity was logging and indexes.
And reading the mail spools? :)
I'll try setting both of those options and see how much difference it makes, although I'm not sure that losing mail upon server reset is acceptable for us (disabling fsync).
It could be done pretty easily only for index files by modifying the sources. I guess another option should be added for this.
Anyway, Timo and others, do you have any thoughts about keeping indexes on a tmpfs partition? I realize the obvious issues with this: running out of space, and the fact that the partition is lost on reboot.
I think memory would be better used for keeping mailbox data cached that's actually useful at the time.
Perhaps, but one way or another, we apparently cannot use a simple localdisk for indexes. So, the options are either memory or some sort of raid setup. Since we use maildirs, isn't the disk cache aspect less of a concern?
Depends on how mailboxes are accessed. If message contents are read only once then I guess it doesn't matter. Probably the worst offender here is SEARCH TEXT/BODY command.
Timo Sirainen wrote:
On Mon, 2007-04-16 at 10:41 -0400, Justin McAleer wrote:
OK, that also causes it to read the mailboxes and save the message sizes to cache files.
I expected as much. But just to make sure we're on the same page, after converting a user, only dovecot.index and dovecot.index.log exist in a user's index directory, no dovecot.index.cache. However, all of the user's folders do have indexes, not just the inbox. That is the expected result, correct?
And reading the mail spools? :)
I have the source mail spools on NFS as well.
It could be done pretty easily only for index files by modifying the sources. I guess another option should be added for this.
Fair enough, I'll have a look at this if it does make a significant difference, although I'm trying to keep source changes minimal :)
Depends on how mailboxes are accessed. If message contents are read only once then I guess it doesn't matter. Probably the worst offender here is SEARCH TEXT/BODY command.
Understood... that's going to be nasty though, it's just a matter of degree :)
Justin McAleer wrote:
I have the source mail spools on NFS as well.
Just asking - not looking for a flame war. I had a miserable time trying to get NFS working with just my simple LAN - I've had much better results via Samba with either SMBFS or CIFS. Off-topic - but can I ask why you're using NFS?
-- Daniel
Daniel L. Miller wrote:
Justin McAleer wrote:
I have the source mail spools on NFS as well.
Just asking - not looking for a flame war. I had a miserable time trying to get NFS working with just my simple LAN - I've had much better results via Samba with either SMBFS or CIFS. Off-topic - but can I ask why you're using NFS?
Going back to my first mail, the long and short of it is:
"However, we plan to run a cluster of these servers anyway, where we will keep domains/accounts going to a particular server but fail over to one of the others in case of an outage."
We have a new Netapp, and have been running our email off NFS for years without problems. To be fair, we haven't needed shared storage before, but we spent a lot of money on reliable storage, so there the data stayed. What sort of problems have you run into? So far in my dovecot testing I haven't had any issues either.
We did try a few clustered filesystems on a SAN mount instead, but the results were, at best, poor.
Justin McAleer wrote:
Daniel L. Miller wrote:
Justin McAleer wrote:
I have the source mail spools on NFS as well.
Just asking - not looking for a flame war. I had a miserable time trying to get NFS working with just my simple LAN - I've had much better results via Samba with either SMBFS or CIFS. Off-topic - but can I ask why you're using NFS?
Going back to my first mail, the long and short of it is:
"However, we plan to run a cluster of these servers anyway, where we will keep domains/accounts going to a particular server but fail over to one of the others in case of an outage."
We have a new Netapp, and have been running our email off NFS for years without problems. To be fair, we haven't needed shared storage before, but we spent a lot of money on reliable storage, so there the data stayed. What sort of problems have you run into? So far in my dovecot testing I haven't had any issues either.
I had issues just trying to get it to work at all. Either portmap wouldn't start, wouldn't share, wouldn't talk - something. Or I'd get it to work, and then my joke of a wiring closet would get bumped and the clients would freeze when their connection was interrupted. That last was actually one of my biggest problems (admittedly not an NFS fault - but an NFS overreaction). When the wires were repaired NFS settled down - but I had far more success with Samba, including automatically restoring broken connections without having the clients re-mount and/or re-boot.
-- Daniel
participants (3): Daniel L. Miller, Justin McAleer, Timo Sirainen