Btrfs RAID-10 performance
Miloslav Hůla
miloslav.hula at gmail.com
Tue Sep 15 12:14:03 EEST 2020
On 10.09.2020 at 17:40, John Stoffel wrote:
>>> So why not run the backend storage on the Netapp, and just keep the
>>> indexes and such local to the system? I've run Netapps for many years
>>> and they work really well. And then you'd get automatic backups using
>>> scheduled snapshots.
>>>
>>> Keep the index files local on disk/SSDs and put the maildirs out to
>>> NFSv3 volume(s) on the Netapp(s). Should do wonders. And you'll stop
>>> needing to do rsync at night.
>
> Miloslav> It's an option we have in mind. As you wrote, NetApp is very solid.
> Miloslav> The main reason for local storage is that the IMAP server is completely
> Miloslav> isolated from the network. But maybe one day we will use it.
>
> It's not completely isolated, it can rsync data to another host that
> has access to the Netapp. *grin*
:o)
> Miloslav> Unfortunately, to quickly fix the problem and make the server
> Miloslav> usable again, we already added an SSD and moved the indexes onto
> Miloslav> it. So we have no measurements from the old state.
>
> That's ok, if it's better, then it's better. How is the load now?
> Looking at the output of 'iostat -x 30' might be a good thing.
Load is between 1 and 2. We can live with that for now.
> Miloslav> The situation is better, but I guess the problem still exists. It
> Miloslav> takes some time for the load to grow. We will see.
>
> Hmm... how did you setup the new indexes volume? Did you just use
> btrfs again? Did you mirror your SSDs as well?
Yes. We just put two SSDs into free slots, exposed them to the OS as two
RAID-0 volumes, and put btrfs RAID-1 on top.
It is nasty, I know, but it needed no outage. It is just a quick attempt to
improve the situation. Our next plan is to buy more controllers,
schedule an outage on a weekend, and do it properly.
> Do the indexes fill the SSD, or is there 20-30% free space? When an
> SSD gets fragmented, its performance can drop quite a bit. Did you
> put the SSDs onto a separate controller? Probably not. So now you've
> just increased the load on the single controller, when you really
> should be spreading it out more to improve things.
The SSDs are almost empty: 2.4 GB of 93 GB is used after running
'doveadm index' on all mailboxes.
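
For reference, 2.4 GB used out of 93 GB leaves roughly 97% free, well
above the 20-30% asked about. A minimal Python sketch of that check
follows. The function names are made up for illustration, and
shutil.disk_usage() reports what statvfs sees, which on a btrfs RAID-1
volume is only an approximation of the space really available.

```python
import shutil


def free_fraction(used, total):
    """Fraction of capacity still free, given used and total in the same unit."""
    return 1.0 - used / total


def mount_free_fraction(path):
    """Free fraction of the filesystem holding `path`, as seen by statvfs.

    Assumption: statvfs figures are good enough for a rough check; on btrfs
    they do not account exactly for RAID profiles or metadata.
    """
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


# Figures from the message above: 2.4 GB used of 93 GB.
ssd_free = free_fraction(2.4, 93)
```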
> Another possible hack would be to move some stuff to a RAM disk,
> assuming your server is on a UPS/Generator in case of power loss. But
> that's an unsafe hack.
>
> Also, do you have quotas turned on? That's a performance hit for
> sure.
No, we are running without quotas.
> Miloslav> Thank you for the fio tip. I'll definitely try that.
>
> It's a good way to test and measure how the system will react.
> Unfortunately, you will need to do your testing outside of normal work
> hours so as to not impact your users too much.
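
As a rough sketch of what such a test could look like, the Python below
only builds an fio command line for a 4 KiB random-read run and pulls
the read IOPS out of fio's JSON report. Nothing here was run on the
server in this thread: the job name, file path, size, queue depth and
runtime are illustrative assumptions, and the JSON field names are the
ones fio 3.x is understood to emit.

```python
import json


def fio_randread_cmd(test_file, size="1G", runtime_s=60):
    """Build (but do not run) an fio command for a 4 KiB random-read test.

    All parameter values are illustrative; tune them to the workload.
    """
    return [
        "fio",
        "--name=randread-test",        # hypothetical job name
        f"--filename={test_file}",     # scratch file on the volume under test
        "--rw=randread",
        "--bs=4k",
        f"--size={size}",
        "--ioengine=libaio",
        "--iodepth=16",
        "--direct=1",                  # bypass the page cache
        f"--runtime={runtime_s}",
        "--time_based",
        "--output-format=json",
    ]


def read_iops(fio_json_text):
    """Return {job name: read IOPS} from fio's JSON output."""
    report = json.loads(fio_json_text)
    return {job["jobname"]: job["read"]["iops"] for job in report["jobs"]}
```

The command list can be handed to subprocess.run(..., capture_output=True,
text=True) and the resulting stdout to read_iops(), outside working hours
as suggested above.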
>
> Good luck! Please post some numbers if you get them. If you see
> only a few disks are 75% or more busy, then *maybe* you have a bad
> disk in the system, and moving off that disk or replacing it might
> help. Again, hard to know.
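
The "75% or more busy" check can be sketched in Python by parsing the
text that 'iostat -x' prints. This is a minimal sketch, assuming the
sysstat layout in which each device table starts with a header line
beginning with "Device" and has a "%util" column; the function name and
the default threshold are made up for illustration.

```python
def busy_devices(iostat_text, threshold=75.0):
    """Return {device: %util} for devices at or above `threshold`.

    Only the last device table in the text is kept, since the first
    report from iostat covers the time since boot.
    """
    result = {}
    util_col = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            util_col = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":
            # Header row: remember where %util is and start a fresh report.
            util_col = fields.index("%util") if "%util" in fields else None
            result = {}
            continue
        if util_col is not None and len(fields) > util_col:
            # Some locales print a decimal comma.
            util = float(fields[util_col].replace(",", "."))
            if util >= threshold:
                result[fields[0]] = util
    return result
```

Feeding it the output of 'iostat -x 30 2' (two reports, 30 seconds
apart) gives the devices that were busy during the second interval.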
>
> Rebalancing btrfs might also help, especially now that you've moved
> the indexes off that volume.
>
> John
Thank you
Milo