ZFS storage and backup
Hi listmembers,
I am about to migrate our mail services to FreeBSD + ZFS. Before entering the seemingly endless stage of performance testing, I thought I would ask here for any information you can share.
My setups are nothing special and have few users, but I would like a solid setup, so maybe some of you could contribute to this thread. We are using slow spinning disks, but we may consider using SSDs in the not-so-distant future.
*) storage: any information on ZFS options, on whether to use mdbox or sdbox, and on which configs/options to use regarding compression etc.
*) backup: what is best practice regarding backups? Using only the dovecot tools, or leveraging the great features of ZFS such as snapshots (or both)?
Thanks for any information, which will probably save me quite some time evaluating different options!
Robert
On Sun, Nov 14, 2021 at 03:14:44PM +0100, infoomatic wrote:
I am about to migrate our mailservices to FreeBSD + ZFS. Thus, before entering the sheer endless stage of performance testing, I thought I would ask here kindly for all kinds of information.
[..]
*) storages: any infos on ZFS options [..]
In addition to FreeBSD's excellent handbook, plus of course man-pages, you may find the following helpful:
https://arstechnica.com/information-technology/2020/05/zfs-101-understanding...
and
https://jrs-s.net/category/open-source/zfs/
especially
https://jrs-s.net/2015/02/06/zfs-you-should-use-mirror-vdevs-not-raidz/
--
A: When it messes up the order in which people normally read text.
Q: When is top-posting a bad thing?

()  ASCII ribbon campaign. Please avoid HTML emails & proprietary
/\  file formats. (Why? See e.g. https://v.gd/jrmGbS ). Thank you.
On 14 Nov 2021, at 15:15, infoomatic <infoomatic@gmx.at> wrote:
Hi listmembers,
I am about to migrate our mailservices to FreeBSD + ZFS. Thus, before entering the sheer endless stage of performance testing, I thought I would ask here kindly for all kinds of information.
My setups are nothing special with few users, however, I would like to have a nice setup, maybe some of you could contribute to this thread. We are using slow spinning disks, but we may consider using ssds in a not-so-distant future
If performance is a concern, why not now?
*) storages: any infos on ZFS options or whether to use mdbox or sdbox, and what configs/options regarding compression etc.
*) backup: what is a best practice regarding backups? - using only the dovecot tools or leveraging the great features of ZFS (or both) with snapshots etc.?
Thanks for all sorts of infos, probably saving me quite some time evaluating different options!
Robert
On 14/11/2021 14:14, infoomatic wrote:
My setups are nothing special with few users, however, I would like to have a nice setup, maybe some of you could contribute to this thread. We are using slow spinning disks, but we may consider using ssds in a not-so-distant future.
*) storages: any infos on ZFS options or whether to use mdbox or sdbox, and what configs/options regarding compression etc.
OmniOS with ZFS here.
I use maildir - just a personal choice and inertia, I have no performance data, no problem and no reason to change. I like being able to see emails as plain files.
zfs set compress=gzip
with no other changes from the defaults, apart from atime=off on the whole machine. Email gzips well; most other ZFS datasets I leave on lz4. I think it is better to let the file system compress than to have dovecot do it.
$ zfs get compress,compressratio,used ...
NAME              PROPERTY       VALUE  SOURCE
...../..../vmail  compression    gzip   received
...../..../vmail  compressratio  1.82x  -
...../..../vmail  used           8.55G  -
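For readers who want to try the settings described above, here is a minimal sketch. The dataset name tank/vmail is a placeholder and not taken from the thread, and the script only prints the commands unless DRY_RUN=0 is set. It sets atime=off on one dataset only, whereas the poster applied it machine-wide. The thread writes the property as compress, which is the short alias of compression.

```shell
#!/bin/sh
# Sketch only: "tank/vmail" is a hypothetical dataset name, not from the thread.
DATASET="${DATASET:-tank/vmail}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

# Print a command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Apply the mail dataset properties discussed in the message above.
apply_mail_props() {
    # gzip for mail; the poster leaves most other datasets on lz4
    run zfs set compression=gzip "$1"
    # do not record an access time on every read
    run zfs set atime=off "$1"
    # show the effective settings and the achieved compression ratio
    run zfs get compression,compressratio,atime,used "$1"
}

apply_mail_props "$DATASET"
```

A changed compression setting applies to data written afterwards; blocks already on disk keep the compression they were written with.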
25 mailbox users ("nothing special with few users").
I moved the storage from HDD (mirror plus log) to SSD (mirror) and no one noticed, not even me, although I knew it had been done and I access it over a local network. I have enough RAM such that repeated reads are cached.
I will use native ZFS encryption soon. I see no performance issues in test.
Don't get hung up on ZFS tuning, mostly ZFS just works.
*) backup: what is a best practice regarding backups? - using only the dovecot tools or leveraging the great features of ZFS (or both) with snapshots etc.?
I use automated snapshots and zfs send/receive to a remote backup machine. I automatically copy many ZFS datasets this way, so it is minimal effort to do email too.
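The snapshot plus send/receive approach could look roughly like the sketch below. The dataset, host, target, and snapshot names are all placeholders, the poster's actual automation is not shown in the thread, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: dataset, host, target and snapshot names are hypothetical.
SRC="${SRC:-tank/vmail}"
DST_HOST="${DST_HOST:-backuphost}"
DST="${DST:-backup/vmail}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

# Build the send | receive pipeline as a string.
# $1 = previous snapshot name (empty for the first, full send)
# $2 = new snapshot name
replication_cmd() {
    if [ -n "$1" ]; then
        # incremental stream: only blocks changed between the two snapshots
        echo "zfs send -i $SRC@$1 $SRC@$2 | ssh $DST_HOST zfs receive -u $DST"
    else
        # full stream for the initial copy
        echo "zfs send $SRC@$2 | ssh $DST_HOST zfs receive -u $DST"
    fi
}

# Take a new snapshot, then replicate it to the backup machine.
backup_once() {
    snap_cmd="zfs snapshot $SRC@$2"
    repl_cmd="$(replication_cmd "$1" "$2")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $snap_cmd"
        echo "+ $repl_cmd"
    else
        $snap_cmd && sh -c "$repl_cmd"
    fi
}

backup_once "" "auto-2021-11-15"                  # first run: full send
backup_once "auto-2021-11-15" "auto-2021-11-16"   # later runs: incremental
```

The -u flag on the receiving side keeps the received file system unmounted on the backup host. An incremental receive expects the target to be unchanged since the previous snapshot, so the backup copy should be treated as read-only.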
James
On 15.11.21 11:04, James wrote:
I will use native ZFS encryption soon. I see no performance issues in test.
Don't get hung up on ZFS tuning, mostly ZFS just works.
Yes, I know. I love working with it and have used it for more than 10 years now, but it so happens that none of my mail server projects used ZFS. Regarding storage I lean towards sdbox; from what I have read, it seems to be a better option than mdbox on a COW filesystem. One more question: compression at the file system level or in dovecot storage?
The reason I am not sure about switching to SSDs is that most servers are for non-profit organisations, sports clubs etc. They also need some storage for pictures and their budget is quite low (so performance testing would only be done out of my own interest), and if spinning rust with optimized settings suffices, why not?
Thanks for your input so far, hope more will come ;-)
On 15/11/2021 16:18, infoomatic wrote:
Regarding storage I tend to use sdbox, from what I have read it seems to be the better option when using a COW filesystem compared to mdbox. One more
https://doc.dovecot.org/admin_manual/mailbox_formats/
sdbox: single-dbox, one message per file.
mdbox: multi-dbox, multiple messages per file.
So I guess sdbox is better with ZFS. I could test each, but I expect to find that the IO used by dovecot is low for both. I have one user with 32,164 emails in INBOX and IO is not a problem.
question is: compression at file system level or in dovecot storage?
System. The OS compresses using all CPUs in a separate process; does dovecot? Dovecot built without compression is smaller and simpler (--with-zlib=no etc). You can change the ZFS compression at any time. Text files remain plain text files even though they are compressed on disc.
When available, zstd in ZFS should be a better option than gzip.
The reason I am not sure to switch to ssds is that most servers are for non-profit organisations, sports clubs etc. - they also need some storage for pictures, their budget is quite low (so performance testing would only be done out of my interest), and if spinning rust with optimized settings suffices why not.
As you already have the HDDs, wait until there is a problem before fixing it. Over the internet I doubt anyone will notice or, more importantly, care enough to pay. Your HDDs might be old and about to fail, so other factors rise in importance. Data security and continuity of service are more important than latency.
Do you have enough RAM for read cache? A separate log for writes? L2ARC will only help if you have more active data than fits in RAM.
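One rough way to approach the read-cache question is to look at the ARC hit ratio before buying hardware. The sketch below assumes the FreeBSD sysctl names kstat.zfs.misc.arcstats.hits and kstat.zfs.misc.arcstats.misses; other platforms expose these counters differently. The pool name tank and the DEV_* device names are placeholders, and the zpool commands are only printed, never run.

```shell
#!/bin/sh
# Sketch only: pool and device names are hypothetical placeholders.

# ARC hit ratio in whole percent from the hit and miss counters.
arc_hit_pct() {
    hits="$1"
    misses="$2"
    total=$((hits + misses))
    if [ "$total" -le 0 ]; then
        echo 0
        return
    fi
    echo $((hits * 100 / total))
}

# Read the counters where the FreeBSD-style sysctl nodes exist.
if sysctl -n kstat.zfs.misc.arcstats.hits >/dev/null 2>&1; then
    h="$(sysctl -n kstat.zfs.misc.arcstats.hits)"
    m="$(sysctl -n kstat.zfs.misc.arcstats.misses)"
    echo "ARC hit ratio: $(arc_hit_pct "$h" "$m")%"
else
    echo "ARC counters not available on this host"
fi

# Printed for reference only, not executed:
echo "+ zpool add tank log mirror DEV_A DEV_B   # separate log device for synchronous writes"
echo "+ zpool add tank cache DEV_C              # L2ARC, only useful if active data exceeds RAM"
```

A consistently high hit ratio suggests that RAM already covers the active data, in which case an L2ARC device is unlikely to help.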
James
On Fri, 19 Nov 2021, James wrote:
On 15/11/2021 16:18, infoomatic wrote:
Regarding storage I tend to use sdbox, from what I have read it seems to be the better option when using a COW filesystem compared to mdbox. One more
https://doc.dovecot.org/admin_manual/mailbox_formats/
sdbox: single-dbox, one message per file.
mdbox: multi-dbox, multiple messages per file.
so I guess sdbox is better with ZFS. I could test each but I think I will find the IO used by dovecot is low for each. I have one user with 32,164 emails in INBOX and IO is not a problem.
It depends on which aspect of performance you're talking about and how it is implemented, but as I understand it, ZFS snapshots work at the block level. As long as mdbox leaves message blocks in situ (by manipulating indices instead?) and doesn't shuffle them around, unchanged messages won't bloat snapshot storage. That is unlike mbox, where inserting or deleting a single message at the beginning causes the entire mailbox to end up in snapshot storage.
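Whether a given mailbox format bloats snapshot storage can be checked empirically rather than guessed. A minimal sketch follows, assuming a placeholder dataset name tank/vmail; it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: "tank/vmail" is a hypothetical dataset name, not from the thread.
DATASET="${DATASET:-tank/vmail}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

# Print a command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show how much space snapshots of a dataset hold.
snapshot_usage() {
    # per snapshot: "used" is the space unique to that snapshot
    run zfs list -r -t snapshot -o name,used,referenced -s creation "$1"
    # totals: space held only by snapshots versus by the live dataset
    run zfs get usedbysnapshots,usedbydataset "$1"
}

snapshot_usage "$DATASET"
```

If usedbysnapshots grows much faster than the volume of new and deleted mail, the mailbox format is probably rewriting existing blocks.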
question is: compression at file system level or in dovecot storage?
This relates to my comment -- if the compression is done at the message level rather than the whole MDBOX, the above is not applicable as any change to a byte will affect all subsequent bytes.
I think MDBOX is a compromise in data granularity that tries to strike a balance between various aspects of I/O performance.
Joseph Tam <jtam.home@gmail.com>
participants (5)
- infoomatic
- James
- Joseph Tam
- Sam Kuper
- William Edwards