Hi,
I'm evaluating a switch from NetApp to a ZFS appliance (such as Qsan). Our setup is Dovecot with Maildir for email storage, and NFS to share mailboxes (more than 30k users) across the POP/IMAP and MX servers.
NetApp NFS works fine even under high load, but it limits the number of inodes per volume and is expensive (although their prices have dropped recently).
From what I have read, ZFS recommends creating many small RAID groups to increase IOPS, but isn't this configuration (N RAID groups instead of a single RAID-DP as on NetApp) more complex to manage?
Does anyone have experience with ZFS and NFS (v3) in high-load environments?
Thanks
Alessio Cecchi Postmaster @ http://www.qboxmail.it https://www.linkedin.com/in/alessice
On 11/03/2016 at 11:22 a.m., Alessio Cecchi wrote:
Hi,
I'm evaluating a switch from NetApp to a ZFS appliance (such as Qsan). Our setup is Dovecot with Maildir for email storage, and NFS to share mailboxes (more than 30k users) across the POP/IMAP and MX servers.
NetApp NFS works fine even under high load, but it limits the number of inodes per volume and is expensive (although their prices have dropped recently).
From what I have read, ZFS recommends creating many small RAID groups to increase IOPS, but isn't this configuration (N RAID groups instead of a single RAID-DP as on NetApp) more complex to manage?
Does anyone have experience with ZFS and NFS (v3) in high-load environments?
Thanks
Be careful not to do any synchronous writes under ZFS. Every sync write can incur up to 3 seconds of latency (under FreeBSD; I didn't test ZFS on Linux). I'm using it in a 3k-user environment and it works great with a 4 TB RAID 10 and the Dovecot cache files on an SSD.
Regards, Juan.
On 3/11/2016 9:58 AM, Juan Bernhard juan@inti.gob.ar wrote:
Be careful not to do any synchronous writes under ZFS. Every sync write can incur up to 3 seconds of latency (under FreeBSD; I didn't test ZFS on Linux). I'm using it in a 3k-user environment and it works great with a 4 TB RAID 10 and the Dovecot cache files on an SSD.
From what I've heard, you should not use hardware-based RAID (e.g., RAID 10) setups with ZFS; you should let ZFS handle it.
Maybe that is the source of your latency issues?
On 11/03/2016 at 12:04 p.m., Charles Marcus wrote:
From what I've heard, you should not use hardware-based RAID (e.g., RAID 10) setups with ZFS; you should let ZFS handle it.
Maybe that is the source of your latency issues?
It is actually two ZFS RAID 1 mirrors in the same zpool, and the SSD is local, not in the ZFS storage. The latency came from trying to use ZFS as a datastore for VMware over NFS, and NFS under VMware is always synchronous. Performance improved 10 times under iSCSI.
On Fri, 11 Mar 2016 11:58:00 -0300 Juan Bernhard juan@inti.gob.ar wrote:
Be careful not to do any synchronous writes under ZFS. Every sync write can incur up to 3 seconds of latency (under FreeBSD; I didn't test ZFS on Linux). I'm using it in a 3k-user environment and it works great with a 4 TB RAID 10 and the Dovecot cache files on an SSD.
Regards, Juan.
zfs set sync=disabled ?
-- Regards, Stephan
On Sun, 13 Mar 2016 09:45:06 +0000 James lista@xdrv.co.uk wrote:
On 11/03/2016 15:17, Stephan von Krawczynski wrote:
zfs set sync=disabled ?
Only if you are happy to lose data on power failure.
I don't know the actual setup, but if you have no UPC you shouldn't host email services anyway.
-- Regards, Stephan
On Sun, 13 Mar 2016 11:47:23 +0100 Stephan von Krawczynski skraw@ithnet.com wrote:
I don't know the actual setup, but if you have no UPC you shouldn't host email services anyway.
That should read "UPS" of course ...
-- Regards, Stephan
On 13/03/2016 20:47, Stephan von Krawczynski wrote:
I don't know the actual setup, but if you have no UPC you shouldn't host email services anyway.
I'm guessing you meant UPS. Anyway, a UPS won't protect you from human error.
Also, most buildings, at least in this country, have a fire emergency shutoff requirement, meaning mains is isolated from the building, and the backup gennies are also forbidden to be engaged - UPSes don't last forever.
-- If you have the urge to reply to all rather than reply to list, you best first read http://members.ausics.net/qwerty/
On Mon, 14 Mar 2016 09:32:42 +1000 Noel Butler noel.butler@ausics.net wrote:
I'm guessing you meant UPS. Anyway, a UPS won't protect you from human error.
Also, most buildings, at least in this country, have a fire emergency shutoff requirement, meaning mains is isolated from the building, and the backup gennies are also forbidden to be engaged - UPSes don't last forever.
Guys, please don't argue on kindergarten level. The UPS is for backing a sudden death, but not for running five days. Of course you can do a controlled shutdown if battery level falls below a trigger value. And this is about all you need: control. There is no fs error as long as you perform a regular shutdown. If UPS-backup is forbidden in your country then I suggest moving to civilized regions of the planet ;-)
-- Regards, Stephan
On 14/03/2016 09:59, Stephan von Krawczynski wrote:
Guys, please don't argue on kindergarten level. The UPS is for backing a sudden death, but not for running five days. Of course you can do a controlled shutdown if battery level falls below a trigger value. And this is about all you need: control. There is no fs error as long as you perform a regular
And you've never seen these cause problems with a filesystem? Then you must be a newbie; in over 25 years I've seen it happen several times - yes, even after an apparently controlled shutdown.
shutdown. If UPS-backup is forbidden in your country then I suggest moving to civilized regions of the planet ;-)
Now who's on kindergarten level? Do you really want firemen pouring water on a fire on a level of a building that's powered up because some lamer has a generator running? Really? I'm sure those firemen would gladly hand YOU the hose. The best UPS runtime we've seen under average load for a large ISP data centre is 21 minutes, usually ample time to allow the generators to start up, come to full power, and switch in to take over the load, but that's not going to help during a building fire: once they're depleted, they're depleted.
-- If you have the urge to reply to all rather than reply to list, you best first read http://members.ausics.net/qwerty/
On Mon, 14 Mar 2016 16:59:28 +1000 Noel Butler noel.butler@ausics.net wrote:
And you've never seen these cause problems with a filesystem? Then you must be a newbie; in over 25 years I've seen it happen several times - yes, even after an apparently controlled shutdown.
Maybe you're doing something wrong then, because in my last 21 years working in exactly this business I've not seen a single deadly fs crash because of a power outage. Not one. And of course we had several outages, all backed by UPS.
Now who's on kindergarten level? Do you really want firemen pouring water on a fire on a level of a building that's powered up because some lamer has a generator running? Really? I'm sure those firemen would gladly hand YOU the hose. The best UPS runtime we've seen under average load for a large ISP data centre is 21 minutes, usually ample time to allow the generators to start up, come to full power, and switch in to take over the load, but that's not going to help during a building fire: once they're depleted, they're depleted.
If your servers get drowned with water during a fire, your fs is probably the least of your worries. You don't really plan to re-enable servers with water or fire damage, do you? That's probably why there shouldn't be a fireman pouring water in the first place. Please let's stop this here, as it has pretty much nothing to do with Dovecot...
-- Regards, Stephan
On 14/03/2016 18:49, Stephan von Krawczynski wrote:
And you've never seen these cause problems with a filesystem? Then you must be a newbie; in over 25 years I've seen it happen several times - yes, even after an apparently controlled shutdown.
Maybe you're doing something wrong then, because in my last 21 years working in exactly this business I've not seen a single deadly fs crash because of a power outage. Not one. And of course we had several outages, all backed by UPS.
Consider yourself lucky. Most network admins who've been around large busy ISP DC's have seen this in their lifetime; to not have seen one is rare. Go buy yourself a lotto ticket :)
If your servers get drowned with water during a fire, your fs is probably the least of your worries. You don't really plan to re-enable servers with water or fire damage, do you? That's probably why there shouldn't be a fireman pouring water in the first place.
This shows you don't understand structural engineering. The fire does not have to be on your floor; it can be as far away as two or so levels above. With the high-pressure water used - equating to a shitload of water - there are ducts, shafts, other risers and so on through which a shit-ton of water can easily penetrate the fire blocks of the floors below. Don't take my word for it, go ask a fireman, or maybe watch the nightly news sometime (building fire - many levels water affected, blah blah blah)... So keeping those boxes on via UPSes is asking for lots of charcoaled boards and fried drives. IOW, total stupidity.
Should those machines be depowered as required by our building codes, well, it might take a few days of drying out, but at least they will power back up without error - yes, done it in risk assessments.
-- If you have the urge to reply to all rather than reply to list, you best first read http://members.ausics.net/qwerty/
On Sat, 19 Mar 2016 17:37:04 +1000 Noel Butler noel.butler@ausics.net wrote:
Consider yourself lucky. Most network admins who've been around large busy ISP DC's have seen this in their lifetime; to not have seen one is rare. Go buy yourself a lotto ticket :)
This shows you don't understand structural engineering. The fire does not have to be on your floor; it can be as far away as two or so levels above. With the high-pressure water used - equating to a shitload of water - there are ducts, shafts, other risers and so on through which a shit-ton of water can easily penetrate the fire blocks of the floors below. Don't take my word for it, go ask a fireman, or maybe watch the nightly news sometime (building fire - many levels water affected, blah blah blah)... So keeping those boxes on via UPSes is asking for lots of charcoaled boards and fried drives. IOW, total stupidity.
Should those machines be depowered as required by our building codes, well, it might take a few days of drying out, but at least they will power back up without error - yes, done it in risk assessments.
Obviously you must work for people that have not the slightest idea about using hardware in a correct way and don't know when the time has come to throw it away. Man, there is no way to let a drowned box survive. It is not back to normal when it is dry. If you don't get that, I am pretty happy not to be a customer. This can only be an idea born in the sick mind of a controller who didn't want to pay for insurance in the first place. We are talking about serious corrosion effects here, let alone that you have a hard time even knowing when your boxes are really dry. Your firemen, on the other hand, seem to be stuck in the '80s. Today there are solar panels almost everywhere _which you cannot turn off_. Sure, you have a switch somewhere, but it does not help you for the stretch between the switch and the roof (which can be a pretty long distance). Really, sorry, I don't want to listen to more horror stories from you about operating drowned equipment. And in the end: considering your "large busy ISP DC's", they should have backup DCs located elsewhere with mirrored data, right? Let's please end this once and for all.
-- Regards, Stephan
On 19/03/2016 08:11, Stephan von Krawczynski wrote:
Obviously you must work for people that have not the slightest idea about using hardware ...
So you have UPSes, power supplies and motherboards that never fail. Good luck to you, you are running on it.
For everyone else reading this, do not set sync off. If sync writes are taking 3 seconds, or more than a few milliseconds, there is something else that needs fixing.
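One way to check whether sync writes on a given mount cost milliseconds or seconds is simply to time them. The following is a minimal sketch, not a benchmark: the function name, the write count and the 4 KiB payload are all assumptions chosen for illustration, and the target directory in the usage part is a placeholder that would need to point at the NFS/ZFS mount being tested.

```python
import os
import statistics
import tempfile
import time


def measure_fsync_latency(directory, writes=50, size=4096):
    """Return per-write latencies in milliseconds for write + fsync."""
    payload = b"x" * size
    latencies = []
    # Hypothetical scratch file, created inside the directory under test.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push the data to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Placeholder target: replace with a directory on the mount to test.
    target = tempfile.gettempdir()
    samples = measure_fsync_latency(target)
    print(f"median {statistics.median(samples):.2f} ms, "
          f"max {max(samples):.2f} ms over {len(samples)} writes")
```

If the median lands in the range of seconds rather than milliseconds, that would support the point above that something other than the sync setting needs attention.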
It seems it's troll time again on this list, ohh, maybe it's Harry in disguise... So I will play along, for today anyway :)
On 19/03/2016 18:11, Stephan von Krawczynski wrote:
Obviously you must work for people that have not the slightest idea about using hardware in a correct way and don't know when the time has come to throw
it away. Man, there is no way to let a drowned box survive. It is not back to
Wow, how long did you allege to have been in network/sys admin? 20 years? Really? I think you made a typo and it should have read 20 minutes. Ya know, I have refrained from posting on here for a long time (apart from the fact I rarely read the list), and I was not going to feed the trolls, but sometimes the smart-mouthed know-nothings need a bitch slap upside the head, so that's why I am devoting about 60 seconds to you.
Of course there is. Networks don't throw away many hundreds of servers valued at $7K to $10K, nor $100K+ storage systems, or $40K routers, LBs or switches, just because they got drenched - with power isolated.
normal when it is dry. If you don't get that I am pretty happy to be no customer. This can only be an idea born in the sick mind of a controller who
You will never be a customer _or_employee_ of mine, trust me on that one!
didn't want to pay insurance in the first place. We are talking about serious
It's got nothing to do with insurance. It might take 2 days to dry out and get back up and running; it will take an awful lot longer to get offsite backups and restore every last one of them.
I hope your employer reads this list, because he/she should be hearing alarm bells from your comments.
corrosion effects here let alone that you have a hard time even knowing when
yep, you sure did fail basic engineering
your boxes are really dry. Your firemen on the other hand seem to be stuck in the '80s. Today there are solar panels almost everywhere _which you cannot turn off_.
Wow, you really are clutching at fantasy straws, aren't you? Perhaps your country lacks modernisation. I can go to the side of my house and isolate the panels with a flick of a switch, strangely enough (and, I guess, horrifyingly in your eyes) called a "solar isolator", which stops the panels providing power to my electrical circuits. Yes, there might be power from the panels to it, but that's not going to affect my power circuits or equipment.
-- If you have the urge to reply to all rather than reply to list, you best first read http://members.ausics.net/qwerty/
On 11/03/2016 14:58, Juan Bernhard wrote:
Does anyone have experience with ZFS and NFS (v3) in high-load environments?
Thanks
Be careful not to do any synchronous writes under ZFS.
By default all NFS writes are synchronous, but I assume Dovecot sync-writes all data anyway, so in this case the NFS sync doesn't matter.
Every sync write can incur up to 3 seconds of latency (under FreeBSD; I didn't test ZFS on Linux).
sync writes should take a few ms (they do for me). If you have enough load for them to be a problem you should have enough revenue to afford an SSD as a ZFS write cache / SLOG and then they will no longer be a problem.
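The SLOG suggestion above can be sketched as a short sequence of `zfs` and `zpool` commands. The helper below only builds and prints them; it does not run anything. The pool, dataset and device names are hypothetical placeholders, and the function name is an assumption made for this illustration, so treat it as a checklist rather than a tested procedure for any particular appliance.

```python
import shlex


def slog_commands(pool, dataset, log_device):
    """Build (but do not run) commands for checking sync and adding a SLOG."""
    return [
        ["zfs", "get", "sync", dataset],            # show the current sync setting
        ["zfs", "set", "sync=standard", dataset],   # keep honouring sync requests
        ["zpool", "add", pool, "log", log_device],  # add a separate log device
        ["zpool", "status", pool],                  # confirm the log vdev is listed
    ]


if __name__ == "__main__":
    # Hypothetical names: substitute your own pool, dataset and SSD device.
    for cmd in slog_commands("tank", "tank/mail", "/dev/ada2"):
        print(shlex.join(cmd))
```

Leaving `sync=standard` in place and moving the intent log to an SSD keeps the durability that `sync=disabled` would give up, which is the trade-off debated earlier in the thread.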
participants (6)
- Alessio Cecchi
- Charles Marcus
- James
- Juan Bernhard
- Noel Butler
- Stephan von Krawczynski