ZIL/SLOG device sharing

Added by Matt Breitbach 10 months ago

I know this has been discussed in different circles before, but I thought I'd throw it out here again. I would love to see the ZIL/SLOG be able to be shared by different pools.

When I create a pool, a RAID group, or anything else on an EMC array or other enterprise-level array, the write cache (which, by the way, is coherent from head to head) is shared among all devices. I don't have to have a separate write cache for each RAID device or pool. In Nexenta, if I want to be safe, I have to mirror my SLOG devices and have a mirror of devices per zpool. When I'm using STEC ZeusRAM devices, this costs a lot of money and burns up valuable slots in my JBODs. I currently have 6x STEC ZeusRAM disks in my system, tying up 15k in capital; if I could have shared those devices across the three pools I have, I would only have had to buy 2 devices.

Obviously 4GB and 8GB devices are small. Write cache on most enterprise storage arrays isn't terribly big either. EMC arrays typically have somewhere between 4 and 20GB of write cache, depending on the size of the array (20GB is a 500-spindle array). With a 24-drive NL-SAS pool, a 24-drive 600GB 15k SAS pool, and a 24-drive 900GB 10k SAS pool, I'm forced into using 6x STEC drives. I'd like to see this mitigated in future releases. I know that I could slice each drive into 8 parts, mirror those slices, and use them as the ZIL, but that's not supported out of the box in Nexenta, and I'd have to build everything from the command line, negating the benefit of the admin interface. Not to mention the nightmare of rebuilding and re-mirroring each and every failed slice if a drive dies.
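
For anyone curious what the slice-and-mirror workaround would look like from the command line, here is a rough sketch in Python that only builds the `zpool add ... log mirror ...` commands and prints them; it does not run anything. The pool names, device names, and the `sN` slice suffix are all made up for illustration, and it assumes one slice per pool rather than the 8 slices mentioned above. The real slice layout depends on how the disks are labeled, so treat the numbering as a placeholder.

```python
# Sketch only: builds (does not run) the commands for sharing two log
# devices across several pools by mirroring matching slices.
# Pool names, device names, and the sN slice suffix are hypothetical.

def shared_slog_commands(pools, dev_a, dev_b):
    """Return one 'zpool add ... log mirror' command per pool.

    Pool i gets slice i of each device, so every pool's log is a
    two-way mirror spanning both physical devices.
    """
    commands = []
    for i, pool in enumerate(pools):
        slice_a = f"{dev_a}s{i}"
        slice_b = f"{dev_b}s{i}"
        commands.append(f"zpool add {pool} log mirror {slice_a} {slice_b}")
    return commands


if __name__ == "__main__":
    # Three pools sharing one pair of devices (all names are placeholders).
    for cmd in shared_slog_commands(
        ["nlsas", "sas15k", "sas10k"], "c1t0d0", "c1t1d0"
    ):
        print(cmd)
```

This also shows the pain point: losing one physical device degrades the log mirror of every pool at once, and each slice has to be re-attached separately.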


Replies

RE: ZIL/SLOG device sharing - Added by N A 10 months ago

Simplified sharing across pools would be interesting, though you have the catch that it's a shared pain point. All pools sharing a particular ZIL set would nominally need to migrate as a set, or else the GUI gets kinda hairy.

If that OCZ AeonDrive ever pans out, that's all the more justification for a shared ZIL, since that will be 64GB. You are looking at what, maybe 6 ZIL sets on a single shared ZIL device for, say, a 10Gbit environment using an AeonDrive? So you are looking at only two slots consumed to support a lot of pools, if mirroring (though that single-threaded ZIL write may be a gotcha). Considering OCZ's checkered history, though, I can see why people would be hesitant to use them for this application. In that kind of high-IOPS situation, with multiple pools in parallel, OCZ should just team up with the DDRdrive guys for direct PCIe access. That comes at the loss of shared SAS capability, which could potentially be countered with an external PCIe x16 cable linking cards on different heads; that might enable a mirror capability as well, since the cards would share a private PCIe bus.

RE: ZIL/SLOG device sharing - Added by FREDY . 10 months ago

I think I've seen this discussed here before, with no sign that it would be added as a feature. I guess it's not something major for them. Plus, you are comparing enterprise-level arrays with Nexenta, which still has a good way to go in several respects before it gets to the enterprise level of most storage vendors.

RE: ZIL/SLOG device sharing - Added by Anthony Glidic 5 months ago

Hello, I have some questions here. If you want to put a ZIL everywhere, it's probably because you use NFS on each pool. So, first question: why don't you just put the ZIL on the 15k pool and then use auto-tiering to move your old data to the slow pool?

Another point: for me, the main goal of the hybrid pool is to make it possible to use only NL-SAS drives. The ZIL is there to avoid latency and sustain the IOPS peaks; the L2ARC is there to give you no latency and a lot of read IOPS. In fact, if you make a good calculation of your real workload, you can get more than 90% cache hits on both writes and reads, so the 15k drives just become useless.

Have you ever considered building one pool with 2 ZeusRAM as ZIL, 60 NL-SAS disks (RAID 10), and 4 ZeusIOPS as L2ARC?

I know IBM, EMC, NetApp and all the rest always say to use RAID 5 or RAID 6 of 15k drives, but that's the old way. In IOPS terms you will get much more performance with RAID 10 on NL-SAS drives, and more capacity for the same price, compared to RAID 5/6 of 15k disks (if the pool has more than 20 disks). So the only good point for 15k disks is latency, but now you have the ZIL and L2ARC to do large-scale caching, so you don't care about that parameter anymore. And by the way, you said EMC has between 4 and 20GB of write cache; that's just wrong. The 20GB of RAM per blade is not only write cache but write and read cache. And if you enable functionality like auto-tiering, replication and so on, each one takes part of that cache.

Actually, you can't compare a ZFS system to a classical system. Most of the time on a ZFS system you will have more than 100GB of RAM, and you add a ZIL for synchronous writes (meaning only NFS); it's just a way to secure them. If I take a VNX, you don't need a ZIL because the file part is built on top of the SAN part, so all writes go to the cache of the dual controller. You don't have something like a dual controller on your ZFS system, so you use the ZIL to replace that. And you say 8GB is little, but it caches only writes, and the transaction groups are flushed every 10 seconds MAX, so with 8GB you can take at least 800 MB/s, which is 200,000 4k IOPS (100% random write), with only 2 ZeusRAM. It's just huge.
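
The arithmetic behind those numbers can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10**9 bytes, 4k = 4000 bytes) and that the log device only ever has to hold one flush interval's worth of writes. It is a capacity bound, not a measurement of what a ZeusRAM actually delivers. The function name is made up for the example.

```python
# Back-of-the-envelope check of the figures in the post.
# Assumes decimal units and that the log device only needs to hold the
# writes that arrive during one transaction-group flush interval.

def slog_throughput(slog_bytes, flush_interval_s, io_size_bytes):
    """Return (bytes per second, I/Os per second) the log can absorb."""
    bytes_per_s = slog_bytes / flush_interval_s
    iops = bytes_per_s / io_size_bytes
    return bytes_per_s, iops


if __name__ == "__main__":
    # 8 GB log device, 10 s flush interval, 4k writes.
    bps, iops = slog_throughput(8 * 10**9, 10, 4000)
    print(f"{bps / 10**6:.0f} MB/s, {iops:.0f} IOPS")
    # prints "800 MB/s, 200000 IOPS"
```

With binary units (8 GiB and 4 KiB writes) the same calculation gives about 209,715 IOPS, so the round figures hold either way.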