Virtualised dual head setup under VMware ESX5i.

Added by Ashley Watson over 2 years ago

Hi, We were brainstorming some ideas this morning on Nexenta and have an idea.

Because ESX5i in standalone form is free, is it technically possible to create a dual head architecture using ESX5i hardware passthrough and 2 virtual instances of NexentaStor?

i.e. one ESX5i host at commodity specs (e.g. 96GB RAM + 2x 6-core CPUs + 24 drives in a SuperMicro chassis), hosting:

  • 1 VM running NexentaStor (with 32GB RAM, 4 vCPUs)

  • 1 VM running NexentaStor (with 32GB RAM, 4 vCPUs)

All 24 drives passed through to each of the VMs using VMware hardware pass through.

From what we've seen, storage drop-outs etc. are very rarely caused at the hardware layer; they typically originate at the Nexenta layer (for whatever reason), so we could get all the benefits of a dual head configuration without the costs (apart from the Nexenta licensing).

Our main motivation for considering something like this is to avoid downtime during the inevitable patching of Nexenta and to increase reliability (and potentially throughput at the same time).

Has anyone created a virtualised dual head configuration under ESX5i like this or is it a complete non-starter?


Replies

RE: Virtualised dual head setup under VMware ESX5i. - Added by David Bond over 2 years ago

Not sure how it will perform, but the setup described will not be possible with the free version of ESXi 5. VMware has limited the amount of RAM accessible in the free ESXi 5 to 32GB, so you would not be able to use 64GB of the RAM in the server, and you would have to be 2x overcommitted with 32GB per VM. To do this you would have to run ESXi 4, or buy the Enterprise Plus edition of ESXi 5, or an Enterprise and a Standard licence to get to the 96GB of RAM you have.
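To make the arithmetic concrete, here is a minimal sketch using the figures from this thread (a 96GB host, two 32GB VMs) and the 32GB cap described above. The function name and its structure are purely illustrative; nothing here is a VMware API.

```python
# Illustrative only: the figures come from this thread, the function is hypothetical.
def ram_plan(host_gb, license_cap_gb, vm_allocations_gb):
    """Return (usable_gb, stranded_gb, overcommit_ratio) under a licence RAM cap."""
    usable = min(host_gb, license_cap_gb)   # RAM the hypervisor is allowed to use
    stranded = host_gb - usable             # installed RAM left idle by the cap
    requested = sum(vm_allocations_gb)      # total RAM configured across all VMs
    return usable, stranded, requested / usable

usable, stranded, ratio = ram_plan(host_gb=96, license_cap_gb=32,
                                   vm_allocations_gb=[32, 32])
print(f"usable={usable}GB stranded={stranded}GB overcommit={ratio:.1f}x")
```

With these inputs it reports 32GB usable, 64GB stranded and a 2.0x overcommit, which matches the figures in the post.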

RE: Virtualised dual head setup under VMware ESX5i. - Added by Roman Strashkin over 2 years ago

As far as I know, we have some problems with device pass-through. Some months ago I tried pass-through on ESX4.1, but without success: Nexenta did not recognize the disk controller.

Thanks.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Ashley Watson over 2 years ago

Thanks guys, great comments - I wasn't aware of the RAM limitations on the free ESX5i as we use Enterprise Plus VMware licenses for our development farms.

As a matter of interest we took an old SuperMicro host with 2x3ware 9650SE controllers in and were able to install ESX5i and pass through the controllers so we were able to install NexentaStor and get it running on pass through hardware. When we tried to bring up a second Nexenta instance on the same box, we saw a load of "drive failure" style messages - which was expected I guess.

So back to our original requirement: we want to be able to patch/reboot a Nexenta instance, and stay up if an instance crashes, without losing storage connectivity to our farm.

Currently the only way (if we didn't want to replicate the storage via the SimpleHA option) seems to be a dual headed unit and shared disk - in the style of a SuperMicro Storage Bridge Bay solution or separate heads connected to shared disks. Or we need to run virtual storage appliances on top of the disk layer and use HA/replication technologies at the VSA layer.

Is there no other way using for example Solaris containers or zones so a single disk chassis could deliver a Nexenta HA style solution (with preferential Nexenta licensing as there wouldn't be full hardware redundancy)?

The big issue we have is that when we start pricing up a traditional dual head unit (including licensing) we quickly exceed the costs of commercial dual controller solutions from HP/IBM/NetApp/EMC (at least at the lower end).

Are there any solutions to make a "dual controller" style Nexenta box cost effective against the equivalent of say an HP P2000 G3?

RE: Virtualised dual head setup under VMware ESX5i. - Added by Ashley Watson over 2 years ago

bump (sorry). Any response would be appreciated.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Linda Kateley over 2 years ago

Ashley,

I think the answer really is how much downtime can you have and how dynamic is the data? You can use tools like autotier to create a replica system to take the load while patching. You will then have to manually manage changes to data. Dual heads with JBODs is the best practice, but there are also good practices and modest practices. The difference in all of them is how much data can you lose at failure? And how fast does it need to be? With dual heads you get the benefit of having ZIL devices available to both heads. You can always repopulate read cache, but write cache is.. well.. important.

I apologize that I am not that familiar with the different hardware offerings from the different vendors, but everyone has the same set of problems. How do you replicate and solve for data in flight? Is it done on the clients? If so, then mirroring between 2 low-cost configs might be a workable solution. Can the clients afford minimal performance? Everything in architecture is a tradeoff and you have to determine what is most important. SSDs create great perf, but what if one fails? Often the decision has to be made as to what is most important: performance, recovery, DR or budget. Typically all can be satisfied, but you can't make all of the results optimal. I can't always have cheap and fast and redundant.

RE: Virtualised dual head setup under VMware ESX5i. - Added by FREDY . over 2 years ago

Hi Ashley,

I am not sure if you have already set up a Nexenta HA scenario, but I can tell you, having spent enough time on it (most of it troubleshooting), that it doesn't work as well as in theory.

It would be lovely if we could just fail over to apply the upgrades without much disruption to the system, but in practice the failover time in most cases will be so high that virtual machines will eventually time out. Don't tell me to change the SCSI timeouts to something like 2 or 3 minutes, because that's the same as a server being down. Monitoring systems will detect that the virtual machines are frozen and send alerts anyway. VMware Tools already changes it, if I am not wrong, and it's not practical to change it again manually for every VM.

Anything over 45 seconds is unacceptable as a failover time for an Enterprise solution, regardless of how much storage there is.

Yes, there are many, many upgrades released to be applied, but the headaches a failover can bring you far outweigh those of not having your system updated. I just chose this path: I won't update my systems until I have to fail over anyway to do something like hardware maintenance or upgrades.

This is just one of the problems of an HA solution, but there are others, besides the amount of time you have to spend on the command line for a solution you would expect to work out of the box.

Unless you really need HA for a solution where you want to minimize downtime (you can't really expect to have no downtime at all), go with a single node. It will save you a lot of the headaches of the HA plugin. Design your single node with enough hardware that you will not need to reboot it any time soon, such as enough memory and a SCSI controller with an external port in order to plug in more JBODs for upgrades.

I have single nodes and a dual head node and can tell you that nothing took so much of my time as the HA plugin, sometimes due to bugs that were acknowledged by Nexenta and, of course, left behind by QA.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Ashley Watson over 2 years ago

thanks guys for your comments. Some of these things are starting to alarm me.

On a workgroup level iSCSI SAN like an MSA 2312 G2/P2000 G3 (or the FC equivalent or similar rebadged DotHill units) with dual controllers (we currently run our development loads on several of these units), a controller can crash and our storage stays up to our VMware hosts. Configuration is trivial and performance is adequate. We can update controllers one at a time and the storage doesn't drop to our ESXi hosts. When a controller fails, failover to the other controller is virtually instant. We buy the SANs with 1GbE, 10GbE or 4/8Gb FC out of the box and don't pay any extra for FC licenses etc.

Am I right in thinking there is no equivalent configuration from Nexenta that works at least as well as that at a similar price point?

We love NexentaStor due to the flexibility it provides. It makes a fantastic backup target and outperforms our MSAs by a considerable margin, but at the end of the day any application requiring primary storage for VMware hosts needs to have good availability and to be able to withstand basic failure through redundancy.

Otherwise, I guess we need to present the raw storage via NexentaStor (or similar) and then run an additional layer of virtual storage appliance that is able to handle the HA in a simpler manner (e.g. LeftHand VSA, StorMagic SvSAN, SANsymphony-V, NetApp V etc.)?

I'm just looking for possible solutions here.

RE: Virtualised dual head setup under VMware ESX5i. - Added by FREDY . over 2 years ago

Hi Ashley,

You are right in your assumption. What I found is that the time you spend on an HA configuration costs the same as the extra you would pay for a P2000 G3 that doesn't have all this hassle, or FC licenses. And sometimes time and deadlines can be fairly expensive. So again, depending on what you will use it for, Nexenta can have its great advantages, like the performance and hybrid storage. But as a rule I would advise you to stay away from the HA plugin if you can. Sometimes what you are trying to save is only a false saving if you consider time and/or reliability/stability.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Steve Van over 2 years ago

Fredy,

Other than the lag time failing over from one node to another (I agree with you, any lag that is enough to cause a timeout for the VMs on a datastore is unacceptable), what other actions do you have to take regularly that require the command line for HA?

We are in the process of purchasing a unit which includes HA and your comments bring up some very important issues.

Any thoughts would be appreciated.

~Steve

RE: Virtualised dual head setup under VMware ESX5i. - Added by David Bond over 2 years ago

We have the HA cluster plugin, and we have had one problem with it, when upgrading from 3.0.5 to 3.1.1. It lost all our iSCSI mappings: when we updated one head, it wiped our mappings, which it then replicated to the other head, wiping them there too. The plugin has otherwise worked well. Failover could be quicker, but there are no timeouts on ESXi or Linux if you set the initiator settings correctly. Once set up, nothing has needed to be done.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Steve Van over 2 years ago

Thanks for the response David.

Did you have to change anything in relation to timeouts on the initiator?

About how long does failover take for you if you can recall?

Thanks!

~Steve

RE: Virtualised dual head setup under VMware ESX5i. - Added by David Bond over 2 years ago

Hi,

With ESXi we didn't have to do anything, but with Linux we did: we had to change the retry rate and retry timeouts (Nexenta have documentation on this).

The failover speed depends upon the number of drives in the pool and the number of file systems in the pool. We have 2 pools, split between 128 x (1TB) discs plus 4 x 8GB ZeusRAMs and 12 x C300 (256GB) L2ARC, and at the moment there are around 12 file systems on the pool. It takes around 2 minutes to import the pool on the other head if one goes down. The L2ARC is in the head, so on failover the pool is degraded, but I don't think that affects the speed, as we did some tests recently without them and it took around the same amount of time.

The time for failover is long (longer than the documentation's 20-30 seconds), but when the timeouts are set on the connecting servers it doesn't cause any problems apart from reads/writes hanging until it comes back.
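As a rough way to reason about this, the sketch below compares a failover time against a guest's disk timeout. The 120 second figure is the approximate pool import time quoted above; the 30 second Linux default and the 180 seconds commonly applied by VMware Tools are assumptions about typical guest settings, not values from this thread. Real initiators also retry, so treat this as a simplification rather than a precise model.

```python
# Simplified model: a guest rides out a failover only if its disk timeout
# exceeds the time the storage is unavailable. Timeout values are assumptions.
FAILOVER_S = 120  # approximate pool import time reported in this thread

GUEST_TIMEOUTS_S = {
    "linux-default": 30,   # assumed stock SCSI disk timeout on a Linux guest
    "vmware-tools": 180,   # assumed value applied by VMware Tools on Linux guests
}

def survives_failover(timeout_s, failover_s=FAILOVER_S):
    """True if I/O would merely hang; False if the guest would likely see I/O errors."""
    return timeout_s > failover_s

for name, timeout in GUEST_TIMEOUTS_S.items():
    outcome = "I/O hangs, then resumes" if survives_failover(timeout) else "I/O errors likely"
    print(f"{name}: timeout {timeout}s vs failover {FAILOVER_S}s -> {outcome}")
```

Under this model, a guest only rides out a 2 minute import if its timeout has been raised above that figure, which is consistent with the need to tune the Linux initiators described above.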

RE: Virtualised dual head setup under VMware ESX5i. - Added by Steve Van over 2 years ago

Great, thanks for the info David.

Ashley, sorry to hijack your thread. ;o)

~Steve

RE: Virtualised dual head setup under VMware ESX5i. - Added by FREDY . over 2 years ago

Hi, the failover time is what affects us the most. When you have hundreds of systems running on top of the storage it becomes even more concerning. I strongly believe no storage, regardless of the conditions, should take more than 45 seconds to fail over (60 in very bad conditions). It's not practical to change each server's SCSI timeout individually, and even if you do that the monitoring systems will detect that the system is frozen and will send alerts, or the systems will become unresponsive as if they were rebooted. Far from Enterprise usage.

Steven, to answer your question: we also had problems with mappings disappearing on previous versions. They apparently seem to have been fixed now, but it was quite disappointing that this wasn't tested properly and caught by QA. There was also some other minor stuff in the web interface regarding HA setup not showing up correctly.

I reiterate: avoid the HA plugin if you can, otherwise plan well ahead and make sure you have enough time to look after it. As I said, depending on the usage it can work well; otherwise the extra complexity will only be false savings.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Chris Young over 2 years ago

I can't speak for the other vendors like lefthand, emc, etc.

But, having worked on NetApp SANs -- their failover/HA setup is quite impressive in not needing 2x the storage with replication between the copies. There's also no fussing around with making sure that you have the same setup for specific machines/VMs/etc. for anything like iSCSI, NFS, LUNs, etc. On a moderately loaded system, doing an NDU or reboot on one head is a great feature.

Until Open-E/Nexenta come out with a solution that provides similar capabilities -- a dual head solution on these platforms is going to run you roughly the same amount as a NetApp solution with way less dependability.

The Supermicro SBB system has been something I've been looking at. But, the dual motherboard setup looks more to be for hardware protection and not running two instances of the OS/etc.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Linda Kateley over 2 years ago

We can typically tune our clusters to optimal failover times with the use of our professional services. Because every install has different hardware and clients, we typically need to do some tuning.

RE: Virtualised dual head setup under VMware ESX5i. - Added by FREDY . over 2 years ago

Linda,

I disagree with this approach. On any "Enterprise" storage system the failover should hardly have to be touched; it should be something standard and self-adjusting to any supported hardware size.

If this doesn't happen with Nexenta because they didn't spend enough time researching what the ideal settings are, they cannot make customers pay extra money to get something that you don't even need to think about on any other storage system. I personally have never heard of anyone having to touch the failover configuration on any of the best-known storage systems.

As I pointed out in one of my previous posts, regardless of the hardware configuration or size, a failover time of more than 30 - 45 seconds is not acceptable for any Enterprise grade storage. A little more than that starts to cause Linux machines to remount filesystems as read-only if using the default settings; besides, the guest OS will hang, causing systems to be unresponsive for too long and alerts to go off.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Linda Kateley over 2 years ago

We should come out with some standard configs in the next few months. We are enterprise storage, but we depend on all kinds of different hardware. When you look at the other enterprise systems, they sell you the disks at a premium; we let you get the disks from the lowest-priced vendor for a lower total cost of ownership.

In the high end enterprise configs we create, we try to help customers configure the hardware to their workload and HA demands. We aren't going to quote numbers for failover, because all hardware configs are different.. you might have a single 5400 rpm drive and a 100 Mb connection :)

RE: Virtualised dual head setup under VMware ESX5i. - Added by Ashley Watson over 2 years ago

my 2c worth here...

  • Entry level commercial SANs (HP/NetApp/IBM/Dell/DotHill etc.) all come in dual controller flavours to provide HA at an entry level price. If multipathing is set up correctly at the vSphere level (trivial), then a single controller can die and all the LUNs stay up and accessible without any interruption/stall. We have had a controller failure on an HP MSA SAN due to electrical failure, and we've never crashed and have been able to refresh the firmware on each controller without taking down the SAN. The ability to have instant automatic failover when a controller fails is a basic requirement of any SAN providing iSCSI storage for vSphere (even for our development loads). In its current form, there appears to be no out-of-the-box equivalent of this under the Nexenta platform due to its open architecture/flexibility.

  • I fully agree that until Nexenta can find a solution to be able to run 2 copies of the Nexenta software on the same piece of metal to mitigate the risks of a software crash/hang of an instance and to be able to patch an individual instance without bringing down the storage, Nexenta reliability is always going to be questioned. It's the potential application layer failure that worries us more than the hardware failure to be honest because every decent piece of server grade hardware (eg SuperMicro) has enough redundancy in it to stay up for years.

  • I still have real issues with Nexenta not being fully certified for use with vSphere 5 and VAAI. The only link that mentions the Nexenta VSA is this one and this is not for vSphere 5; http://partnerweb.vmware.com/comp_guide2/detail.php?deviceCategory=san&productid=20052&vcl=true

Things have got better over the last 6 months, but Nexenta is not ready to take a kick at the traditional enterprise storage vendors until these very basic issues can be resolved IMHO.

RE: Virtualised dual head setup under VMware ESX5i. - Added by FREDY . over 2 years ago

Couldn't agree more Ashley, and that's why I have been stressing about this subject. As you said, it is a basic requirement.

To be fair, even SANs like HP/NetApp run two 'systems', one on each controller like two head nodes, but the failover times are much lower, as you described, and that is independent of the size of the hardware and doesn't require fine tuning. Upgrading a Nexenta node, even in an HA setup, hasn't been as trivial and safe for me as upgrading the firmware on one of these other solutions.

With regard to the certification, I don't think Nexenta will ever be certified, due to its nature, but I do believe there should be some changes in the way VMware certifies solutions, and that should be pushed mainly by Nexenta Systems as the most interested party. And in the meantime they should make sure their customers are not refused support from VMware depending on the situation. VSA is a different solution and only for small businesses, I would say.

RE: Virtualised dual head setup under VMware ESX5i. - Added by Edmund White about 1 year ago

Has there been any movement on this?

I've had a tough time pushing NexentaStor to certain clients simply because of the availability of entry-level SANs like the HP P2000 G3. At an equal price-point, I can build an HP ProLiant-based Nexenta system that will greatly outperform an HP P2000 G3... But the P2000 G3 still has the dual-controller configuration. Achieving the same on the Nexenta solution more than doubles the cost; considering licensing, the need for another server, and a more complex SAS cabling/JBOD setup.

RE: Virtualised dual head setup under VMware ESX5i. - Added by James Hess about 1 year ago

Edmund White wrote:

I've had a tough time pushing NexentaStor to certain clients simply because of the availability of entry-level SANs like the HP P2000 G3. At an equal price-point, I can build an HP ProLiant-based Nexenta system that will greatly outperform an HP P2000 G3... But the P2000 G3 still has the dual-controller configuration. Achieving the same on the Nexenta solution more than doubles the cost; considering licensing, the need for another server, and a more complex SAS cabling/JBOD setup.

I agree with this. While there are plenty of situations where a case can be made for NexentaStor, for mid-sized enterprise primary storage where HA is required but the capacity requirement is 14TB or less, NexentaStor just isn't economical, unless you are comparing it to the most expensive arrays on the market.

You can easily spend $15k on Nexenta software and still have to buy $20k worth of extra hardware to have a certified, reliable, supported solution, while array vendors have solid hardware and software bundles on the market for $20k.

Telling the client "But it's open"; "based on open source technology with just a few proprietary bits", doesn't suddenly convince the client that the product is worth 40% more.
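For a sense of scale, here is a small sketch of the premium implied by the round figures above, reading them as $15k of software plus $20k of hardware against a $20k array bundle. That reading is an assumption, and a different baseline would give a different percentage.

```python
# Round figures from the post; how they combine is an assumption.
def premium_pct(solution_cost, baseline_cost):
    """Percentage by which solution_cost exceeds baseline_cost."""
    return (solution_cost - baseline_cost) / baseline_cost * 100

nexenta_total = 15_000 + 20_000   # software + extra hardware
array_bundle = 20_000             # entry-level dual-controller array bundle
print(f"Nexenta build: ${nexenta_total:,}, "
      f"premium: {premium_pct(nexenta_total, array_bundle):.0f}%")
```

On that reading the build costs $35,000, a premium of 75% over the array bundle.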
