Archive

Posts Tagged ‘SAN’

Under the Covers of a Distributed Virtual Computing Platform – Built For Scale and Agility – via @dlink7, #Nutanix

November 21, 2013

I must say that Dwayne did a great job with this blog post series!! It goes into explaining the Nutanix Distributed File System (NDFS), which I must say is the most amazing enterprise product out there if you need a truly scalable and agile Compute and Storage platform! I advise you to read this series!!

Under the Covers of a Distributed Virtual Computing Platform – Part 1: Built For Scale and Agility

Lots of talk in the industry about who had software-defined storage first and who was using what components. I don’t want to go down that rat hole since it’s all marketing and it won’t help you at the end of the day to enable your business. I want to really get into the nitty gritty of the Nutanix Distributed File System (NDFS). NDFS has been in production for over a year and a half with good success; take a read of the article in the Wall Street Journal.

Below are the core services and components that make NDFS tick. There are actually over 13 services; for example, our replication is distributed across all the nodes to provide speed and low impact on the system. The replication service is called Cerebro, which we will get to in this series.
Nutanix Distributed File System

 

This isn’t some home-grown science experiment; the engineers that wrote the code come from Google, Facebook, and Yahoo, where these components were invented. It’s important to realize that all components are replaceable, or future-proofed if you will. The services/libraries provide the APIs, so as the newest innovations happen in the community, Nutanix is positioned to take advantage.

All the services mentioned above run on multiple nodes in the cluster in a master-less fashion to provide availability. The nodes talk over 10 GbE and are able to scale in a linear fashion; there is no performance degradation as you add nodes. Other vendors have to use InfiniBand because they don’t share the metadata across all of the nodes. Those vendors end up putting a full copy of the metadata on each node, which eventually causes them to hit a performance cliff, and the scaling stops. Each Nutanix node acts as a storage controller, allowing you to do things like have a datastore of 10,000 VMs without any performance impact… continue reading part 1 here
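To make the metadata point a bit more concrete, here is a tiny Python sketch of my own (not Nutanix code, and the node and key names are made up) of how a consistent-hash ring keeps each node’s share of the metadata roughly constant as nodes are added, instead of every node carrying a full copy:

```python
import hashlib
from bisect import bisect_right

class MetadataRing:
    """Toy consistent-hash ring: every key is owned by a small, fixed number
    of nodes, so the metadata each node holds stays roughly constant as the
    cluster grows (instead of every node holding a full copy)."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def owners(self, key):
        # Walk clockwise from the key's position and take the next N nodes.
        positions = [h for h, _ in self.ring]
        start = bisect_right(positions, self._hash(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(min(self.replicas, len(self.ring)))]

ring = MetadataRing([f"node-{i}" for i in range(4)])
print(ring.owners("vm-1234/vdisk-1"))  # e.g. ['node-2', 'node-3']
```

Adding a fifth node only moves a small slice of keys to the newcomer, which is the property that lets you keep scaling without a per-node metadata blow-up.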

Under the Covers of a Distributed Virtual Computing Platform – Part 2: ZZ Top

In case you missed Part 1 – Part 1: Built For Scale and Agility
No, it’s not Billy Gibbons, Dusty Hill, or drummer Frank Beard. It’s Zeus and Zookeeper providing the strong blues that allow the Nutanix Distributed File System to maintain its configuration across the entire cluster. Read more…
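If you haven’t touched ZooKeeper before, here is a rough, hypothetical sketch (my own illustration, not Nutanix’s Zeus) of how cluster configuration can be stored and watched in Apache ZooKeeper using the Python kazoo client; the ensemble hosts and znode paths are made up:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

# Keep a piece of cluster-wide configuration at a well-known path.
zk.ensure_path("/cluster/config")
if not zk.exists("/cluster/config/replication_factor"):
    zk.create("/cluster/config/replication_factor", b"2")

# Every node can register a watch and react when the value changes.
@zk.DataWatch("/cluster/config/replication_factor")
def on_change(data, stat):
    print("replication factor:", data.decode(), "version:", stat.version)

# In a real service the session stays open so the watch keeps firing.
```

The point is simply that every node reads the same, consistently replicated configuration; no single node is “the” config master.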

Single File Restore – Fairy Tale Ending Going Down History Lane – via @Nutanix and @dlink7

November 21, 2013

Great blog post by Dwayne Lessner!

If I go back to my earliest sysadmin days, where I had to restore a file from a network share, I was happy just to get the file back. Where I worked we only had tape, and it was a crapshoot at the best of times. Luckily, 2007 brought me a SAN to play with.

Bad times dealing with LUNs

The SAN made it easier, for sure, to go back in time, find that file, and pull it back from the clutches of death using hardware-based snapshots. It was no big deal to mount the snapshot to the guest, but fighting with the MS iSCSI initiator got pretty painful, partly because I had a complex password for the CHAP authentication, and partly because clean-up and logging out of the iSCSI session was problematic. I always had a ton of errors, both in the Windows guest and in the SAN console, which caused more grief than good it seemed.

Shortly after the SAN showed up, VMware entered my world. It was great that I didn’t have to mess with MS iSCSI initiators any more, but it really just moved my problem to the ESXi host. Now that VMware had the LUN with all my VMs, I had to worry about resignaturing the LUN so it wouldn’t have conflicts with the rest of the production VMs. This whole process was short-lived because we couldn’t afford all the space the snapshots were taking up. Since we had to use LUNs, we had to take snapshots of all the VMs even though there were only a handful that really needed the extra protection. Before virtualization we were already reserving over 50% of the total LUN space because snapshots were backed by large block sizes and ate through space. Because we had to snapshot all of the VMs on the LUN, we had to change the snap reserve to 100%. We quickly ran out of space and turned off snapshots for our virtual environment.

When a snapshot is taken on Nutanix, we don’t copy data, nor do we copy the metadata. The metadata and data diverge on an as-needed basis; as new writes happen against the active parent snapshot, we just track the changes. Changes operate at the byte level, which is a far cry from the 16 MB block size I had to live with in the past.
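As a mental model only (my own toy sketch, not NDFS internals), you can think of a snapshot as a new metadata object that simply points at its parent until a write lands, and only then records the changed range:

```python
class Snapshot:
    """Toy redirect-on-write snapshot: taking a snapshot copies no data,
    and only ranges written afterwards diverge from the parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.changes = {}             # offset -> bytes written after the snapshot

    def write(self, offset, data):
        self.changes[offset] = data   # track only the change, not a full copy

    def read(self, offset, length):
        if offset in self.changes:
            return self.changes[offset][:length]
        if self.parent is not None:
            return self.parent.read(offset, length)   # fall through to parent
        return b"\x00" * length       # never-written region

base = Snapshot()
base.write(0, b"original block")
snap = Snapshot(parent=base)          # "snapshot": no data copied
snap.write(0, b"new data")            # divergence happens only on write
print(base.read(0, 14), snap.read(0, 8))
```

Taking the snapshot is effectively free; the cost is paid gradually, and only for the bytes that actually change.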

Due to the above-mentioned life lessons in LUN-based snapshots, I am very happy to show Nutanix customers the benefits of per-VM snapshots and how easy it is to restore a file.

Per-VM protection

To restore a file from a VM living on Nutanix, you just need to make sure you have a protection domain set up with a proper RPO schedule. For this example, I created a protection domain called RPO-High. This is great, as you could have 2,000 VMs all on one volume with Nutanix. You just slide over which VMs you want to protect; in this example, I am protecting my FileServer. Note that you can have more than one protection domain if you want to assign different RPOs to different VMs. Create a new protection domain and add one or more VMs based on the application grouping.
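Conceptually (this is just an illustrative sketch of my own, not the Nutanix API; the RPO-Low domain and TestVM-01 are made-up names), a protection domain is little more than a named group of VMs that share one snapshot schedule, and therefore one RPO:

```python
from dataclasses import dataclass, field
from datetime import timedelta

@dataclass
class ProtectionDomain:
    """Hypothetical model: a named group of VMs sharing one snapshot
    schedule, and therefore one RPO."""
    name: str
    rpo: timedelta
    vms: set = field(default_factory=set)

    def protect(self, vm_name):
        self.vms.add(vm_name)

# A tight schedule only for the VMs that actually need it...
rpo_high = ProtectionDomain(name="RPO-High", rpo=timedelta(hours=1))
rpo_high.protect("FileServer")

# ...and a relaxed one for everything else.
rpo_low = ProtectionDomain(name="RPO-Low", rpo=timedelta(hours=24))
rpo_low.protect("TestVM-01")

print(rpo_high)
```

Compare that to the LUN story above, where the snapshot granularity was the whole LUN and every VM on it got dragged along.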

Read more…

#Gartner report – How to Choose Between #Hyper-V and #vSphere – #IaaS

November 19, 2013

The constant battle between the hypervisor and the orchestration of IaaS etc. is of course continuing! But it is really fun, I must say, that Microsoft is getting more and more mature with its offerings in this space, great job!

One of the things that I tend to think most about is the cost, scalability and flexibility of the infrastructure that we build and how we build it. I often see that we tend to do what we’ve done for so many years now: we buy our SAN/NAS storage, we buy our servers (leaning towards blade servers because we think that’s the latest and coolest), and then we try to squeeze that into some sort of POD/FlexPod/UCS or whatever we like to call it, to find our optimal “volume of Compute, Network and Storage” that we can scale. But is this scalable like the bigger cloud players, Google, Amazon etc.? Is this 2013 state of the art? I think we’re just fooling ourselves a bit, building whatever we’ve built for all these years and not really providing the business with anything new… but that’s my view. I know what I’d look at, and most of you that have read my earlier blog posts know that I love the way of scaling out and doing more like the big players, using something like Nutanix, and ensuring that you choose the right IaaS components as part of that stack, as well as the orchestration layer (OpenStack, System Center, CloudStack, Cloud Platform or whatever you prefer after you’ve done your homework).

Back to the topic a bit: I’d say that the hypervisor is of no importance any more, and that’s why everyone is giving it away for free or to the open source community! Vendors are after the IaaS/PaaS orchestration layer, because if they get that business then they have nested their way into your business processes. That’s where the value will ultimately be delivered, as IT services in an automated way, once you’ve got your business services and processes in place; and then it’s harder to make a change, and they will live fat and happy on you for some years to come! 😉

Read more…

There was a big flash, and then the dinosaurs died – via @binnygill, #Nutanix

November 15, 2013

Great blog post by @binnygill! 😉

This is how it was supposed to end. The legacy SAN and NAS vendors finally realize that Flash is fundamentally different from HDDs. Even after a decade of efforts to completely assimilate Flash into the legacy architectures of the SAN/NAS era, it’s now clear that new architectures are required to support Flash arrays. The excitement around all-flash arrays is a testament to how different Flash is from HDDs, and its ultimate importance to datacenters.

Consider what happened in the datacenter two decades ago: HDDs were moved out of networked computers, and SAN and NAS were born. What is more interesting, however, is what was not relocated.

Although it was feasible to move DRAM out with technology similar to RDMA, it did not make sense. Why move a low latency, high throughput component across a networking fabric, which would inevitably become a bottleneck?

Today Flash is forcing datacenter architects to revisit this same decision. Fast near-DRAM-speed storage is a reality today. SAN and NAS vendors have attempted to provide that same goodness in the legacy architectures, but have failed. The last ditch effort is to create special-purpose architectures that bundle flash into arrays, and connect it to a bunch of servers. If that is really a good idea, then why don’t we also pool DRAM in that fashion and share with all servers? This last stand will be a very short lived one. What is becoming increasingly apparent is that Flash belongs on the server – just like DRAM.

For example, consider a single Fusion-IO flash card that writes at 2.5 GB/s throughput and supports 1,100,000 IOPS with just 15 microseconds of latency (http://www.fusionio.com/products/iodrive2-duo/). You can realize these speeds by attaching the card to your server and throwing your workload at it. If you put 10 of these cards in a 2U-3U storage controller, shouldn’t you expect 25 GB/s of streaming writes and 11 million IOPS at sub-millisecond latencies? To my knowledge no storage controller can do that today, and for good reasons.
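Just to spell out the arithmetic behind those expectations (a back-of-the-envelope calculation, nothing more):

```python
# Per-card numbers from the Fusion-IO example above, scaled to ten cards
# behind a single controller, assuming perfectly linear scaling.
cards = 10
gb_per_sec_per_card = 2.5
iops_per_card = 1_100_000

print(f"{cards * gb_per_sec_per_card:.0f} GB/s streaming writes")  # 25 GB/s
print(f"{cards * iops_per_card:,} IOPS")                           # 11,000,000
```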

Networked storage has the overhead of networking protocols. Protocols like NFS and iSCSI are not designed for massive parallelism, and end up creating bottlenecks that make crossing a few million IOPS on a single datastore an extremely hard computer science problem. Further, if an all-flash array is servicing ten servers, then the networking prowess of the all-flash array should be 10X that of each server, or else we end up artificially limiting the bandwidth that each server can get based on how the storage array is shared.

No networking technology, whether it be InfiniBand, Ethernet, or Fibre Channel, can beat the price and performance of locally-attached PCIe, or even that of a locally-attached SATA controller. Placing flash devices that operate at almost DRAM speeds outside of the server requires unnecessary investment in high-end networking. Eventually, as flash becomes faster, the cost of a speed-matched network will become unbearable, and the datacenter will gravitate towards locally-attached flash – both for technological reasons, as well as for sustainable economics.

The right way to utilize flash is to treat it as one would treat DRAM — place it on the server where it belongs. The charts below illustrate the dramatic speed up from server-attached flash.

Continue reading here!

//Richard

Solving the Compute and Storage scalability dilemma – #Nutanix, via @josh_odgers

October 24, 2013

Compute, Network and Storage is a hot topic, as I’ve written in blog posts before this one (How to pick virtualization (HW, NW, Storage) solution for your #VDI environment? – #Nutanix, @StevenPoitras) … and still a lot of colleagues and customers are struggling to find better solutions and architectures.

How can we ensure that we get the same or better performance out of our new architecture? How can we scale in a more simple and linear manner? How can we ensure that we don’t have a single point of failure for all of our VMs, etc.? How are others scaling and doing this in a better way?

I’m not a storage expert, but I do know and read that many companies out there are working on finding the optimal solution for Compute and Storage, how they can get the cost down, and how they can be left with a simpler architecture to manage…

This is a topic that most need to address now that more and more organisations are starting to build their private clouds, because how are you going to scale it, and how can you get closer to the delivery that the big players provide? Gartner even had Software-Defined Storage (SDS) as the number 2 trend going forward: #Gartner Outlines 10 IT Trends To Watch – via @MichealRoth, #Nutanix, #VMWare

Right now I see Nutanix as the leader here! They rock! Just have a look at this linear scalability:

If you want to learn more about how Nutanix can bring great value, please contact us at EnvokeIT!

For an intro to Nutanix in 2 minutes, have a look at these videos:

Overview:

Read more…

True or False: Always use Provisioning Services – #Citrix, #PVS, #MCS

August 29, 2013

Another good blog post from Daniel Feller:

Test your Citrix muscle…

True or False: Always use Provisioning Services

Answer: False

There has always been this aura around Machine Creation Services that it could not hold a candle to Provisioning Services; that you would be completely insane to implement this feature in anything but the simplest/smallest deployments.

How did we get to this myth? Back in March of 2011 I blogged about deciding between MCS and PVS. I wanted to help people decide between using Provisioning Services and the newly released Machine Creation Services. Back in 2011, MCS was an alternative to PVS in that MCS was easy to set up, but it had some limitations when compared to PVS. My blog and decision tree were used to help steer people down the PVS route except for the use cases where MCS made sense.

Two and a half years have passed, and over that time MCS has grown up. Unfortunately, I got very busy and didn’t keep this decision matrix updated. I blame the XenDesktop product group. How dare they improve our products. Don’t they know this causes me more work? :)

It’s time to make some updates based on the improvements in XenDesktop 7 (and these improvements aren’t just on the MCS side but on the PVS side as well).

So let’s break it down:

  • Hosted VDI desktops only: MCS in XenDesktop 7 now supports XenApp hosts. This is really cool, and I am very happy about this improvement, as so many organizations understand that XA plays a huge part in any successful VDI project.
  • Dedicated Desktops: Before PvD, I was no fan of doing dedicated VDI desktops with PVS. With PvD, PVS dedicated desktops are now much more feasible, like they always were with MCS.
  • Boot/Logon Storms: PVS, if configured correctly, would cache many of the reads in system memory, helping to reduce read IOPS. Hypervisors have improved over the past two years to help us with the large number of read disk operations. This helps lessen the impact of boot/logon storms when using MCS.

    Read more…
