Monthly Archives: July 2016

Automated Storage Reclaim on Nutanix Acropolis Hypervisor (AHV)

Storage reclaim can be a major pain point for organizations running traditional three-tier architecture.  With companies trying to deal with an ever-increasing storage footprint, many of them turn to manual storage reclaim procedures to get capacity back after files or virtual disk images are deleted.  These problems are exacerbated if you are running vSphere as your hypervisor on traditional VMFS-formatted LUNs, as any Storage vMotion or snapshot operation results in space not being automatically reclaimed from the source LUNs.  An example of the manual procedure required to run reclaim on traditional infrastructure running vSphere 6 is outlined below.

  • If the EnableBlockDelete advanced setting is not enabled on the ESXi host so that guest disks are identified as thin, block-zeroing software such as sdelete must be run at the guest level. This creates I/O overhead, so it should be run during a maintenance window.
  • Next, after the storage is reclaimed at the guest level, the VMFS LUN must be reclaimed by running esxcli storage vmfs unmap against each LUN manually via the ESXi command line (a rough command sketch follows this list). Due to the performance overhead, this task is not scheduled automatically.
  • Finally, the storage needs to be reclaimed at the array level by executing a command to reclaim zero pages. On some arrays this is automated, but not all.
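
For illustration, here is a rough sketch of the first two steps as commands. The datastore and drive names are placeholders, and the array-level step is vendor-specific so it is left out:

    # Step 1 - inside the guest OS: zero the free space so dead blocks can be identified
    # (sdelete is a Sysinternals tool; -z zeroes free space on the target volume)
    sdelete.exe -z C:

    # Step 2 - on the ESXi host shell: reclaim dead space on the VMFS datastore,
    # one datastore at a time (-n throttles how many blocks are unmapped per pass)
    esxcli storage vmfs unmap -l MyDatastore -n 200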

As you can see, this is a manual three-step process that creates operational overhead.  Whilst these procedures could potentially be automated, I don’t really see the point in automating operations on legacy infrastructure when you have the option today of replacing it with smarter software-defined solutions that automate themselves.  That is one less thing for your system administrators or automation team to worry about.

Earlier, I blogged about SCSI unmap capabilities using Acropolis Block Services on Acropolis base software 4.7 here.  The good news is this functionality is also built into Nutanix’s own free hypervisor, AHV.  It works by the guest issuing SCSI unmap commands against its virtual disks, which are passed straight through to the Nutanix Distributed Storage Fabric.

As you can see below, with Acropolis base software versions earlier than 4.7 the disks show as standard hard disk drives in Windows 2012 R2.  The reason is that unmap commands were previously not translated from the guest OS down to the virtual disk layer.

[Screenshot: Optimize Drives showing the disks as standard hard disk drives prior to Acropolis 4.7]

After you upgrade to Acropolis base software 4.7, all disks are shown as thin provisioned in Optimize Drives, as unmap is now supported.  For existing disks to show as thin, the VM first requires a reboot after the upgrade to 4.7; a disk hot-added to the VM before the reboot will show as thin straight away. Either way, the change is transparent.
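
If you want to confirm this from inside the guest rather than eyeballing Optimize Drives, a quick PowerShell check along these lines should do it, assuming your build exposes the ProvisioningType property on the Get-Disk output:

    # Run inside the Windows 2012 R2 guest after the post-upgrade reboot;
    # ProvisioningType should read "Thin" for disks that advertise unmap support
    Get-Disk | Select-Object Number, FriendlyName, ProvisioningType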

[Screenshot: Optimize Drives showing the disks as thin provisioned after upgrading to Acropolis 4.7]

To show SCSI UNMAP in action on Acropolis Hypervisor, I tested across three Jetstress Exchange 2013 VMs holding 4.7TB of databases.

[Screenshot: storage usage of the three Jetstress VMs before the databases were deleted]

After the databases were deleted, Windows converted the file delete operations into corresponding SCSI UNMAP requests, as described in the following MSDN article:  https://msdn.microsoft.com/windows/hardware/drivers/storage/thin-provisioning
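
You can confirm that Windows will actually send these delete notifications (i.e. that unmap is not disabled at the OS level) with:

    # 0 = delete notifications (trim/unmap) enabled, 1 = disabled
    fsutil behavior query DisableDeleteNotify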

To hurry it up, I also manually ran an optimize operation using the Optimize Drives utility to reclaim space.  Whilst the file deletes worked, I noticed the process was slightly faster using Optimize Drives; I assume file delete operations queue their SCSI unmaps in the background.
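
The same optimize operation can be kicked off from PowerShell with the built-in Optimize-Volume cmdlet; the drive letter below is just an example for one of the database volumes:

    # -ReTrim sends unmap for all free space on the volume
    Optimize-Volume -DriveLetter E -ReTrim -Verbose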

[Screenshot: manually running an optimize operation on the Exchange database volumes]

After the optimize was run, you can see the used storage is back down to a couple of GB per VM.

[Screenshot: Prism showing used storage back down to a couple of GB per VM]

This is just another way Nutanix makes everyday operations invisible, as your admins no longer need to worry about manually reclaiming storage.  It just works. As there are no LUNs, you also don’t need to worry about doing a LUN-level reclaim.

Together with space-saving features such as compression, deduplication, erasure coding and now SCSI unmap, this allows you to make the most of the usable capacity on your Nutanix infrastructure without wasting capacity on dead space.

Special thanks to Josh Odgers for letting me delete files on his test Exchange Server VMs to try out the unmap capabilities available in Acropolis Hypervisor.

Testing SCSI Unmap on Acropolis Block Services

One of the new features available with Acropolis Block Services on Nutanix Acropolis 4.7 is SCSI unmap support.  SCSI unmap (similar to TRIM used on SSDs) is part of the T10 SCSI spec and commands a disk region to deallocate deleted space so it can be reused when data is removed.  The only way to reclaim storage at the virtualized guest level with virtual disks that do not support unmap is to manually write a string of zeroes at the guest level; depending on your storage vendor, this can cause a performance impact due to the large number of writes required.  With Windows Server 2012 R2, unmap is scheduled weekly using Disk Defragmenter, or “Optimize Drives” as it is now called.
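
That weekly run is just a scheduled task, which you can inspect from PowerShell if you want to see when it last ran:

    # The built-in maintenance task behind the weekly Optimize Drives run
    Get-ScheduledTask -TaskPath "\Microsoft\Windows\Defrag\" -TaskName "ScheduledDefrag" |
        Get-ScheduledTaskInfo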

[Screenshot: Optimize Drives showing the media type of each disk]

As you can see in the Optimize Drives screenshot, the media type differs according to whether the disk supports unmap or not.

  • Hard disk drive – In this case, this is a traditional VMDK stored on an NFS volume, since my hypervisor is vSphere. Even though the VMDK is thin, traditional defragmentation will be performed during optimization, as SCSI unmap commands are not translated into NFS commands.  The only way to reclaim storage in this case is to write a string of zeroes into the free space of the disk, which will be picked up and reclaimed during a Nutanix Curator scan (every six hours).
  • Thin provisioned drive – This is a Nutanix vdisk presented to the guest via in-guest iSCSI using Acropolis Block Services. Since the disk reports itself as thin, defrag is disabled and the disk will send unmap commands instead when the “Optimize” button is clicked.  This space is then reclaimed on the Nutanix Distributed Storage Fabric.

To see these capabilities in action, I created a single 200GB vdisk in a volume group and attached it to a Windows 2012 R2 test VM running on vSphere 6.0.  I formatted the disk as “F:” and then filled it with 100GB of data, as you can see in the screenshots below.
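
For reference, the guest-side part of that setup can be scripted roughly like this. The portal address is a placeholder for the cluster iSCSI data services IP in my lab, and the volume group creation and initiator whitelisting were done in Prism beforehand:

    # Connect to the Nutanix iSCSI target from the Windows 2012 R2 guest
    New-IscsiTargetPortal -TargetPortalAddress "10.0.0.50"
    Get-IscsiTarget | Connect-IscsiTarget

    # Bring the new 200GB vdisk online, initialise it and format it as F:
    Get-Disk | Where-Object PartitionStyle -eq 'RAW' |
        Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -DriveLetter F -UseMaximumSize |
        Format-Volume -FileSystem NTFS -Confirm:$false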

[Screenshot: the 200GB vdisk attached and formatted as F: in the guest]

[Screenshot: the F: drive filled with 100GB of data]

You can see the virtual disk now reports its physical usage as 99.59 GiB in Nutanix Prism.

[Screenshot: Nutanix Prism showing the vdisk’s physical usage at 99.59 GiB]

Next I deleted the files and performed an optimize operation using Disk Defragmenter.  Unmap is then performed on the thin disks; you can track progress by watching how much of the disk has been “trimmed” in the Current status column.

[Screenshot: Optimize Drives showing the trim progress on the thin provisioned disk]

It only took around 10 seconds to unmap 100GB of data, which is much quicker and less resource intensive than manually writing zeroes to the disk.
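
If you prefer the command line to the Optimize Drives GUI, the equivalent retrim can also be run with the classic defrag utility:

    # /L = perform retrim only, /U = print progress, /V = verbose output
    Defrag.exe F: /L /U /V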

Now you can see the disk’s physical usage is back to its original capacity in Nutanix Prism.

[Screenshot: Nutanix Prism showing the vdisk’s physical usage back to its original level]

As you can see, there is not much effort involved in reclaiming storage when using Acropolis Block Services.  This functionality also works if you are using Acropolis Hypervisor (AHV), as all vdisks are presented using iSCSI, which supports unmap.  I will be testing this in a future blog post.


Extend your Nutanix ROI and save licensing costs by converting existing nodes to storage only

Storage-only Nutanix nodes have been available for over a year now and have seen great adoption with Nutanix customers looking to solve the problem of an ever-increasing storage footprint.

This addresses one of the false arguments some people make against hyper-converged infrastructure: that you are required to scale compute at the same time as storage.  That is not the case with Nutanix.  The storage-only nodes run Acropolis Hypervisor (AHV) with a single CVM that contributes to the storage pool, so they do not require separate vSphere licensing.

Adding storage nodes is simple.  Just connect them to the network, power them on, wait for them to be auto-discovered, add IP addresses, then click Expand in Nutanix Prism. The storage container presented to vSphere via NFS is then automatically extended with no downtime. No knowledge of KVM is required, as the storage-only nodes are managed via Nutanix Prism. This is much less complex than expanding a legacy storage array by adding disk shelves and carving out new LUNs.

One of the many announcements that came out of the Nutanix .NEXT 2016 conference in Las Vegas was the ability to convert any Nutanix node to storage only.  This functionality is expected to be released in a Nutanix software update around October 2016. My colleague Josh Odgers gives a good overview of this capability on his blog here: http://www.joshodgers.com/2016/06/15/whats-next-2016-any-node-can-be-storage-only/

An interesting scenario this capability enables is converting your old existing Nutanix nodes to storage only when you are ready to expand your Nutanix infrastructure. Since Intel CPUs get much faster with each release cycle and also support larger amounts of memory, this potentially allows you to decrease vSphere cluster sizes and run even more VMs on less hardware without reducing storage capacity.  Since the storage-only nodes still run a CVM that is part of the Nutanix cluster, there is no loss in overall cluster resiliency, and it also allows you to explore enabling new functionality such as erasure coding (EC-X) on your cluster.

Converting nodes to storage only also has the distinct advantage of not mixing a vSphere cluster with non-homogeneous nodes of different CPU generations and RAM capacities. Whilst this is supported via VMware EVC, it is not always ideal if you want consistent memory capacity and CPU performance across your vSphere cluster.  EVC needs to mask some newer CPU features, which can cause a decrease in performance.  VMware HA will also need some careful calculations to ensure there is enough capacity in the cluster if one of the larger hosts fails.

A good example of converting nodes to storage only would be if you had previously purchased a cluster of NX-6020 nodes, which were originally designed for storage-heavy applications that do not require much compute capacity.

The NX-6020 was released back in 2013 with the following per-node specifications:

  • Dual Intel Ivy Bridge E5-2630 v2 [12 cores / 2.6 GHz]
  • 1x 800GB SSD
  • 5x 4TB HDD
  • 128GB RAM

Now say you had purchased a cluster of four of these nodes and are running ESXi as your hypervisor. The cluster was purchased in 2014 with a five-year warranty for the purpose of virtualizing an email archiving application, which is typically a low-IOPS, high-storage-capacity workload.

After the initial success of virtualizing this application, you decide to migrate some more compute- and memory-heavy workloads, so you go with four NX-3060-G5 nodes with the following specifications:

  • Dual Intel Broadwell E5-2680 v4 [28 cores / 2.4 GHz]
  • 2x 800GB SSD
  • 4x 2TB HDD
  • 512GB RAM

With this new purchase, you have the following options for expansion.  Nutanix supports adding hosts of different capacities to the same cluster.

  • Add the new nodes to the existing Nutanix cluster and the existing vSphere cluster, for a total vSphere cluster size of eight hosts with varying RAM and CPU sizes.  EVC will need to be enabled at the Ivy Bridge generation.
  • Add the new nodes to the existing Nutanix cluster and create a new vSphere cluster of four hosts.  You will have two vSphere clusters consisting of a total of eight hosts.
  • Add the new nodes to the existing Nutanix cluster.  Create a new vSphere cluster and vMotion VMs to it.  Convert the old nodes to storage only. You will have a single vSphere cluster consisting of four hosts with more CPU and RAM capacity than was available on the NX-6020, whilst still re-using your existing storage.

The following graphs show the performance and capacity differences between four NX-6020 nodes and four NX-3060-G5 nodes. The CPU performance alone is almost three times the capacity when you look at the combined SPECint rating of the CPUs: the Ivy Bridge CPUs have a SPECint rating of 442 per node, whilst the new Broadwell CPUs have a rating of 1301.  As the new nodes are compute-heavy, the NX-6020 nodes still have more storage capacity per node due to their larger 4TB 3.5-inch drives.

[Graph: performance and capacity comparison of four NX-6020 nodes vs four NX-3060-G5 nodes]

During sizing calculations you determine that you can easily fit the new workloads, along with the existing email archiving application, on a single vSphere cluster of four nodes, whilst the existing NX-6020 cluster is converted to serve storage only to the cluster.  This gives a total usable capacity of 52.16 TB, as you can see in the following diagram.  Sizing has been done using http://designbrewz.com/ and shows usable, not raw, capacity.

[Diagram: usable capacity of the combined eight-node cluster at RF2, 52.16 TB]

But wait, there’s more! Since the Nutanix cluster has grown from four nodes to eight, you also decide to enable erasure coding on the cluster for even more capacity savings. Whilst this is supported on four nodes with a 2/1 stripe size, it is recommended for larger Nutanix clusters of more than six nodes in order to take advantage of a 4/1 stripe size. With erasure coding enabled, the usable capacity increases from 52.16 TB to 83.45 TB, calculated at 1.25x raw capacity overhead for RF2 + EC-X versus 2x for RF2 only.
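
As a quick back-of-the-envelope check of those numbers, treating RF2 as a 2x overhead on raw and RF2 + EC-X as roughly 1.25x:

    $usableRf2 = 52.16           # TB usable at RF2 (2x overhead on raw)
    $raw       = $usableRf2 * 2  # ~104.32 TB raw behind that usable figure
    $usableEcx = $raw / 1.25     # ~83.45 TB usable once EC-X drops the overhead to 1.25x
    $usableEcx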

So what have we gained by adding four NX-3060-G5 nodes and converting our old nodes to storage only?

  • vSphere licensing costs have stayed exactly the same despite the addition of new hardware.
  • Our RAM capacity has increased four times, with each compute node going from 128GB to 512GB.
  • Our CPU capacity has nearly tripled, whilst the socket count is the same.
  • Our storage capacity has more than doubled, even though we added compute-heavy nodes, by taking advantage of erasure coding delivered entirely in software.
  • Cluster resiliency and performance have increased, with CVMs going from four to eight.
  • Rack space has only increased by 2RU in the datacentre.

Not a bad result, and it really shows the advantages you gain by running software-defined infrastructure that increases value and ROI purely via software updates throughout its lifecycle.