Category Archives: VMware

VMware vSphere 6.5 Host Resources Deep Dive – A Must Have!

Just after I joined Zettagrid in June of 2013 I decided to load up vSphere 5.1 Clustering Deepdive by Duncan Epping and Frank Denneman on my iPad to read on my train journey to and from work. Reading that book allowed me to gain a deeper understanding of vSphere through the in-depth content that Duncan and Frank had produced. Any VMware administrator worth their salt would be familiar with the book (or the ones that preceded it) and it’s still a brilliant read.

Fast forward a few versions of vSphere and we finally have a follow-up:

VMware vSphere 6.5 Host Resources Deep Dive

This time around Frank has been joined by Niels Hagoort and together they have produced another must-have virtualization book…though it goes far beyond VMware virtualization. I was lucky enough to review a couple of chapters of the book and I can say without question that this book will make your brain hurt…but in a good way. It’s the deepest of deep dives and it goes beyond the previous books’ best practices, diving into a lot of the low level compute, storage and networking fundamentals that a lot of us have either forgotten about, never learnt or never bothered to learn about.

This book explains the concepts and mechanisms behind the physical resource components and the VMkernel resource schedulers, which enables you to:

  • Optimize your workload for current and future Non-Uniform Memory Access (NUMA) systems.
  • Discover how vSphere Balanced Power Management takes advantage of the CPU Turbo Boost functionality, and why High Performance does not.
  • How the 3-DIMMs per Channel configuration results in a 10-20% performance drop.
  • How TLB works and why it is bad to disable large pages in virtualized environments.
  • Why 3D XPoint is perfect for the vSAN caching tier.
  • What queues are and where they live inside the end-to-end storage data paths.
  • Tune VMkernel components to optimize performance for VXLAN network traffic and NFV environments.
  • Why Intel’s Data Plane Development Kit significantly boosts packet processing performance.

If any of you have read Frank’s NUMA Deep Dive blog series you will start to get an appreciation of the level of technical detail this book covers, however it is written in a way that makes the information digestible, though some parts may need to be read twice over. Well done to Frank and Niels on getting this book out and again, if you are working in and around anything to do with computers this is a must read so do yourself a favour and grab a copy.

The Amazon locales that currently have the book available for purchase can be found below:

Amazon US: http://www.amazon.com/dp/1540873064
Amazon France: https://www.amazon.fr/dp/1540873064
Amazon Germany: https://www.amazon.de/dp/1540873064
Amazon India: http://www.amazon.in/dp/1540873064
Amazon Japan: https://www.amazon.co.jp/dp/1540873064
Amazon Mexico: https://www.amazon.com.mx/dp/1540873064
Amazon Spain: https://www.amazon.es/dp/1540873064
Amazon UK: https://www.amazon.co.uk/dp/1540873064

CPU Overallocation and Poor Network Performance in vCD – Beware of Resource Pools

For the longest time all VMware administrators have been told that resource pools are not folders and that they should only be used under circumstances where the impact of applying the resource settings is fully understood. From my point of view I’ve been able to utilize resource pools for VM management without too much hassle since I first started working on VMware Managed Service platforms, and from a managed services perspective they are a lot easier to use as organizational “folders” than vSphere folders themselves. For me, as long as the CPU and Memory Resources Unlimited checkbox was ticked, nothing bad happened.


Working with vCloud Director however, resource pools are heavily utilized as the control mechanism for resource allocation, sharing and management. It’s still a topic that can cause confusion when trying to wrap one’s head around the different allocation models vCD offers. I still reference blog posts from Duncan Epping and Frank Denneman written nearly seven years ago to refresh my memory every now and then.

Before moving onto an example of how overallocation or client undersizing in vCloud Director can cause serious performance issues, it’s worth having a read of this post by Frank that goes through, in typical Frank detail, what resource management looks like in vCloud Director.

Proper Resource management is very complicated in a Virtual Infrastructure or vCloud environment. Each allocation model uses a different combination of resource allocation settings on both Resource Pool and Virtual Machine level

Undersized vDCs Causing Network Throughput Issues:

The Allocation Pool model was the one that I worked with the most and it used to throw up a few client-related issues when I worked at Zettagrid. When using the Allocation Pool method, which is the default model, you specify the amount of resources for your Org vDC and also how much of those resources are guaranteed. The guarantee means that a reservation will be set and that the amount of guaranteed resources is taken from the Provider vDC. The total amount of resources specified is the upper boundary, which is also the resource pool limit.

Because tenants were able to purchase Virtual Datacenters of any size there were a number of occasions where tenants undersized their resources. Specifically, one tenant came to us complaining about poor network performance during a copy operation between VMs in their vDC. At first the operations team thought that it was the network causing issues…we were also running NSX and these VMs were on a VXLAN segment so fingers were being pointed there as well.

Eventually, after a bit of troubleshooting we were able to replicate the problem…it was related to the resources that the tenant had purchased, or lack thereof. In a nutshell, because the allocation pool model allows the overprovisioning of resources, not enough vCPU was purchased. The vDC resource pool had 1000MHz of vCPU with a 0% reservation but the tenant had created four dual-vCPU VMs. When the network copy job started it consumed CPU which in turn exhausted the vCD CPU allocation.

What happened next can be seen in the video below…

With the resource pool constrained, ready time is introduced to throttle the CPU, which in turn impacts the network throughput. As shown in the video, when the resource pool has the unlimited checkbox ticked the ready time goes away and the network throughput returns to normal.
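
If you want to confirm this for yourself, the quickest check is CPU ready time on the host. A minimal way to look at it from the ESXi shell (the output path below is just an example):

    # Interactive: press 'c' for the CPU view and watch the %RDY column for the affected VMs
    esxtop

    # Batch mode capture for offline review: 5 second samples, 60 iterations
    esxtop -b -d 5 -n 60 > /tmp/esxtop-cpu-ready.csv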

Conclusion:

Again, it’s worth checking out the impact on the network throughput in the video as it clearly shows what happens when tenants underprovision or overallocate their Virtual Datacenters in vCloud Director. Outside of vCloud Director it’s also handy to understand the impact of applying reservations and limits on Resource Pools in terms of VM compute and networking performance.

It’s not always the network!

References:

http://www.vmware.com/resources/techresources/10325

http://frankdenneman.nl/2010/09/24/provider-vdc-cluster-or-resource-pool/

http://www.yellow-bricks.com/2012/02/28/resource-pool-shares-dont-make-sense-with-vcloud-director/

https://kb.vmware.com/kb/2006684

Allocation Pool Organization vDC Changes in vCloud Director 5.1

VMworld 2017 – #vGolf Las Vegas

#vGolf is back! Bigger and better than the inaugural #vGolf held last year at VMworld 2016!

Last year we had 24 participants and everyone who attended had a blast at the majestic Bali Hai Golf complex, which is in view of the VMworld 2017 venue, Mandalay Bay. This year the event will expand with more sponsors and a more structured golfing competition with prizes going out for the top two placed two-ball teams.


Details will be updated on this site and on the Eventbrite page once the day is finalised. For the moment, if you are interested please reserve your spot by securing a ticket. At this stage there are 32 spots, but depending on popularity that could be extended.

Last year the golfing fees were heavily subsidised to $40 USD per person (green fees are usually $130-150) thanks to the sponsors and I expect the same or lower depending on final sponsorship numbers this year. For now, please head to the Eventbrite page and reserve your ticket and wait for further updates as we get closer to the event.

Registration Page

There is a password on the registration page to protect against people registering directly via the public page. The password is vGolf2017Vegas. I’m looking forward to seeing you all there bright and early on Sunday morning!

Take a look at what awaits you…don’t miss out!

Sponsorship Call:

If you or your company can offer some sponsorship for the event, please email [email protected] to discuss arrangements. I am looking to subsidise most of the green fees if possible and for that we would need four to five sponsors.

Important ESXi 6.0 Patch – Upgrade Now!

Last week VMware released a new patch (ESXi 6.0 Build 5572656) that addresses a number of serious bugs with snapshot operations. Usually I wouldn’t blog about a patch release, but when I looked through the rest of the fixes in the VMware KB it was apparent to me that this was more than your average VMware patch: it addresses a number of issues around storage, but again, a lot around snapshot operations, which are so critical to most VM backup operations.


Here are some of the key resolutions that I’ve picked out from the patch release:

  • When you take a snapshot of a virtual machine, the virtual machine might become unresponsive
  • After you create a virtual machine snapshot of a SEsparse format, you might hit a rare race condition if there are significant but varying write IOPS to the snapshot. This race condition might make the ESXi host stop responding
  • Because of a memory leak, the hostd process might crash with the following error: Memory exceeds hard limit. Panic. The hostd logs report numerous errors such as Unable to build Durable Name. This kind of memory leak causes the host to get disconnected from vCenter Server
  • Using SESparse for both creating snapshots and cloning of virtual machines, might cause a corrupted Guest OS file system
  • During snapshot consolidation a precise calculation might be performed to determine the storage space required to perform the consolidation. This precise calculation can cause the virtual machine to stop responding, because it takes a long time to complete
  • Virtual Machines with SEsparse based snapshots might stop responding, during I/O operations with a specific type of I/O workload in multiple threads
  • When you reboot the ESXi host under the following conditions, the host might fail with a purple diagnostic screen and a PCPU xxx: no heartbeat error.
    • You use the vSphere Network Appliance (DVFilter) in an NSX environment
    • You migrate a virtual machine with vMotion under DVFilter control
  • Windows 2012 domain controller supports SMBv2, whereas Likewise stack on ESXi supports only SMBv1. With this release, the likewise stack on ESXi is enabled to support SMBv2
  • When the unmap commands fail, the ESXi host might stop responding due to a memory leak in the failure path. You might receive the following error message in the vmkernel.log file: FSDisk: 300: Issue of delete blocks failed [sync:0] and the host gets unresponsive.
  • In case you use SEsparse and enable unmapping operation to create snapshots and clones of virtual machines, after the wipe operation (the storage unmapping) is completed, the file system of the guest OS might be corrupt. The full clone of the virtual machine performs well.

There are also a number of vSAN-related fixes in the patch, so overall it’s worth applying this patch as soon as possible.
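
For reference, applying the patch from the offline bundle is straightforward from the host shell; the bundle and profile names below are placeholders, so confirm them against the KB before running anything:

    # Check the current build number (should report 5572656 once patched)
    vmware -vl

    # List the image profiles contained in the downloaded offline bundle
    esxcli software sources profile list -d /vmfs/volumes/datastore1/<patch-bundle>.zip

    # Put the host into maintenance mode, apply the patch and reboot
    esxcli system maintenanceMode set --enable true
    esxcli software profile update -d /vmfs/volumes/datastore1/<patch-bundle>.zip -p <profile-name-from-previous-step>
    reboot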

References:

https://kb.vmware.com/kb/2149955

Homelab – Lab Access Made Easy with Free Veeam Powered Network

A couple of weeks ago at VeeamON we announced the RC of Veeam PN, which is a lightweight SDN appliance that has been released for free. While the main messaging is focused around extending network availability for Microsoft Azure, Veeam PN can be deployed as a standalone solution via a downloadable OVA from the veeam.com site. While testing the product through its early dev cycles I immediately put into action a use case that allowed me to access my homelab and other home devices while I was on the road…all without having to set up and configure relatively complex VPN or remote access solutions.


There are a lot of existing solutions that do what Veeam PN does and a lot of them are decent at what they do, however the biggest difference for me when comparing, say, the VPN functionality of pfSense is that Veeam PN is purpose-built and can be set up within a couple of clicks. The underlying technology is built upon OpenVPN so there is a level of familiarity and trust with what lies under the hood. The other great thing about leveraging OpenVPN is that any Windows, MacOS or Linux client will work with the configuration files generated for point-to-site connectivity.

Homelab Remote Connectivity Overview:

While on the road I wanted to access my homelab/office machines with minimal effort and without relying on services published externally via my entry-level Belkin router. I also didn’t have a static IP, which always proved problematic for remote services. At home I run a desktop that acts as my primary Windows workstation which also has VMware Workstation installed. I then have my SuperMicro 5028D-TNT4 server that has ESXi installed and runs my NestedESXi lab. I needed to at least be able to RDP into that Windows workstation, but also get access to the management vCenter, SuperMicro IPMI and other systems that are running on the 192.168.1.0/24 subnet.

As seen above I also wanted to directly access workloads in the NestedESXi environment, specifically on the 172.17.0.0/24 and 172.17.1.0/24 networks. There will be a little more detail on my use case in a follow-up post, but as you can see from the diagram above, with the use of the Tunnelblick OpenVPN client on my MBP I am able to create a point-to-site connection to the Veeam PN Hub which is in turn connected via site-to-site to each of the subnets I want to connect into.

Deploying and Configuring Veeam Powered Network:

As mentioned above you will need to download the Veeam PN OVA from the veeam.com website. This VeeamKB describes where to get the OVA and how to deploy and configure the appliance for first use. If you don’t have a DHCP-enabled subnet to deploy the appliance into, you can configure the network statically by accessing the VM console, logging in with the default credentials and modifying the /etc/network/interfaces file as described here.
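
For reference, a static entry in that file on the Ubuntu-based appliance looks something like the below; the interface name and addressing are examples only, so adjust them to suit your environment:

    # /etc/network/interfaces - example static configuration
    auto eth0
    iface eth0 inet static
        address 192.168.1.50
        netmask 255.255.255.0
        gateway 192.168.1.1
        dns-nameservers 192.168.1.1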

Components

  • Veeam PN Hub Appliance x 1
  • Veeam PN Site Gateway x number of sites/subnets required
  • OpenVPN Client

The OVA is 1.5GB and when deployed the Virtual Machine has the base specifications of 1x vCPU, 1GB of vRAM and 16GB of storage, which if thin provisioned consumes a tick over 5GB initially.
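
If you would rather deploy the OVA from the command line than through the vSphere Client, an ovftool one-liner along these lines does the job; all of the names, paths and credentials below are examples only:

    ovftool --acceptAllEulas --powerOn \
      --name=VeeamPN-Hub \
      --datastore=datastore1 \
      --network="VM Network" \
      VeeamPN.ova \
      'vi://administrator%40vsphere.local@vcenter.lab.local/HomeLab/host/MgmtCluster'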

Networking Requirements

  • Veeam PN Hub Appliance – Incoming Ports TCP/UDP 1194, 6179 and TCP 443
  • Veeam PN Site Gateway – Outgoing access to at least TCP/UDP 1194
  • OpenVPN Client – Outgoing access to at least TCP/UDP 6179

Note that as part of the initial configuration you can configure the site-to-site and point-to-site protocol and ports which is handy if you are deploying into a locked down environment and want to have Veeam PN listen on different port numbers.
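
Purely as an illustration, if the Hub Appliance sat behind a Linux firewall the inbound rules would need to look something like this, adjusted to whatever ports you chose during the initial configuration:

    # Web UI of the Hub Appliance
    iptables -A INPUT -p tcp --dport 443 -j ACCEPT
    # Site-to-site tunnels from the Site Gateways
    iptables -A INPUT -p tcp --dport 1194 -j ACCEPT
    iptables -A INPUT -p udp --dport 1194 -j ACCEPT
    # Point-to-site connections from OpenVPN clients
    iptables -A INPUT -p tcp --dport 6179 -j ACCEPT
    iptables -A INPUT -p udp --dport 6179 -j ACCEPT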

In my setup the Veeam PN Hub Appliance has been deployed into Azure mainly because that’s where I was able to test out the product initially, but also because in theory it provides a centralised, highly available location for all the site-to-site connections to terminate into. This central Hub can be deployed anywhere and as long as it’s got HTTPS connectivity configured correctly you can access the web interface and start to configure your site and standalone clients.

Configuring Site Clients (site-to-site):

To complete the configuration of the Veeam PN Site Gateway you need to register the sites from the Veeam PN Hub Appliance. When you register a client, Veeam PN generates a configuration file that contains VPN connection settings for the client. You must use the configuration file (downloadable as an XML) to set up the Site Gateways. Referencing the diagram at the beginning of the post, I needed to register three separate client configurations as shown below.

Once this had been completed I deployed three Veeam PN Site Gateways on my home office infrastructure as shown in the diagram…one for each site or subnet I wanted to have extended through the central Hub. I deployed one to my Windows VMware Workstation instance on the 192.168.1.0/24 subnet and as shown below I deployed two Site Gateways into my NestedESXi lab on the 172.17.0.0/24 and 172.17.1.0/24 subnets respectively.

From there I imported the site configuration file that was generated from the central Hub Appliance into each corresponding Site Gateway and, in as little as three clicks on each one, all three networks were joined using site-to-site connectivity to the central Hub.

Configuring Remote Clients (point-to-site):

To be able to connect into my home office and home lab while on the road, the final step is to register a standalone client from the central Hub Appliance. Again, because Veeam PN is leveraging OpenVPN, what we are producing here is an OVPN configuration file that has all the details required to create the point-to-site connection…noting that there isn’t any requirement to enter a username and password as Veeam PN authenticates using SSL.

For my MBP I’m using the Tunnelblick OpenVPN client. I’ve found it to be an excellent client, but obviously being OpenVPN there are a bunch of other clients for pretty much any platform you might be running. Once I’ve imported the OVPN configuration file into the client I am able to authenticate against the Hub Appliance endpoint and the site-to-site routing is injected into the network settings.

You can see above that the 192.168.1.0, 172.17.0.0 and 172.17.1.0 static routes have been added and set to use the tunnel interface’s default gateway, which is on the central Hub Appliance. This means that from my MBP I can now get to any device on any of those three subnets no matter where I am in the world…in this case I can RDP to my Windows workstation, connect to vCenter or SSH into my ESXi hosts.
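
If you prefer the command line to the Tunnelblick UI, the same OVPN file can be used with the stock OpenVPN client and the injected routes checked afterwards; the config file name is just an example:

    # Bring up the point-to-site tunnel using the OVPN file exported from the Hub
    sudo openvpn --config veeampn-endpoint.ovpn

    # In another terminal, confirm the routes to the remote subnets were injected
    netstat -rn | grep -E '192\.168\.1|172\.17'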

Conclusion:

Summarizing the steps that were taken to set up and configure the extension of my home office network using Veeam PN through its site-to-site connectivity feature, allowing me to access systems and services via a point-to-site VPN:

  • Deploy and configure Veeam PN Hub Appliance
  • Register Sites
  • Register Endpoints
  • Deploy and configure Veeam PN Site Gateway
  • Setup Endpoint and connect to Hub Appliance

Those five steps took me less than 15 minutes, and that included the OVA deployments as well…that to me is an extremely streamlined, efficient process to achieve what in the past could have taken hours and certainly would have involved a more complex set of commands and configuration steps. The simplicity of the solution is what makes it very useful for home labbers wanting a quick and easy way to access their systems…it just works!

Again, Veeam PN is free and is deployable from the Azure Marketplace to help extend availability for Microsoft Azure…or downloadable in OVA format directly from the veeam.com site. The use case I’ve described, and have been using without issue for a number of months, adds to the flexibility of the Veeam Powered Network solution.

References:

https://helpcenter.veeam.com/docs/veeampn/userguide/overview.html?ver=10

https://www.veeam.com/kb2271

 

Quick Fix: VCSA 503 Service Unavailable Error

I’ve just had to fix one of my VCSAs again from the infamous 503 Service Unavailable error that seems to be fairly common with the VCSA, even though it was claimed to be fixed in vCenter version 6.5d. I’ve had this error pop up fairly regularly since deploying my homelab’s vCenter Server Appliance as a version 6.5 GA instance, and for the most part I’ve refrained from rebooting the VCSA just in case the error pops up upon reboot. I have even kept a snapshot against the VM just in case I needed to revert to it on the high chance that it would error out.

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x0000559b1531ef80] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)
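
Before digging into logs it’s worth confirming exactly which services failed to come up; on the VCSA 6.5 shell a quick check looks like this:

    # Show the state of all appliance services
    service-control --status --all

    # vpxd is the service behind the vpxd-webserver-pipe referenced in the error,
    # so trying to start it (and watching it fail) confirms where to look next
    service-control --start vmware-vpxd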

After doing a Google search for any permanent solutions to the issue, I came across a couple of posts referencing USB passthrough devices that could trigger the error, which was plausible given I was using an external USB hard drive. IP changes also seem to be a trigger for the error, though in my case that wasn’t the cause. There is a good Reddit thread here that talks about duplicate keys…again related to USB passthrough. It also links externally to some other solutions that were not relevant to my VCSA.

Solution:

As referenced in this VMware communities forum post, to fix the issue I had to first find out if I did have a duplicate key error in the VCSA logs. To do that I dropped into the VCSA shell, went into /var/log and did a search for any file containing device_key + already exists. As shown in the image above this returned a number of entries, confirming that I had duplicate keys and that it was causing the issue.

The VMware vCenter Server Appliance vpxd 6.5 logs are located in the /var/log/vmware/vmware-vpx folder
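
The search itself is a simple grep from the VCSA shell; the pattern just needs to catch the duplicate device_key errors:

    # Look for duplicate device_key entries in the vpxd logs
    grep -i "already exists" /var/log/vmware/vmware-vpx/vpxd*.log | grep -i device_key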

What was required next was to delete the duplicate entries from the embedded Postgres database table. To connect to the embedded Postgres database you need to run the following command from the VCSA shell:
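
On VCSA 6.5 the psql client ships with the embedded vPostgres instance, so the connection looks something like this (VCDB being the vCenter database on the appliance):

    # Connect to the embedded vPostgres instance as the postgres user
    /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres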

To remove the duplicate key I ran the following command and rebooted the appliance, noting that the id and device_key will vary.
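
The statement itself is just a SQL delete against the table named in the log error; the table and values below are purely illustrative placeholders, and the real ones come straight out of your own vpxd log entries:

    -- Illustrative shape only: substitute the real table name, id and device_key
    -- reported in the duplicate key error before running anything
    DELETE FROM <table_from_log> WHERE id = <id_from_log> AND device_key = <device_key_from_log>;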

Once everything rebooted all the services started up and I had a functional vCenter again which was a relief given I was about five minutes away from a restore or a complete rebuild…and ain’t nobody got time for that!

vCenter (VCSA) 6.5 broken after restart from vmware

Reference:

https://communities.vmware.com/thread/556490

 

VMware Flings: Top 5 – 2017 Edition

VMware has had their Labs Flings program going for a number of years now and in 2015 I wrote this post listing out my Top 5 Flings. Since then there have been some awesome Flings released and I thought it was a good time to update my Top 5 to reflect the continued awesomeness generated within the VMware Labs. Since my last post a number of Flings have also found their way into product releases.

Flings are apps and tools built by our engineers that are intended to be played with and explored.

There are 128 Flings listed on the site (up from 57 in August of 2015), though some have been deprecated. They range across most of VMware’s product stack…most of them have been created out of some requirement or function that was/is lacking in the current toolset for their respective products. Most of them solve usability issues, resolve performance bottlenecks or otherwise optimize the product experience.

Fling Number 5 – Storage Profile Updater

This Fling is a simple tool that enables the migration of vCloud Director virtual machines and templates from the default any storage profile to a specific storage profile. The tool can be run from the command-line with the help of a configuration file, and it allows you to change storage profiles in a batch style of processing.

For those that upgraded vCloud Director from 1.5 to 5.x you would know about the Any profile issue…this fling allows you to migrate all VMs from that default storage policy to any new one you might have configured in your Provider vDC.

Fling Number 4 – Cross vCenter VM Mobility – CLI

Cross vCenter VM Mobility – CLI is a command line interface (CLI) tool that can be used to migrate or clone a VM from one host to another host managed by a linked or isolated vCenter (VC) instance. It has been built using vSphere Java-based SDK APIs.

Currently, as of vSphere 6.0, the vSphere HTML5 Web Client allows users to perform Cross-VC operations like migration and cloning if two VCs are linked. If VCs are not linked, users cannot view the infrastructure across multiple VCs and thus, cannot utilize this functionality through UI. This Fling provides a way for users to access this vSphere feature through simple CLI commands. It also supports cross-cluster placement and shared storage vMotion between two VCs.

Cross vCenter migration is probably one of the most underrated features VMware has released and it has been present since vSphere 6.0. Originally exposed via the APIs, William Lam blogged about a wrapper he wrote to use the functionality, and this Fling sits beside that as another possible tool to perform cross vCenter actions.

Fling Number 3 – Embedded Host Client

The ESXi Embedded Host Client is a native HTML and JavaScript application and is served directly from your ESXi host! It should perform much better than any of the existing solutions

This Fling was a revelation when it was first released and adds a very usable and functional HTML5 web interface from which to manage your ESXi hosts. It’s now productized and packaged into ESXi 5.5, 6.0 and 6.5, and development of the tool continues, with bug fixes and features that can be installed via the VIB on the Fling site.

Fling Number 2 – VMware Tools for Nested ESXi

This VIB package provides a VMware Tools service (vmtoolsd) for running inside a nested ESXi virtual machine. The following capabilities are exposed through VMware Tools:

  • Provides guest OS information of the nested ESXi Hypervisor (eg. IP address, configured hostname, etc.).
  • Allows the nested ESXi VM to be cleanly shut down or restarted when performing power operations with the vSphere Web/C# Client or vSphere APIs.
  • Executes scripts that help automate ESXi guest OS operations when the guest’s power state changes.
  • Supports the Guest Operations API (formerly known as the VIX API).

The release of this Fling was met with a lot of thank yous from those who had battled with NestedESXi hosts not having VMware Tools available. If anything, the ability to cleanly shut down or restart the ESXi guest was welcomed. With the release of ESXi 6.0 (and the subsequent 6.5 release) the Tools are included in the OS by default…but for those running 5.x nested hosts it’s a must-have.
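
For 5.x hosts, installing it is a single esxcli command from the ESXi shell once the VIB has been copied somewhere the host can reach; the path below is just an example:

    # Install the VMware Tools for Nested ESXi VIB (path/URL is an example)
    esxcli software vib install -v /vmfs/volumes/datastore1/esx-tools-for-esxi.vib -f

    # Confirm it is present
    esxcli software vib list | grep -i esx-tools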

Fling Number 1 – ESXi Mac Learning dvFilter v2.0

MAC learning functionality solves performance problems for use cases like nested ESX.  This ESX extension adds functionality to ESX to support MAC-learning on vswitch ports. For most ESX use cases, MAC learning is not required as ESX knows exactly which MAC address will be used by a VM. However, for applications like running nested ESX, i.e. ESX as a guest-VM on ESX, the situation is different. As an ESX VM may emit packets for a multitude of different MAC addresses, it currently requires the vswitch port to be put in “promiscuous mode”. That however will lead to too many packets delivered into the ESX VM, as it leads to all packets on the vswitch being seen by all ESX VMs. When running several ESX VMs, this can lead to very significant CPU overhead and noticeable degradation in network throughput. Combining MAC learning with “promiscuous mode” solves this problem. The MAC learning functionality is delivered as a high speed VMkernel extension that can be enabled on a per-port basis. It works on legacy standard switches as well as Virtual Distributed Switches

This Fling is close to my heart as I learnt at VMworld 2014 that it was born out of a blog post I did on Promiscuous Mode, which triggered William Lam to approach Christian Dickmann with the issue and look for a way to solve it. As you can see from my follow-up post it works as designed and is the single must-have Fling for those who run Nested ESXi labs. It was recently upgraded to version 2.0 to support ESXi 6.5.
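
For reference, the filter is enabled per vNIC with a couple of advanced settings on the nested ESXi VM, along the lines of the two entries below for the first adapter; check the Fling documentation for the exact syntax before applying it to your own VMs:

    ethernet0.filter4.name = "dvfilter-maclearn"
    ethernet0.filter4.onFailure = "failOpen"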

As of last week there is a new ESXi Learnswitch Fling, which builds upon (but can’t be used with) the MAC Learning fling.

ESXi Learnswitch is a complete implementation of MAC Learning and Filtering and is designed as a wrapper around the host virtual switch. It supports learning multiple source MAC addresses on virtual network interface cards (vNIC) and filters packets from egressing the wrong port based on destination MAC lookup. This substantially improves overall network throughput and system performance for nested ESX and container use cases.

To learn more, read ESXi Learnswitch – Enhancement to the ESXi MAC Learn DvFilter.

For a full list of the Flings available for download, head to this link

https://labs.vmware.com/flings/?utf8=%E2%9C%93&order=date+DESC

vSAN 6.6 – What’s In It For Service Providers

Last February when VMware released VSAN 6.2 I stated that “Things had gotten Interesting”, with the 6.2 release of vSAN finally marking its arrival as a serious player in the Hyper-converged Infrastructure (HCI) market. vSAN was ready to be taken very seriously by VMware’s competitors. Fast forward fourteen months and, apart from the fact that we have confirmed the v in vSAN is lower case with the product name officially changing from Virtual SAN to vSAN…version 6.6 was announced last week, is set to GA today, and with it comes the biggest list of new features and enhancements in vSAN’s history.

VMware has decided to break with the normal vSphere release cycle for vSAN and move to patch releases for vSphere that are actually major updates of vSAN. This is why this release is labeled vSAN 6.6 and will be included in the vSphere 6.5EP2 build. The move allows the vSAN team to continue to enhance the platform outside of the core vSphere platform and I believe it will deliver at least 2 update releases per year.

Looking at the new features and enhancements of the vSAN 6.6 release it’s clear to see that the platform has matured, and given the 7000+ strong customer base it’s also clear that it’s being accepted more and more for critical workloads. From a service provider point of view I know of a lot more vCloud Air Network partners that have implemented vSAN as not only their management HCI platform, but also now their customer HCI compute and storage platforms.

A lot for Service Providers to like:

As shown in the feature timeline above there are 20+ new features and enhancements, but for me the following ones are the most relevant to vCAN Service Providers who are using, or looking to use, vSAN in their offerings. I will expand on the ones I see as being the most significant of the new features and enhancements for service providers.

  • Native encryption for data-at-rest
  • Compliance certifications
  • vSAN Proactive Drive HA for failing drives
  • Resilient management independent of vCenter
  • Rapid recovery with smart, efficient rebuilds
  • Certified file service & data protection solutions
  • Enhanced vSAN SDK and PowerCLI
  • Simple networking with Unicast
  • vSAN Cloud Analytics for performance
  • vSAN Cloud Analytics with real-time support notification and recommendations*
  • vSAN Config Assist with 1-click hardware lifecycle management
  • Extended Health Services
  • Up to 50% greater IOPS for all-flash with optimized checksum and dedupe
  • Optimized for latest flash technologies
  • Expanded caching tier choice
  • New Docker Volume Driver

Simple networking with Unicast:

As John Nicholson wrote on the Virtual Blocks blog…it’s time to say goodbye to the multicast requirements around vSAN networking traffic. For a history as to why multicast was used, click here. It’s also worth reading John’s post where he goes through the upgrade process, as if you are upgrading from previous versions, multicast will still be used unless you make the change as also specified here.

I can attest first hand to the added complexity when it comes to setting up vSAN with multicast and have gone through a couple of painful deployments where the multicast configuration was an issue during initial setup and also caused issues with switching infrastructure that needed to be upgraded before vSAN could work reliably. In my mind unicast offers a simpler, less complex solution with minimal overheads and makes vSAN more transportable across networks.
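
Once a cluster has been fully upgraded and the multicast dependency removed, you can sanity check the change from any host in the cluster; if this returns the other cluster members as unicast agents you are no longer relying on multicast:

    # List the unicast agents this host knows about - populated entries mean
    # the cluster is communicating over unicast rather than multicast
    esxcli vsan cluster unicastagent list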

Performance Improvements:

Service Providers are always trying to squeeze the most out of their hardware purchases, and with VMware claiming 50% greater IOPS for all-flash through optimized data services (including optimized checksum and dedupe) that in theory can enable 150K IOPS per host, along with support for the latest flash technologies, it appears they will be served well. The increased performance helps accelerate tenant workloads and provides higher consolidation ratios for those workloads.

Service providers can adopt new hardware technologies with the support of the latest flash devices, including solutions like the new breed of NVMe SSDs. These solutions can deliver up to 250% greater performance for write-intensive applications. vSAN 6.6 now offers larger caching drive options that include 1.6TB flash drives, so that service providers can take advantage of larger capacity flash drives.

Disk Performance Enhancements:

For those that have gone through a vSAN rebuild operation, you would know that it can be a long exercise depending on the amount of data and the configuration of the vSAN datastore. vSAN 6.6 introduces a new smart rebuild and rebalancing feature along with partial repairs of degraded or absent components. There is also resync throttling and improved visibility into the rebuild status through the Health Status. Cormac Hogan goes through the improvements in detail here.

From a Service Provider point of view, having these enhanced features around rebuilds is critical to continued quality of service for IaaS customers who live on shared vSAN storage. Shorter and more efficient rebuild times mean less impact to customers.

Health Checks and Monitoring Improvements:

vSAN Encryption:

VMware has introduced native encryption at the vSAN datastore level. This can be enabled per vSAN cluster and works with deduplication and compression across hybrid and all-flash cluster configurations. vSAN 6.6 data encryption is hardware agnostic; there is no requirement to use specialized and more expensive Self-Encrypting Drives (SEDs), which is also a bonus. Jase McCarty has another Virtual Blocks article here that goes through this feature in great detail.

From a Service Provider point of view you can now potentially offer two classes of vSAN-backed storage for IaaS customers: one that lives on an encryption-enabled cluster and is charged at a premium over non-encrypted clusters. In talking with service providers across the globe, data-at-rest encryption has become something that potential customers are asking for and most leading storage companies have an encryption story…now so does vSAN, and it appears to be market leading.

vSAN 6.6 Licensing:

In terms of the licensing matrix, nothing too drastic has changed except for the addition of Data-at-Rest Encryption in the Enterprise bundle. However, in a significant move for vCAN Service Providers, QoS IOPS Limiting has been extended across all license types and can now be taken advantage of across the board. This is good for Service Providers who look to offer different tiers of storage performance based on IOPS limits…previously it was only available under Enterprise licensing.

Bootstrapping UI:

A bonus feature that I think will assist vCAN Service Providers is the new native bootstrap installer in vSAN 6.6. William Lam has written about the feature here, but for those looking to install their first vSAN node without vSphere available, the ability to bootstrap is invaluable. The old manual process is still worth looking at as it’s always beneficial to know what’s going on in the background, but it’s all GUI based now via the VCSA installer.

Conclusion:

vSAN 6.6 appears to be a great step forward for VMware and Service Providers will no doubt be keen to upgrade as soon as possible to take advantage of the features and enhancements that have been delivered in this 6.6 release.

References:

http://cormachogan.com/2017/04/11/whats-new-vsan-6-6/ 

https://storagehub.vmware.com/#!/vmware-vsan/vmware-vsan-6-5-technical-overview

http://vsphere-land.com/news/an-overview-of-whats-new-in-vmware-vsan-6-6.html

https://storagehub.vmware.com/#!/vmware-vsan/vsan-multicast-removal/multicast-removal-steps-and-requirements/1

vSAN 6.6 Encryption Configuration

vSAN 6.6 – Native Data-at-Rest Encryption

Goodbye Multicast

Native VCSA bootstrap installer in vSAN 6.6

Worth a Repost: “VMware Doubles Down” vCloud Director 8.20

It seems that with the announcement last week that VMware was offloading vCloud Air to OVH, people were again asking what is happening with vCloud Director…and the vCloud Air Network in general. While vCD is still not available for VMware’s enterprise customers, the vCloud Director platform has officially never been in a stronger position.

Those outside the vCAN inner circles probably are not aware of this and I still personally field a lot of questions about vCD and where it sits in regards to VMware’s plans. Apparently the vCloud Team has again sought to clear the air about vCloud Director’s future and posted this fairly emotive blog post overnight.

I’ve reposted part of the article below:

Blogger Blast: VMware vCloud Director 8.20

We are pleased to confirm that vCloud Director continues to be owned and developed by VMware’s Cloud Provider Software Business Unit and is the strategic cloud management platform for vCloud Air Network service providers. VMware has been and continues to be committed to its investment and innovation in vCloud Director.

With the recent release of vCloud Director 8.20 in February 2017 VMware has doubled down on its dedication to enhancing the product, and, in addition, is working to expand its training program to keep pace with the evolving needs of its users. In December 2016 we launched the Instructor Led Training for vCloud Director 8.10 (information and registration link) and in June 2017 we are pleased to be able to offer an Instructor Led Training program for vCloud Director 8.20.

Exciting progress is also occurring with vCloud Director’s expanding partner ecosystem. We are working to provide ISVs with streamlined access and certification to vCloud Director to provide service providers with access to more pre-certified capabilities with the ongoing new releases of vCloud Director. By extending our ecosystem, service providers are able to more rapidly monetize services for their customers.

Again, these are exciting times for those who are running vCloud Director SP and those looking to implement vCD into their IaaS offerings. It should be an interesting year and I look forward to VMware building on this renewed momentum for vCloud Director. There are many people blogging about vCD again, which is awesome to see, and it gives everyone in the vCloud Air Network excellent content to draw from.

The vCloud Director Team also has a VMLive session that will provide a sneak peek at the vCloud Director.Next roadmap. So if you are not yet a VMware Partner Central member and work for a vCloud Air Network provider wanting to know where vCD is heading…sign up.

#LongLivevCD

vCloud Air Sold to OVH – Final Thoughts On Project Zephyr

I’ve just spent the last fifteen minutes looking back through all my posts on vCloud Air over the last four or five years, given yesterday’s announcement that VMware was selling what remains of vCloud Air to OVH. Going over the content, I thought it would be pertinent to write up one last piece on VMware’s attempt to build a public cloud that tried to compete against the might of AWS, Azure, Google and the other well-established hyper-scalers.

Project Zephyr:

Project Zephyr was first rumoured during 2012 and later launched as vCloud Hybrid Service, or vCHS…and while VMware pushed the cloud platform as a competitor to the hyper-scalers, the fact that it was built upon vCloud Director was probably one of its biggest downfalls. That might come as a shock to a lot of you reading this to hear me talk badly about vCD, however it wasn’t so much the fact that vCD was used as the backend; it was more what the consumer saw at the frontend that for me posed a significant problem for its initial uptake.

VMworld – Where is the Zephyr?

It was the perfect opportunity for VMware to deliver a completely new and modern UI for vCD and even though they did front the legacy vCD UI with a new frontend it wasn’t game changing enough to draw people in. It was utilitarian at best, but given that you only had to provision VMs it didn’t do enough to show that the service was cutting edge. Obviously the UI wasn’t the only reason why it failed to take off…using vCD meant that vCloud Air was limited by the fact that vCD wasn’t built for hyper-scale operations such as individual VM instance management or for platform as a service offerings. The lack of PaaS offerings in effect meant it was a glorified extension of existing vCloud Air Network provider clouds…which in fact was some of the key messaging VMware used in the early days.

The use of vCD did deliver benefits to the vCloud Air Network and in truth might have saved vCD from being put on the scrapheap before VMware renewed their commitment to develop the SP version which has resulted in a new UI being introduced for Advanced Networking in 8.20.

vCloud Air Struggles:

There was no hiding the fact that vCloud Air was struggling to gain traction worldwide, and even as other zones were opening around the world it seemed like VMware was always playing catchup with the hyper-scalers…but the reality of what the platform was meant that there was never a chance vCloud Air would grow to rival AWS, Azure and others.

By late 2015 there was a joint venture between EMC’s Virtustream and VMware vCloud Air that sought to join the best of both offerings under the Virtustream banner and form a new hybrid cloud services business, but the DELL/EMC merger got in the way of that deal and by December 2015 the idea had been squashed.

vCloud Air and Virtustream – Just kill vCloud Air Already?!?

vCloud Air and Virtustream – Ok…So This Might Not Happen!

It appeared from the outside that vCloud Air never recovered from that missed opportunity and through 2016 there were a number of announcements, starting in March when it was reported that vCloud Air Japan was to be sold to the company that had effectively been funding the zone, and shut down.

HOTP: vCloud Air Japan to be Shutdown!

Then in June VMware announced that credit card payments would no longer be accepted for any vCloud Air online transactions and that the service had to be bought with pre-purchased credits through partners. For me this was the final nail in the coffin in terms of vCloud Air being able to compete in the public cloud space.

vCloud Air – Pulling Back Credit Card Payments

From this point forward the messaging for the use case of vCloud Air had shifted to Disaster Recovery services via the Hybrid Cloud Manager and vSphere Replication services that were built to work directly from vSphere to vCloud Air endpoints.

vCloud Air Network:

Stepping back, just before VMworld 2014, VMware announced the rebranding of vCHS to what is now called vCloud Air and also launched the vCloud Air Network. I, like many others, was pretty happy at the time that VMware looked to reconnect with their service provider partners.

With the announcement around the full rebranding of vCHS to vCloud Air and the transformation of the VSPP and vCloud Powered programs into the vCloud Air Network, it would appear that VMware has in fact gone the other way and recommitted their support to all vCloud Service Providers and has even sought to make the partner relationship stronger. The premise being that together, there is a ready-made network (including vCloud Air) of providers around the world ready to take on the greater uptake of Hybrid Cloud that’s expected over the next couple of years.

So while vCloud Air existed, VMware acknowledged that more success was possible through supporting the vCloud Air Network ecosystem as the enabler of hybrid cloud services.

Final Final Thoughts:

That I’ve had a love-hate relationship with the idea of VMware having a public cloud is reflected in my posts over the years. In truth, those of us who formed part of the vCloud Air Network of VMware-based service providers were never really thrilled about the idea of VMware competing directly against their own partners.

vCHS vs. vCloud Providers: The Elephant in the Cloud

I would now say that many would be glad to see it handed over to OVH…because now VMware does not compete against its vCAN Service Providers directly, but can continue to hopefully focus on enabling them with the best tools to power their own cloud or provider platforms and help the network grow successfully, as the likes of OVH, iLand, Zettagrid and others have been able to do.

Pat Gelsinger’s statement in regard to the sale to OVH is very positive for the vCloud Air Network, and I believe the VMware hybrid cloud vision that was revealed at VMworld last year can now proceed without this lingering in the corner.

“We remain committed to delivering our broader cross-cloud architecture that extends our hybrid cloud strategy, enabling customers to run, manage, connect, and secure their applications across clouds and devices in a common operating environment”

The VMware vCloud blog here talks about what OVH will bring to the table for the customers that remain on vCloud Air. It’s extremely positive for those customers, and they can take advantage of the technical ability and execution of one of the vCloud Air Network’s leading service providers. Overall I think this is a great move by VMware and will hopefully lead to the vCloud Air Network becoming stronger…not weaker.
