Author Archives: Anthony Spiteri

Cloud to Cloud to Cloud Networking with Veeam Powered Network

I’ve written a couple of posts on how Veeam Powered Network can make accessing your homelab easy with its straightforward approach to creating and connecting site-to-site and point-to-site VPN connections. As a refresher on the use cases I’ve gone through, I needed access to my homelab/office machines while on the road, and to achieve this I went through two scenarios on how you can deploy and configure Veeam PN.

In this blog post I’m going to run through a very real world solution with Veeam PN where it is used to easily connect geographically disparate cloud hosting zones. One of the most common questions I used to receive from sales and customers in my previous roles with service providers was how to easily connect two sites so that some form of application high availability could be achieved, or even just to allow access to applications or services across sites.

Taking that further…how is this achieved in the most cost effective and operationally efficient way? There are obviously solutions available today that achieve connectivity between multiple sites, whether that be via some sort of MPLS, IPSec, L2VPN or stretched network solution. What Veeam PN achieves is a simple to configure, cost effective (remember it’s free) way to connect one to one or one to many cloud zones with little to no overhead.

Cloud to Cloud to Cloud Veeam PN Appliance Deployment Model

In this scenario I want each vCloud Director zone to have access to the other zones and be always connected. I also want to be able to connect in via the OpenVPN endpoint client and have access to all zones remotely. All zones will be routed through the Veeam PN Hub Server deployed into Azure via the Azure Marketplace. To go over the Veeam PN deployment process read my first post and also visit this VeeamKB that describes where to get the OVA and how to deploy and configure the appliance for first use.

Components

  • Veeam PN Hub Appliance x 1 (Azure)
  • Veeam PN Site Gateway x 3 (One Per Zettagrid vCD Zone)
  • OpenVPN Client (For remote connectivity)

Networking Overview and Requirements

  • Veeam PN Hub Appliance – Incoming Ports TCP/UDP 1194, 6179 and TCP 443
    • Azure VNET 10.0.0.0/16
    • Azure Veeam PN Endpoint IP and DNS Record
  • Veeam PN Site Gateways – Outgoing access to at least TCP/UDP 1194
    • Perth vCD Zone 192.168.60.0/24
    • Sydney vCD Zone 192.168.70.0/24
    • Melbourne vCD Zone 192.168.80.0/24
  • OpenVPN Client – Outgoing access to at least TCP/UDP 6179
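
Because every zone is routed through the one Hub, the address ranges above need to be distinct from each other. A minimal sketch, assuming nothing more than Python’s standard `ipaddress` module, that checks the four ranges listed in this post for overlaps. The `find_overlaps` helper is my own illustrative name and is not part of Veeam PN.

```python
import ipaddress
from itertools import combinations

# Address ranges taken from the list above.
NETWORKS = {
    "Azure VNET": "10.0.0.0/16",
    "Perth vCD Zone": "192.168.60.0/24",
    "Sydney vCD Zone": "192.168.70.0/24",
    "Melbourne vCD Zone": "192.168.80.0/24",
}


def find_overlaps(networks):
    """Return a list of (name_a, name_b) pairs whose ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    clashes = find_overlaps(NETWORKS)
    if clashes:
        for a, b in clashes:
            print(f"Overlap: {a} and {b}")
    else:
        print("No overlapping ranges")
```

If you add more zones later, adding them to the dictionary and re-running the check is a quick way to catch a clash before registering the site.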

In my setup the Veeam PN Hub Appliance has been deployed into Azure mainly because that’s where I was able to test out the product initially, but also because in theory it provides a centralised, highly available location for all the site-to-site connections to terminate into. This central Hub can be deployed anywhere, and as long as it has HTTPS connectivity configured correctly you can access the web interface and start to configure your site and standalone clients.

Configuring Site Clients for Cloud Zones (site-to-site):

To configure the Veeam PN Site Gateways you need to register the sites from the Veeam PN Hub Appliance. When you register a client, Veeam PN generates a configuration file that contains VPN connection settings for the client. You must use the configuration file (downloadable as an XML) to set up the Site Gateways. Referencing the diagram at the beginning of the post, I needed to register three separate client configurations as shown below.

Once this has been completed you need to deploy a Veeam PN Site Gateway in each vCloud Hosting Zone. Because we are dealing with an OVA, the OVFTool will need to be used to upload the Veeam PN Site Gateway appliances. I’ve previously created and blogged about an OVFTool upload script using PowerShell which can be viewed here. Each Site Gateway needs to be deployed and attached to the vCloud vORG Network that you want to extend…in my case it’s the 192.168.60.0, 192.168.70.0 and 192.168.80.0 vORG Networks.

Once each vCloud zone has the Site Gateway deployed and the corresponding XML configuration file added, you should see all sites connected in the Veeam PN Dashboard.

At this stage we have connected each vCloud Zone to the central Hub Appliance which is configured now to route to each subnet. If I was to connect up an OpenVPN Client to the HUB Appliance I could access all subnets and be able to connect to systems or services in each location. Shown below is the Tunnelblick OpenVPN Client connected to the HUB Appliance showing the injected routes into the network settings.

You can see above that the 192.168.60.0, 192.168.70.0 and 192.168.80.0 static routes have been added and set to use the tunnel interface’s default gateway, which is on the central Hub Appliance.

Adding Static Routes to Cloud Zones (Cloud to Cloud to Cloud):

To complete the setup and have each vCloud zone talking to each other we need to configure static routes on each zone network gateway/router so that traffic destined for the other subnets knows to be routed through to the Site Gateway IP, through to the central Hub Appliance onto the destination and then back. To achieve this you just need to add static routes to the router. In my example I have added the static route to the vCloud Edge Gateway through the vCD Portal as shown below in the Melbourne Zone.
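
To make it clearer which routes go where, here is a minimal sketch that lists the static routes each zone’s router would need. The subnets are the ones from this post, but the Site Gateway addresses (the `.10` hosts) are hypothetical placeholders, as is the `static_routes_for` helper name. Swap in the IPs of your own appliances.

```python
import ipaddress

# Zone subnets are from this post. The Site Gateway addresses are
# hypothetical placeholders; substitute the IPs of your own appliances.
ZONES = {
    "Perth": {"subnet": "192.168.60.0/24", "site_gateway": "192.168.60.10"},
    "Sydney": {"subnet": "192.168.70.0/24", "site_gateway": "192.168.70.10"},
    "Melbourne": {"subnet": "192.168.80.0/24", "site_gateway": "192.168.80.10"},
}


def static_routes_for(zone, zones):
    """List (destination, next_hop) routes to add on one zone's router."""
    local = zones[zone]
    next_hop = ipaddress.ip_address(local["site_gateway"])
    # The next hop has to live on the zone's own network.
    if next_hop not in ipaddress.ip_network(local["subnet"]):
        raise ValueError(f"{next_hop} is not inside {local['subnet']}")
    # Every other zone's subnet is reached via the local Site Gateway.
    return [
        (other["subnet"], str(next_hop))
        for name, other in zones.items()
        if name != zone
    ]


if __name__ == "__main__":
    for destination, next_hop in static_routes_for("Melbourne", ZONES):
        print(f"{destination} via {next_hop}")
```

For the Melbourne zone this prints the Perth and Sydney subnets, both pointing at the Melbourne Site Gateway, which matches the static routes I added on the vCloud Edge Gateway.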

Conclusion:

Summarizing the steps that were taken to set up and configure a cloud to cloud to cloud network using Veeam PN, with its site-to-site connectivity feature allowing cross site connectivity while the point-to-site VPN allows access to systems and services:

  • Deploy and configure Veeam PN Hub Appliance
  • Register Cloud Sites
  • Register Endpoints
  • Deploy and configure Veeam PN Site Gateway in each vCloud Zone
  • Configure static routes in each vCloud Zone

Those five steps took me less than 30 minutes, including the OVA deployments. At the end of the day I’ve connected three disparate cloud zones at Zettagrid which all access each other through a Veeam PN Hub Appliance deployed in Azure. From here there is nothing stopping me from adding more cloud zones that could be situated in AWS, IBM, Google or any other public cloud. I could even connect my home office or a remote site to the central Hub to give full coverage.

The key here is that Veeam Powered Network offers a simple solution to what is traditionally a complex and costly problem. Again, this will not suit all use cases but at its most basic functional level, it would have been the answer to the cross cloud connectivity questions I used to get that I mentioned at the start of the article.

Go give it a try!

NestedESXi – Network Performance Improvements with Learnswitch

I’ve been running my NestedESXi homelab for about eight months now but in all that time I had not installed or enabled the ESXi MAC Learning dvFilter. As a quick refresher the VMware Fling addresses the issues with nested ESXi hosts and the impact that promiscuous mode has when enabled on virtual switches. In a nutshell, network traffic will hit all the network interfaces attached to the portgroup which reduces network throughput and also increases latency and impacts CPU.

The ESXi MAC Learn dvFilter Fling was released about two years ago and it’s a must have for those running homelabs or work labs with nested ESXi. However, earlier this year a new Fling was released that improves on the dvFilter and addresses some of its limitations. The new native MAC Learning VMkernel module is called Learnswitch.

ESXi Learnswitch is a complete implementation of MAC Learning and Filtering and is designed as a wrapper around the host virtual switch. It supports learning multiple source MAC addresses on virtual network interface cards (vNIC) and filters packets from egressing the wrong port based on destination MAC lookup. This substantially improves overall network throughput and system performance for nested ESX and container use cases.
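
To illustrate the general idea of MAC learning and filtering, here is a toy model in Python. It is only a sketch of the concept and is not how the Learnswitch VMkernel module is implemented; the class name, port names and MAC addresses are all made up for the example.

```python
class LearningSwitch:
    """Toy model of a MAC learning switch."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the set of egress ports."""
        # Several MACs can map to the same port, as with a nested ESXi vNIC.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination sits behind the ingress port; nothing to send.
            return set()
        return {out_port}


if __name__ == "__main__":
    switch = LearningSwitch(["vnic1", "vnic2", "vnic3"])
    print(switch.forward("vnic1", "aa:aa", "bb:bb"))  # unknown, so flooded
    print(switch.forward("vnic2", "bb:bb", "aa:aa"))  # reply, aa:aa is known
    print(switch.forward("vnic1", "aa:aa", "bb:bb"))  # now only one port
```

The first frame is flooded, which is effectively what promiscuous mode does for every frame. Once both addresses have been learnt, traffic only leaves the one port that needs it.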

For a more in depth look at its functionality head over to William Lam’s blog post here.

dvFilter vs Learnswitch:

I was interested to see if the new Learnswitch offered any significant performance improvements over the dvFilter in addition to its main benefits. I went about installing and enabling the dvFilter in my lab and ran some basic performance tests using Crystal Disk Mark. Before that, I ran the performance test without either installed as a base.

Firstly to see what the network traffic looks like hitting the nested hosts you can see from the ESXTOP output below that each host is dealing with about the same amount of received packets. Overall throughput is reduced when this happens.

In terms of performance the Crystal Disk Mark test run on a nested VM (right) showed reduced performance across all tests when compared to one run on the parent host (left) directly.

There was also elevated datastore latency and significant CPU usage due to the overheads with the increased traffic hitting all interfaces.

The CPU usage alone shows the value in having the dvFilter or Learnswitch installed when running nested ESXi hosts.

With the baseline testing done I installed and enabled the dvFilter and then ran the same tests. For a detailed look at how to install the dvFilter (just in case you don’t fit the requirements for using the Learnswitch module) check out my initial post on the dvFilter here. Having gone through that I went about uninstalling the dvFilter and installing and configuring the Learnswitch.

Like the dvFilter you need to download and install an ESXi software bundle, but unlike the dvFilter you need to reboot the host to enable the Learnswitch module.

As per the instructions on William Lam’s post or the Fling page you then need to configure and run a Python script to enable the Learnswitch against the NestedESXi portgroups that have promiscuous mode enabled.

From there the impact of the module is immediate and you can see a normalization of network traffic hitting the interfaces of each NestedESXi host. When running the performance test the ESXTOP output is significantly different to what you see if the module is not loaded as shown below.

You also have access to a new command that lists out stats of the Learnswitch, showing packet and port statistics as well as the current MAC address table.

In terms of what it looks like from a performance point of view, below are the results of all Crystal Disk Mark tests. The bottom two represent the dvFilter (left) and the Learnswitch (right).

And finally, to have a look at the improvement in CPU performance with the modules installed, you can see below a timeline showing the performance tests run at different times across the last 24 hours. Again there is a significant improvement: the graphs on the left hand side were captured during the testing without any module, followed by the dvFilter test, with the Learnswitch test on the right hand side. It does seem like the Learnswitch is a little better on CPU, but I can’t be 100% sure with my limited testing.

Conclusion:

As expected there isn’t a huge difference in performance between the two modules, but the features of the Learnswitch certainly make it the new preferred choice if the requirements are met. Again, the main advantages of the Learnswitch over the dvFilter make it a must have addition to any NestedESXi environment. If you haven’t installed either yet…get onto it!

Veeam Vault #7: Nutanix Support?!, Backup for Office365 1.5 BETA, VeeamON Forums plus Vanguard Roundup

It’s been just over two months since my last Veeam Vault went out and, can you believe, that was just before VeeamON 2017 in New Orleans. Again, for a recap of what was announced at VeeamON check out my wrap up post here…two months on and we haven’t stopped here at Veeam. As soon as VeeamON was done and dusted, focus turned to EMEA SE training in Warsaw which my whole team attended and where the group got an extended look at the new features coming in v10. Since then, I’ve had a good stretch at home where I’ve been preparing for a series of webinars but mainly focused on the upcoming VeeamON Forums happening around the APAC region.

I’ll be presenting sessions at all events and be on stage with Clint Wyckoff for the Sydney and Auckland keynotes where our co-CEO, Peter McKay and VP of Global Cloud Group, Paul Mattes will be headlining. There are other events happening in Asia, so please register here, and if you are able to attend in any of those cities it would be great to get you down to learn about all that’s happening with Veeam as we move into the second half of the year and into next year.

Nutanix AHV Announcement:

At Nutanix’s .NEXT conference we announced the intent to support Acropolis Hypervisor (AHV) by year’s end and also became the Premier Availability solution for supported Nutanix virtualized environments. I’ll be honest and say that this took a lot of us by surprise…and probably most Nutanix employees as well. However it shows our commitment to providing availability for the modern enterprise…which Nutanix is also pushing hard into.

Backup for Office365 1.5 BETA:

Last week we released the first beta for Backup for Office365 1.5 which is a significant release for our VCSP community as it now introduces multi-tenancy and also an advanced API feature for automation. If you are a VCSP, take some time to download the beta and put the new features to work…there is a significant opportunity to offer backup services for Office365 which now scale.

Version 1.5 Enhancements:

  • A multi-repository, multi-tenant architecture enabling protection of larger Office 365 deployments with a single installation. Also empowering service providers to deliver Office 365 backup services.
  • Automation possibilities via RESTful API and PowerShell SDK to minimize management overhead, improve recovery times and reduce costs

https://go.veeam.com/beta-backup-office-365

Update 1 for Veeam Agent for Linux 1.0:

Last month we released Update 1 for Veeam Agent for Linux, so the next time you update the software from your Linux update repositories you will get the update. While this is for the most part a bug fix release, we still included file indexing for 1-Click file recovery through Veeam Enterprise Manager, the ability to add storage and network drivers to the recovery media from the Linux OS and the addition of an ssh server to the recovery media. There is also support added for ExaGrid and general wizard improvements.

https://www.veeam.com/kb2290

Veeam Vanguard Blog Post Roundup:

Quick Fix – Unable to Upgrade Distributed Switch After vCenter Upgrade

This week I upgraded (and migrated) my SliemaLabs NestedESXi vCenter from a Windows 6.0 server to a 6.5 VCSA …everything went well, but ran into an issue when I went to upgrade my distributed switch to 6.5.0. Even though everything appeared to be working with regards to the host and VM networking associated with the switch, when I went to upgrade it I got the following error:

Doing a quick Google for Unable to retrieve data about the distributed switch came up with nothing, and clicking on next didn’t do anything actionable. A restart of the Web Client and a reboot of the VCSA didn’t resolve the issue either. The distributed switch in question was still on version 5.5 as I forgot to upgrade it to 6.0 during the upgrade to vCenter 6.0. Whether that condition somehow caused the error I am not sure…regardless, the quick fix, or better said the workaround, is pretty simple: use PowerCLI.

Interestingly the Vendor is different…though I’m not sure this caused the issue. In any case the workaround is to upgrade the distributed switch using the Set-VDSwitch command.

And success!

I’m not sure what caused the error to appear in the Web Client but the workaround meant that it became a moot point. Suffice to say if you come across this error in your Web Client when trying to upgrade a distributed switch…head over to PowerCLI.

 

migrate2vcsa – Migrating vCenter 6.0 to 6.5 VCSA

Over the past few years I’ve written a couple of articles on upgrading vCenter from 5.5 to 6.0. Firstly an in place upgrade of the 5.5 VCSA to 6.0 and then more recently an in place upgrade of a Windows 5.5 vCenter to 6.0. This week I upgraded and migrated my NestedESXi SliemaLab vCenter using the migrate2vcsa tool that’s now bundled into the vCenter 6.5 ISO. The process worked first time, even though I held some doubts about the migration working without issue, and my Windows vCenter is now in retirement.

The migration tool that’s part of vSphere 6.5 was actually first released as a VMware Fling after it was put forward as an idea in 2013. It then officially went GA with the release of vSphere 6.0 Update 2m…where m stood for migration. Over its development it has been championed by William Lam, who has written a number of articles on his blog, and more recently Emad Younis has been the technical marketing lead on the product as it was enhanced for vSphere 6.5.

Upgrade Options:

You basically have two options to upgrade a Windows based 6.0 vCenter:

My approach for this particular environment was to ensure a smooth upgrade to vSphere 6.0 Update 2 and then look to upgrade again to 6.5 once it thaws out in the market. The cautious approach will still be undertaken by many, and a stepped upgrade to 6.5 and migration to the VCSA will still be commonplace. For those that wish to move away from their Windows vCenter, there is now a very reliable #migrate2vcsa path…as a side note it is possible to migrate directly from 5.5 to 6.5.

Existing Component Versions:

  • vCenter 6.0 (4541947)
    • NSX Registered
    • vCloud Director Registered
    • vCO Registered
  • ESXi 6.0 (3620759)
  • Windows 2008 (RTM)
  • SQL Server 2008 R2 (10.50.6000.34)

All vCenter components were installed on the Windows vCenter instance including Update Manager. There were also a number of external services registered against the vCenter, of which the NSX Manager needed to be re-registered for the SSO to allow/trust the new SSL certificate thumbprint. This is common, and one to look out for after migration.

Migration Process:

I’m not going to go through the whole process as it’s been blogged about a number of times, but in a nutshell you need to:

  • Take a backup of your existing Windows vCenter
  • I took a snapshot as well before I began the process
  • Download the vCenter Server Appliance 6.5 ISO and mount the ISO
  • Copy the migration-assistant folder to the Windows vCenter
  • Start the migration-assistant tool and work through the pre-checks

If all checks complete successfully the migration assistant will finish at waiting for migration to start. From here you start the VCSA 6.5 installer and click on the Migrate menu option.

Work through the wizard which asks you for detail on the source and target servers, lets you select the compute, storage and appliance size as well as the networking settings. Once everything is entered we are ready to start Stage 1 of the process.

When Stage 1 finishes you are taken to Stage 2 where it asks you to select the migration data as shown below. This will give you some idea as to how much storage you will need and what the initial footprint will be over and above the actual VCSA VM storage.

There are a couple more steps the migration assistant goes through to complete the process…which for me took about 45 minutes, but this will vary depending on the amount of data you want to transfer across.

If there are any issues or if the migration fails at any of the steps you do have the option to power down/remove the new VCSA and power back on the old Windows vCenter as is. The old Windows vCenter would have been shut down by the migration process just as the copying of the key data finished and the VCSA was rebooted with network settings and machine name copied across. There is a proper series of rollback steps listed in this VMwareKB.

The only external service that I needed to re-register against vCenter was NSX. vCloud Director carried on without issue, but it’s worth checking out all registered services just in case.

Conclusion and Thoughts:

As mentioned at the start, I was a bit skeptical that this process would work as flawlessly as it did…and on its first time! It’s almost a little disappointing to have this as automated and hands off as it is, but it’s a testament to the engineering effort the team at VMware has put into this tool to make it a very viable and reliable way to remove dependencies on Windows and MSSQL. It also allows those with older versions of Windows that are well past their use by date to migrate to the VCSA with absolute confidence.

References:

http://www.virtuallyghetto.com/page/2?s=migrate2vcsa

https://github.com/younise/migrate2vcsa-resources

Connecting to Home or Office Networks with Veeam Powered Network

A few weeks ago I wrote an article on how Veeam Powered Network can make accessing your homelab easy with its straightforward approach to creating and connecting site-to-site and point-to-site VPN connections. Since then I’ve done a couple of webinars on Veeam PN and I was asked a number of times if Veeam PN can be set up without the use of a central hub appliance.

To refresh the use case that I went through in my first post, I wanted to access my homelab/office machines while on the road.

With the use of the Tunnelblick OpenVPN Client on my MBP I am able to create a point-to-site connection to the Veeam PN HUB which is in turn connected via site-to-site to each of the subnets I want to connect into.

Single Veeam PN Appliance Deployment Model

After fielding a couple of similar questions during the webinars it became apparent that the first use case I described was probably more complicated than it needed to be for the average home office user, who just wants to create a simple point-to-site VPN that allows remote access into the network. This use case can also be used to give remote users access to a simple (flat) company network.

In this scenario I want to have access via the OpenVPN endpoint client to my internal network of 192.168.1.0/24 via a single Veeam PN appliance that’s been deployed in my home office network. To go over the Veeam PN deployment process read my first post and also visit this VeeamKB that describes where to get the OVA and how to deploy and configure the appliance for first use.

Components

  • Veeam PN Hub Appliance x 1
  • OpenVPN Client

Networking Requirements

  • Veeam PN Hub Appliance – Incoming Ports UDP 1194, 6179 and TCP 443
  • OpenVPN Client – Outgoing access to at least UDP 6179
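
Before handing out client configurations it can be handy to confirm the web interface port is reachable from outside. A minimal sketch, assuming only Python’s standard `socket` module; the `tcp_port_open` helper is my own illustrative name. Note that this only covers the TCP 443 requirement, as a simple connect test like this cannot tell you whether the UDP ports are open.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `tcp_port_open("your-hub-address", 443)` from a machine outside the network, with your own Hub address in place of the placeholder, should return `True` once the port forwarding or firewall rule is in place.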

In my setup the Veeam PN Hub Appliance has been deployed into VMware Workstation and has picked up a DHCP address. Unlike the Azure Marketplace deployment you need to go through an initial configuration wizard to set up the Hub appliance so that it is ready to accept connections. Go to the Veeam PN URL, enter the default username and password and click through to the Initial Configuration wizard.

Next step is to configure the SSL certificate that is used for a number of services, but importantly is used to facilitate authentication between the Hub, site and endpoints.

Next step is to configure the Site-to-site and the Point-to-site VPN settings which will be used in the OVPN configuration files that are generated later on.

Once that’s done you are sent to the Veeam PN home dashboard page. In order to have the 192.168.1.0/24 network accessible remotely you need to configure it as a site, as shown below from the Clients menu. This is a bit of a workaround to ensure that the correct static routes are included in the endpoint OVPN configuration files but note that the site will never become connected in the client status window.

To be able to connect into my home office when on the road the final step is to register a standalone client. Again, because Veeam PN is leveraging OpenVPN, what we are producing here is an OVPN configuration file that has all the details required to create the point-to-site connection…noting that there isn’t any requirement to enter a username and password as Veeam PN is authenticating using SSL authentication. As a recap from my previous post, for my MBP I’m using the Tunnelblick OpenVPN Client, which I’ve found to be an excellent client, but obviously being OpenVPN there are a bunch of other clients for pretty much any platform you might be running. Once I’ve imported the OVPN configuration file into the client I am able to authenticate against the Hub Appliance endpoint and the home office routing is injected into the network settings.

You can see above that the 192.168.1.0 static route has been added and set to use the tunnel interface’s default gateway, which is on the Hub Appliance running in my home office. This means that from my MBP I can now get to any device on that subnet no matter where I am in the world…in this case I can RDP to my Windows workstation and access other resources on 192.168.1.0/24.
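
To show why only home office traffic ends up in the tunnel, here is a small sketch of the longest prefix match a routing table performs. The 192.168.1.0/24 route is the one from this post; the default route and both interface labels are hypothetical and only there for illustration.

```python
import ipaddress

# The /24 is the route from this post. The default route and the
# interface names are hypothetical labels used only for illustration.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "local-internet"),
    (ipaddress.ip_network("192.168.1.0/24"), "veeam-pn-tunnel"),
]


def pick_route(destination, routes=ROUTES):
    """Return the interface of the most specific route matching destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in routes if address in net]
    # Longest prefix wins, as in a normal IP routing table. The default
    # route in ROUTES guarantees there is always at least one match.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    print(pick_route("192.168.1.50"))  # home office host, goes via the tunnel
    print(pick_route("8.8.8.8"))       # everything else stays local
```

Anything in 192.168.1.0/24 matches the more specific injected route, while all other traffic keeps using the normal default gateway of whatever network I happen to be on.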

Conclusion:

Summarizing the steps that were taken to set up and configure remote access into my home office using Veeam PN:

  • Deploy and configure Veeam PN Hub Appliance
  • Go through initial Hub Network Wizard
  • Register local network as a Site
  • Register Endpoints
  • Setup Endpoint and connect to Hub Appliance

Those five steps took me less than 10 minutes, including the OVA deployment. The simplicity of the solution is what makes it very useful for home users wanting a quick and easy way to access their systems…but also, as mentioned, for configuring external access to simple office networks!

Again, Veeam PN is free and is deployable from the Azure Marketplace to help extend availability for Microsoft Azure…or downloadable in OVA format directly from the veeam.com site.

 

VMworld 2017 – Session Breakdown and Analysis

Everything to do with VMworld this year feels like it’s earlier than in previous years. The call for papers opened in February with session voting happening around the end of March. A couple of weeks ago presenters were notified if their session was accepted…or if it was rejected, and the content catalog for the US event went live last week! At the moment there are 736 sessions listed, which will grow when the #vBrownBag Tech Talks hosted by the VMTN Community get added.

As I do every year I like to filter through the content catalog and work out what technologies are getting the airplay at the event. What first struck me as being interesting was the track names:

Do you see a common thread? They obviously centre around the “digital transformation” theme that we have been fed at every major conference for the last four to five years. I don’t mind it so much, but I know it’s becoming a bit of an industry joke when we hear the same messaging around transformation, digital workspace and modernization.

Shown above are all the products and topics listed in the content catalog and previously when the public voting took place I did some analysis around the number of sessions relating to the filters shown below.

  • vCD 32
  • vCloud 305
  • vCloud Director 64
  • NSX 426
  • NSX-T 116
  • vSAN 223
  • AWS 51
  • Containers 85
  • Devops 69
  • Automation 223

Using those same filters, below are the numbers from what made the cut and are in the content catalog for 2017.

What’s interesting when comparing the submitted sessions with what was picked up for the content catalog is that, if you want a better than even chance of having your session accepted, you should submit around NSX, NSX-T, vSAN, AWS and Containers. In the case of vSAN and Containers, working with these numbers about 60% of the submitted sessions got approved, and in the case of AWS the number of sessions approved was more than what was submitted!

Even though a number of vCD related sessions didn’t make it through, the numbers are still well up from the dark days of vCD around the 2013 and 2014 VMworlds. For anyone working on cloud technologies this promises to be a bumper year for content, so if you haven’t registered for VMworld 2017 yet…what are you waiting for!

Register here:

Top vBlog 2017 – Last week to Vote!

While I had resisted the temptation to put out a blog on this year’s Top vBlog voting, I thought with the voting coming to an end it was worth giving it a shout just in case there are some of you who haven’t had the chance to vote or didn’t know about the Top vBlog vLaunchPad list created and maintained by Eric Siebert of vSphere-Land.

This year’s voting has a slightly different format with the total vote being determined by the following:

  • 60% – public voting – general voting – anyone can vote – votes are tallied and weighted for points based on voting rankings as done in past years
  • 20% – private judges scoring – chosen judges who will grade a select group of blogs based on several factors, combined rankings will equal points
  • 10% – number of posts in a year – how much effort a blogger has put into writing posts over the course of a year based on Andreas’ hard work adding this up each year (aggregators excluded)
  • 10% – Google PageSpeed score – how well a blogger has done to build and optimize their site as scored by Google’s PageSpeed tools
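
To make the weighting concrete, here is a small worked example. It assumes each component has already been converted to a score out of 100, which is my own simplification; the list above only says that rankings are turned into points, so the 0 to 100 scale, the sample scores and the `weighted_total` helper are all hypothetical.

```python
# Weights in percent, taken from the list above.
WEIGHTS = {
    "public_voting": 60,
    "judges_scoring": 20,
    "number_of_posts": 10,
    "pagespeed": 10,
}


def weighted_total(scores, weights=WEIGHTS):
    """Combine per-component scores (assumed 0-100) into one weighted total."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted components")
    # Integer percent weights keep the arithmetic exact until the final divide.
    return sum(weights[name] * scores[name] for name in weights) / 100


if __name__ == "__main__":
    # Hypothetical blog: strong public vote, weaker post count.
    example = {
        "public_voting": 80,
        "judges_scoring": 70,
        "number_of_posts": 50,
        "pagespeed": 90,
    }
    print(weighted_total(example))
```

With those made-up scores the total works out to 48 + 14 + 5 + 9 = 76, which shows how heavily the public vote still dominates the result.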

As Eric mentions, the vBlog voting should be based on blog content, taking into account the longevity, length, frequency and quality of the posts. There is an amazing amount of great content that gets created daily by this community and all things aside, this Top vBlog vote goes some way to recognizing the hard work most bloggers put into the creation of content for the community. Special mention to Duncan Epping and Frank Denneman for pulling out of the voting this year to give others a shot at moving up the ranks…it’s a classy move!

Good luck to all those who are listed and for those who haven’t voted yet click on the link below to cast your vote. If you feel inclined and enjoy my content around vCloud Director, Availability, NSX, vSAN and Cloud and Hosting in general…It would be an honor to have you consider anthonyspiteri.net in your Top 12 and also in the Independent Blogger category.

http://topvblog2017.questionpro.com

Thanks again to Eric Siebert.

References:

http://vsphere-land.com/news/voting-now-open-for-top-vblog-2017.html

http://vsphere-land.com/news/coming-soon-top-vblog-2017-with-a-new-scoring-method.html

VMware vSphere 6.5 Host Resources Deep Dive – A Must Have!

Just after I joined Zettagrid in June of 2013 I decided to load up vSphere 5.1 Clustering Deepdive by Duncan Epping and Frank Denneman on my iPad to read on my train journey to and from work. Reading that book allowed me to gain a deeper understanding of vSphere through the in depth content that Duncan and Frank had produced. Any VMware administrator worth their salt would be familiar with the book (or the ones that preceded it) and it’s still a brilliant read.

Fast forward a few versions of vSphere and we finally have a follow up:

VMware vSphere 6.5 Host Resources Deep Dive

This time around Frank has been joined by Niels Hagoort and together they have produced another must have virtualization book…though it goes far beyond VMware virtualization. I was lucky enough to review a couple of chapters of the book and I can say without question that this book will make your brain hurt…but in a good way. It’s the deepest of deep dives: it goes beyond the best practices of the previous books and dives into a lot of the low level compute, storage and networking fundamentals that a lot of us have either forgotten about, never learnt or never bothered to learn about.

This book explains the concepts and mechanisms behind the physical resource components and the VMkernel resource schedulers, which enables you to:

  • Optimize your workload for current and future Non-Uniform Memory Access (NUMA) systems.
  • Discover how vSphere Balanced Power Management takes advantage of the CPU Turbo Boost functionality, and why High Performance does not.
  • How the 3-DIMMs per Channel configuration results in a 10-20% performance drop.
  • How TLB works and why it is bad to disable large pages in virtualized environments.
  • Why 3D XPoint is perfect for the vSAN caching tier.
  • What queues are and where they live inside the end-to-end storage data paths.
  • Tune VMkernel components to optimize performance for VXLAN network traffic and NFV environments.
  • Why Intel’s Data Plane Development Kit significantly boosts packet processing performance.

If any of you have read Frank’s NUMA Deep Dive blog series you will start to get an appreciation of the level of technical detail this book covers. It is, however, written in a way that makes the information digestible, though some parts may need to be read twice over. Well done to Frank and Niels on getting this book out and again, if you are working in and around anything to do with computers this is a must-read, so do yourself a favour and grab a copy.

The Amazon locales where the book can currently be purchased are listed below:

Amazon US: http://www.amazon.com/dp/1540873064
Amazon France: https://www.amazon.fr/dp/1540873064
Amazon Germany: https://www.amazon.de/dp/1540873064
Amazon India: http://www.amazon.in/dp/1540873064
Amazon Japan: https://www.amazon.co.jp/dp/1540873064
Amazon Mexico: https://www.amazon.com.mx/dp/1540873064
Amazon Spain: https://www.amazon.es/dp/1540873064
Amazon UK: https://www.amazon.co.uk/dp/1540873064

CPU Overallocation and Poor Network Performance in vCD – Beware of Resource Pools

For the longest time all VMware administrators have been told that resource pools are not folders and that they should only be used under circumstances where the impact of applying the resource settings is fully understood. From my point of view, I’ve been able to utilize resource pools for VM management without too much hassle since I first started working on VMware Managed Service platforms. From a managed services point of view they are a lot easier to use as organizational “folders” than vSphere folders themselves. For me, as long as the CPU and Memory Resources Unlimited checkbox was ticked, nothing bad happened.

Working with vCloud Director however, resource pools are heavily utilized as the control mechanism for resource allocation, sharing and management. It’s still a topic that can cause confusion when trying to wrap one’s head around the different allocation models vCD offers. I still reference blog posts from Duncan Epping and Frank Denneman written nearly seven years ago to refresh my memory every now and then.

Before moving on to an example of how overallocation or client undersizing in vCloud Director can cause serious performance issues, it’s worth having a read of this post by Frank that goes through, in typical Frank detail, what resource management looks like in vCloud Director.

Proper Resource management is very complicated in a Virtual Infrastructure or vCloud environment. Each allocation models uses a different combination of resource allocation settings on both Resource Pool and Virtual Machine level

Undersized vDCs Causing Network Throughput Issue:

The Allocation Pool model was the one that I worked with the most and it used to throw up a few client-related issues when I worked at Zettagrid. When using the Allocation Pool model, which is the default, you specify the amount of resources for your Org vDC and also how much of those resources is guaranteed. The guarantee means that a reservation will be set and that the amount of guaranteed resources is taken from the Provider vDC. The total amount of resources specified is the upper boundary, which is also the resource pool limit.
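To make the relationship between the allocation, the guarantee and the resulting resource pool settings concrete, here is a minimal Python sketch. It is purely my own illustration of the arithmetic described above and not a vCloud Director API: the names `ResourcePoolCpu` and `allocation_pool_cpu` are hypothetical, and the 10,000 MHz / 50% figures in the usage lines are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePoolCpu:
    """CPU settings that end up on the Org vDC resource pool (illustrative)."""
    limit_mhz: float        # upper boundary: the total amount allocated
    reservation_mhz: float  # guaranteed portion, taken from the Provider vDC


def allocation_pool_cpu(allocated_mhz: float, guarantee_pct: float) -> ResourcePoolCpu:
    """Derive limit and reservation from an allocation and a guarantee percentage.

    Hypothetical helper: it only mirrors the rule described in the text
    (limit = allocation, reservation = allocation * guarantee).
    """
    if allocated_mhz < 0:
        raise ValueError("allocated_mhz must not be negative")
    if not 0 <= guarantee_pct <= 100:
        raise ValueError("guarantee_pct must be between 0 and 100")
    return ResourcePoolCpu(
        limit_mhz=allocated_mhz,
        reservation_mhz=allocated_mhz * guarantee_pct / 100,
    )


# Example figures (assumed): a 10,000 MHz vDC with a 50% guarantee.
pool = allocation_pool_cpu(10_000, 50)
print(pool.limit_mhz, pool.reservation_mhz)
```

The point to take away is that the limit is always the full allocation regardless of the guarantee, so a 0% guarantee still leaves a hard ceiling on what the VMs in the vDC can consume.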

Because tenants were able to purchase Virtual Datacenters of any size, there were a number of occasions where tenants undersized their resources. Specifically, one tenant came to us complaining about poor network performance during a copy operation between VMs in their vDC. At first the operations team thought that it was the network causing issues…we were also running NSX and these VMs were on a VXLAN segment, so fingers were being pointed there as well.

Eventually, after a bit of troubleshooting, we were able to replicate the problem…it was related to the resources that the tenant had purchased, or lack thereof. In a nutshell, because the Allocation Pool model allows the over-provisioning of resources, not enough vCPU was purchased. The vDC resource pool had 1000 MHz of vCPU with a 0% reservation, but the tenant had created 4 dual vCPU VMs. When the network copy job started it consumed CPU, which in turn exhausted the vCD CPU allocation.

What happened next can be seen in the video below…

With the resource pool constrained, ready time is introduced to throttle the CPU, which in turn impacts the network throughput. As shown in the video, when the resource pool has the Unlimited checkbox ticked the ready time goes away and the network throughput returns to normal.
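As a rough back-of-the-envelope check on why a 1000 MHz limit hurts so much, the sketch below divides the pool limit across the tenant's 4 dual vCPU VMs. This is a simplification of my own and not how the VMkernel scheduler actually works: it assumes every vCPU demands CPU at the same time and that the limit is split evenly, and the 2500 MHz physical core speed is an assumed figure, not one from the environment described. The function names are hypothetical.

```python
def mhz_per_vcpu(pool_limit_mhz: float, vm_count: int, vcpus_per_vm: int) -> float:
    """Even split of the resource pool limit across all vCPUs (simplified)."""
    total_vcpus = vm_count * vcpus_per_vm
    if total_vcpus <= 0:
        raise ValueError("there must be at least one vCPU")
    return pool_limit_mhz / total_vcpus


def max_run_fraction(per_vcpu_mhz: float, core_mhz: float) -> float:
    """Largest fraction of time a vCPU could be scheduled on a core of core_mhz.

    The remainder is time the vCPU wants to run but is held back by the limit.
    """
    if core_mhz <= 0:
        raise ValueError("core_mhz must be positive")
    return min(1.0, per_vcpu_mhz / core_mhz)


# Figures from the post: 1000 MHz limit, 4 VMs with 2 vCPUs each.
share = mhz_per_vcpu(1000, vm_count=4, vcpus_per_vm=2)

# Assumed physical core speed of 2500 MHz, for illustration only.
fraction = max_run_fraction(share, core_mhz=2500)

print(f"{share:.0f} MHz per vCPU, runnable at most {fraction:.0%} of the time")
```

Even in this idealised model each vCPU gets only a sliver of a physical core when everything is busy, which is consistent with the ready time and the drop in copy throughput seen in the video.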

Conclusion:

Again, it’s worth checking out the impact on the network throughput in the video, as it clearly shows what happens when tenants underprovision or overallocate their Virtual Datacenters in vCloud Director. Outside of vCloud Director it’s also handy to understand the impact of applying limits and reservations on Resource Pools in terms of VM compute and networking performance.

It’s not always the network!

References:

http://www.vmware.com/resources/techresources/10325

http://frankdenneman.nl/2010/09/24/provider-vdc-cluster-or-resource-pool/

http://www.yellow-bricks.com/2012/02/28/resource-pool-shares-dont-make-sense-with-vcloud-director/

https://kb.vmware.com/kb/2006684

Allocation Pool Organization vDC Changes in vCloud Director 5.1
