Tag Archives: Nested ESXi

NestedESXi – Network Performance Improvements with Learnswitch

I’ve been running my NestedESXi homelab for about eight months now, but in all that time I had not installed or enabled the ESXi MAC Learning dvFilter. As a quick refresher, the VMware Fling addresses the impact that promiscuous mode has on nested ESXi hosts when it is enabled on virtual switches. In a nutshell, network traffic hits every network interface attached to the portgroup, which reduces throughput, increases latency and drives up CPU usage.
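For anyone building a similar nested lab, promiscuous mode (together with forged transmits) is typically switched on at the portgroup that backs the nested hosts. Below is a minimal PowerCLI sketch of that prerequisite; the vCenter, host and portgroup names are placeholders from my own lab rather than anything you need to match.

```powershell
# Hypothetical example - enable promiscuous mode (and forged transmits) on the
# standard vSwitch portgroup that backs the nested ESXi hosts.
# vCenter, host and portgroup names are placeholders.
Connect-VIServer -Server "vcenter.lab.local"

Get-VMHost -Name "esxi-physical.lab.local" |
    Get-VirtualPortGroup -Standard -Name "NestedESXi" |
    Get-SecurityPolicy |
    Set-SecurityPolicy -AllowPromiscuous $true -ForgedTransmits $true
```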

The ESXi MAC Learning dvFilter Fling was released about two years ago and it’s a must-have for those running nested ESXi in home or work labs. However, earlier this year a new Fling was released that improves on the dvFilter and addresses some of its limitations. The new native MAC learning VMkernel module is called Learnswitch.

ESXi Learnswitch is a complete implementation of MAC learning and filtering and is designed as a wrapper around the host virtual switch. It supports learning multiple source MAC addresses on virtual network interface cards (vNICs) and filters packets from egressing the wrong port based on destination MAC lookup. This substantially improves overall network throughput and system performance for nested ESXi and container use cases.

For a more in-depth look at its functionality, head over to William Lam’s blog post here.

dvFilter vs Learnswitch:

I was interested to see whether the new Learnswitch offered any significant performance improvement over the dvFilter in addition to its main benefits. I installed and enabled the dvFilter in my lab and ran some basic performance tests using CrystalDiskMark. Before that, I ran the same test with neither module installed to establish a baseline.

First, to see what the network traffic hitting the nested hosts looks like, the ESXTOP output below shows each host dealing with roughly the same number of received packets. Overall throughput is reduced when this happens.

In terms of performance, the CrystalDiskMark test run on a nested VM (right) showed reduced results across all tests when compared to the same test run directly on the parent host (left).

There was also elevated datastore latency and significant CPU usage due to the overhead of the increased traffic hitting all interfaces.

The CPU usage alone shows the value in having the dvFilter or Learnswitch installed when running nested ESXi hosts.

With the baseline testing done, I installed and enabled the dvFilter and then ran the same tests. For a detailed look at how to install the dvFilter (in case you don’t meet the requirements for using the Learnswitch module), check out my initial post on the dvFilter here. Having gone through that, I uninstalled the dvFilter and installed and configured the Learnswitch.

Like the dvFilter, you need to download and install an ESXi software bundle, but unlike the dvFilter, you need to reboot the host to enable the Learnswitch module.
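If you prefer to push the bundle out with PowerCLI rather than over SSH, a rough sketch of what I mean is below. The bundle file name, datastore path and host name are assumptions for illustration only; check the Fling’s own instructions for the exact bundle name.

```powershell
# Rough sketch (assumed names and paths) - install the Learnswitch offline
# bundle via esxcli from PowerCLI, then reboot the host to load the module.
$vmhost = Get-VMHost -Name "esxi-physical.lab.local"
$esxcli = Get-EsxCli -VMHost $vmhost -V2

# The offline bundle has already been copied to a datastore (placeholder path).
$esxcli.software.vib.install.Invoke(@{
    depot = @("/vmfs/volumes/datastore1/esxi-learnswitch-offline-bundle.zip")
})

# The module is only loaded after a reboot, so drop into maintenance mode first.
$vmhost | Set-VMHost -State Maintenance -Confirm:$false
$vmhost | Restart-VMHost -Confirm:$false
```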

As per the instructions in William Lam’s post or on the Fling page, you then need to configure and run a Python script to enable the Learnswitch against the NestedESXi portgroups that have promiscuous mode enabled.
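The script itself comes with the Fling download, so I won’t reproduce it here, but if you want to confirm which portgroups actually have promiscuous mode enabled before you run it, a quick check along these lines (host name assumed) does the job:

```powershell
# Illustrative only - list standard vSwitch portgroups on the physical host
# that currently allow promiscuous mode. These are the candidates to target
# with the Fling's Learnswitch configuration script.
$vmhost = Get-VMHost -Name "esxi-physical.lab.local"

Get-VirtualPortGroup -VMHost $vmhost -Standard | ForEach-Object {
    $policy = $_ | Get-SecurityPolicy
    if ($policy.AllowPromiscuous) {
        [pscustomobject]@{
            PortGroup        = $_.Name
            AllowPromiscuous = $policy.AllowPromiscuous
            ForgedTransmits  = $policy.ForgedTransmits
        }
    }
}
```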

From there the impact of the module is immediate and you can see the network traffic hitting the interfaces of each NestedESXi host normalise. When running the performance test, the ESXTOP output is significantly different from what you see when the module is not loaded, as shown below.

You also get access to a new command that lists out stats for the Learnswitch, showing packet and port statistics as well as the current MAC address table.

In terms of what it looks like from a performance point of view, below are the results of all CrystalDiskMark tests. The bottom two represent the dvFilter (left) and the Learnswitch (right).

Finally, to look at the improvement in CPU usage with the modules installed, below is a timeline showing the performance tests run at different times over the last 24 hours. Again there is a significant improvement: the graphs on the left-hand side were taken during testing without any module, moving across to the dvFilter test, with the Learnswitch test on the right-hand side. The Learnswitch does seem a little lighter on CPU, but I can’t be 100% sure with my limited testing.

Conclusion:

As expected there isn’t a huge difference in performance between the two modules, but the features of the Learnswitch certainly make it the preferred choice of the two if you meet its requirements. Again, the main advantages of the Learnswitch over the dvFilter make it a must-have addition to any NestedESXi environment. If you haven’t installed either yet…get onto it!

Homelab – Lab Access Made Easy with Free Veeam Powered Network

A couple of weeks ago at VeeamON we announced the RC of Veeam PN, a lightweight SDN appliance that has been released for free. While the main messaging focuses on extending network availability for Microsoft Azure, Veeam PN can be deployed as a standalone solution via a downloadable OVA from the veeam.com site. While testing the product through its early dev cycles, I immediately put into action a use case that allowed me to access my homelab and other home devices while on the road…all without having to set up and configure relatively complex VPN or remote access solutions.

There are a lot of existing solutions that do what Veeam PN does and plenty of them are decent at what they do; however, the biggest difference for me, comparing say the VPN functionality of a pfSense appliance, is that Veeam PN is purpose-built and can be set up within a couple of clicks. The underlying technology is built on OpenVPN, so there is a level of familiarity and trust in what lies under the hood. The other great thing about leveraging OpenVPN is that any Windows, macOS or Linux client will work with the configuration files generated for point-to-site connectivity.

Homelab Remote Connectivity Overview:

While on the road I wanted to access my homelab/office machines with minimal effort and without relying on services published externally through my entry-level Belkin router. I also didn’t have a static IP, which always proved problematic for remote services. At home I run a desktop that acts as my primary Windows workstation and also has VMware Workstation installed. I then have my SuperMicro 5028D-TNT4 server with ESXi installed, which runs my NestedESXi lab. I needed to be able to at least RDP into that Windows workstation, but also get access to the management vCenter, the SuperMicro IPMI and other systems running on the 192.168.1.0/24 subnet.

As seen above, I also wanted to directly access workloads in the NestedESXi environment, specifically on the 172.17.0.0/24 and 172.17.1.0/24 networks. There will be a little more detail on my use case in a follow-up post, but as you can see from the diagram above, with the Tunnelblick OpenVPN client on my MBP I am able to create a point-to-site connection to the Veeam PN Hub, which is in turn connected via site-to-site to each of the subnets I want to connect into.

Deploying and Configuring Veeam Powered Network:

As mentioned above, you will need to download the Veeam PN OVA from the veeam.com website. This VeeamKB describes where to get the OVA and how to deploy and configure the appliance for first use. If you don’t have a DHCP-enabled subnet to deploy the appliance into, you can configure a static address by accessing the VM console, logging in with the default credentials and modifying the /etc/network/interfaces file as described here.

Components

  • Veeam PN Hub Appliance x 1
  • Veeam PN Site Gateway x number of sites/subnets required
  • OpenVPN Client

The OVA is 1.5GB and, when deployed, the virtual machine has base specifications of 1 vCPU, 1GB of vRAM and 16GB of storage, which, if thin provisioned, consumes a tick over 5GB initially.
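If you would rather script the deployment than step through the vSphere Client wizard, a rough PowerCLI sketch is below; the file path, target host, datastore and VM name are all assumptions for illustration.

```powershell
# Sketch only - deploy a Veeam PN appliance from the downloaded OVA using
# PowerCLI. File path, target host, datastore and VM name are placeholders.
$vmhost    = Get-VMHost -Name "esxi-physical.lab.local"
$datastore = Get-Datastore -Name "datastore1"
$ovaPath   = "C:\Temp\VeeamPN.ova"

$ovfConfig = Get-OvfConfiguration -Ovf $ovaPath

Import-VApp -Source $ovaPath `
    -OvfConfiguration $ovfConfig `
    -Name "VeeamPN-SiteGateway" `
    -VMHost $vmhost `
    -Datastore $datastore `
    -DiskStorageFormat Thin
```

Deploying with the thin disk format is what keeps the initial footprint around the 5GB mark mentioned above.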

Networking Requirements

  • Veeam PN Hub Appliance – Incoming Ports TCP/UDP 1194, 6179 and TCP 443
  • Veeam PN Site Gateway – Outgoing access to at least TCP/UDP 1194
  • OpenVPN Client – Outgoing access to at least TCP/UDP 6179

Note that as part of the initial configuration you can choose the site-to-site and point-to-site protocols and ports, which is handy if you are deploying into a locked-down environment and want Veeam PN to listen on different port numbers.

In my setup the Veeam PN Hub appliance has been deployed into Azure, mainly because that’s where I was able to test the product initially, but also because in theory it provides a centralised, highly available location for all the site-to-site connections to terminate. This central Hub can be deployed anywhere, and as long as HTTPS connectivity is configured correctly you can access the web interface and start to configure your sites and standalone clients.

Configuring Site Clients (site-to-site):

To complete the configuration of the Veeam PN Site Gateways you need to register the sites from the Veeam PN Hub appliance. When you register a client, Veeam PN generates a configuration file that contains the VPN connection settings for that client. You must use the configuration file (downloadable as an XML) to set up the Site Gateways. Referencing the diagram at the beginning of the post, I needed to register three separate client configurations as shown below.

Once this was completed I deployed three Veeam PN Site Gateways on my home office infrastructure as shown in the diagram…one for each site or subnet I wanted to extend through the central Hub. I deployed one to my Windows VMware Workstation instance on the 192.168.1.0/24 subnet and, as shown below, I deployed two Site Gateways into my NestedESXi lab on the 172.17.0.0/24 and 172.17.1.0/24 subnets respectively.

From there I imported the site configuration file generated by the central Hub appliance into each corresponding Site Gateway and, in as little as three clicks on each one, all three networks were joined to the central Hub using site-to-site connectivity.

Configuring Remote Clients (point-to-site):

To be able to connect into my home office and homelab while on the road, the final step is to register a standalone client from the central Hub appliance. Again, because Veeam PN leverages OpenVPN, what we are producing here is an OVPN configuration file that has all the details required to create the point-to-site connection…noting that there isn’t any requirement to enter a username and password, as Veeam PN uses SSL-based authentication.

On my MBP I’m using the Tunnelblick OpenVPN client. I’ve found it to be an excellent client, but obviously, this being OpenVPN, there are plenty of other clients for pretty much any platform you might be running. Once I’ve imported the OVPN configuration file into the client, I am able to authenticate against the Hub appliance endpoint and the site-to-site routing is injected into the network settings.

You can see above that static routes for 192.168.1.0, 172.17.0.0 and 172.17.1.0 have been added and set to use the tunnel interface’s default gateway, which is on the central Hub appliance. This means that from my MBP I can now get to any device on any of those three subnets no matter where I am in the world…in this case I can RDP to my Windows workstation, connect to vCenter or SSH into my ESXi hosts.

Conclusion:

To summarise, these are the steps taken to set up and configure the extension of my home office network using Veeam PN’s site-to-site connectivity, allowing me to access systems and services via a point-to-site VPN:

  • Deploy and configure Veeam PN Hub Appliance
  • Register Sites
  • Register Endpoints
  • Deploy and configure Veeam PN Site Gateway
  • Setup Endpoint and connect to Hub Appliance

Those five steps took me less than 15 minutes, and that includes the OVA deployments…that to me is an extremely streamlined, efficient process to achieve something that in the past could have taken hours and would certainly have involved a more complex set of commands and configuration steps. The simplicity of the solution is what makes it so useful for home labbers wanting a quick and easy way to access their systems…it just works!

Again, Veeam PN is free and is deployable from the Azure Marketplace to help extend availability for Microsoft Azure…or downloadable in OVA format directly from the veeam.com site. The use case I’ve described, which I have been using without issue for a number of months, adds to the flexibility of the Veeam Powered Network solution.

References:

https://helpcenter.veeam.com/docs/veeampn/userguide/overview.html?ver=10

https://www.veeam.com/kb2271

 

Free Guide: Building NetApp ONTAP 9 Lab

In computing, there is one thing you shouldn’t compromise on…and that thing is storage. This carries over to lab or NestedESXi environments, as poor lab performance can be just as frustrating as a production performance issue. I’ve used a number of nested storage platforms for my lab environments and I’m always on the lookout for alternative solutions.

When Neil Anderson asked me to write a short introductory post on his new how-to guide, Build Your Own NetApp ONTAP 9 Lab, I decided to flick through the guide to check it out and see if it could add any value to my future homelab plans. The e-book is professionally laid out and has excellent diagrams, notes and step-by-step instructions…it’s extremely comprehensive.

http://www.flackbox.com/netapp-simulator/

While I’ve never been a NetApp guy, there does seem to be a level of complexity in the NetApp VSA setup, but with the step-by-step instructions in the e-book any ambiguity is removed. If you are looking for lab storage, this is a great end-to-end example of how to install and configure it based on the ONTAP 9 NetApp Simulator.

Give it a look over here.

VMware Fling: ESXi MAC Learning dvFilter for Nested ESXi

Late last year I was load testing against a new storage platform using both physical and nested ESXi hosts…at the time I noticed decreased network throughput while using load test VMs hosted on the nested hosts. I wrote this post and reached out to William Lam, who responded with an explanation of what was happening and why promiscuous mode is required for nested ESXi installs.

http://www.virtuallyghetto.com/2013/11/why-is-promiscuous-mode-forged.html

Fast forward to VMworld 2014: in a discussion I had with William at The W Bar (where lots of great discussions are had) after the official party, he mentioned that a new Fling was about to be released that addresses the issues with nested ESXi hosts and promiscuous mode enabled on virtual switches. As William explains in his new blog post, he took the problem to VMware Engineering, who were having similar issues in their R&D labs and came up with a workaround…and that workaround is now an official Fling! Apart from feeling a little bit chuffed that I sparked interest in a problem which has resulted in a fix, I decided to put it to the test in my lab.

I ran the same tests that I ran last year. Running one load test on a 5.5 ESXi host nested on a physical 5.5 host, I saw equal network utilisation across all six nested hosts.

The load VM was only able to push 15-17MBps on a random read test. As William saw in his post, ESXTOP shows more about what’s happening.

There is roughly even network throughput across all NICs on all hosts that are set for promiscuous mode…and overall throughput is reduced.

After installing the VIB on the physical host, you have to add the advanced virtual machine settings to each nested host to enable MAC learning. Unless you do this via an API call, you will need to shut down the VM to edit the VMX/config. I worked through a set of PowerCLI commands, along the lines of the sketch below, to bulk-add the advanced settings to running nested hosts. The approach works for any VM whose name matches ESX in a resource pool and that has two NICs.
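The original screenshots of those commands haven’t survived here, so treat the following as a reconstruction of the idea rather than the exact commands I ran. The resource pool name and VM name pattern are placeholders, and the ethernetX.filter4.* setting names should be verified against the Fling documentation.

```powershell
# Reconstruction/sketch - bulk-add the dvfilter-maclearn advanced settings to
# every nested ESXi VM in a resource pool. The pool name, VM name pattern and
# two-NIC assumption come from my lab; verify the setting names against the
# Fling documentation.
$nestedHosts = Get-ResourcePool -Name "NestedESXi" | Get-VM -Name "ESX*"

foreach ($vm in $nestedHosts) {
    # Two NICs per nested host, so apply the filter to ethernet0 and ethernet1.
    foreach ($nic in 0..1) {
        New-AdvancedSetting -Entity $vm -Name "ethernet$($nic).filter4.name" -Value "dvfilter-maclearn" -Confirm:$false -Force
        New-AdvancedSetting -Entity $vm -Name "ethernet$($nic).filter4.onFailure" -Value "failOpen" -Confirm:$false -Force
    }
}
```

Because the settings go in through the vSphere API, they can be applied while the nested hosts are running, which is the whole point of going the PowerCLI route rather than editing each VMX by hand.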

Checking back in on ESXTOP, it looks to have an instant effect: only the nested host generating the traffic shows significant network throughput…the other hosts are doing nothing, and I am now seeing about 85-90MBps against the load test.

Looking at the network throughput graphs (below), you can see an example of two nested hosts in the group with the same throughput until the dvFilter was installed, at which point traffic dropped on the host not running the load test. Throughput increased almost five-fold on the host running the test.

The effect on Nested Host CPU utilization is also dramatic. Only the host generating the load has significant CPU usage while the other hosts return to normal operations…meaning overall the physical host CPUs are not working as hard.

As William mentions in his post, this is a no-brainer install for anyone using nested ESXi hosts for lab work. Thinking about the further implications of this fix, I can see the possibility of supporting full nested environments within virtual data centres without the fear of increased host CPU and decreased network throughput. For that to happen, though, VMware would need to change their stance on the supportability of nested ESXi environments…but this Fling, together with the VMTools Fling, certainly makes nested hosts all the more viable.