Author Archives: Anthony Spiteri

NSX Bytes: NSX-v 6.3.3 Released – Upgrade Notes and Enhancements

Last week VMware released NSX-v 6.3.3 (Build 6276725) and with it comes a new operating system for the NSX Controllers. Once upgraded, the new controllers are powered by Photon OS, which is making its way into more and more of VMware’s appliances. There are a few other new bits in this release but, more importantly, a number of Resolved Issues. For those running homelabs with a single NSX Controller there are some important upgrade notes to be aware of before kicking off…I’ll go into those below.

Compatibility:

Before moving to the upgrade there are some important notes around interoperability and supported ESXi versions, as explained in this VMwareKB. The minimum supported version of ESXi running with NSX-v 6.3.3 is as shown below:

  • NSX-v 6.3.3 installed in a vSphere 5.5 environment requires a minimum version of ESXi 5.5 GA
  • NSX-v 6.3.3 installed in a vSphere 6.0 environment requires a minimum version of ESXi 6.0 Update 2
  • NSX-v 6.3.3 installed in a vSphere 6.5 environment requires a minimum version of ESXi 6.5a
If NSX 6.3.3 is installed on an earlier version of ESXi 5.5/6.0, the netcpa service will fail to start, preventing communication between the ESXi hosts and the NSX Controllers.

In terms of upgrading from previous versions of NSX-v, the upgrade path does have some stoppers. Below is the interoperability matrix, which includes vCloud Director 8.20…at the moment 8.20 is not supported with NSX-v 6.3.3, though I expect that to change over the next couple of weeks.

Upgrading to NSX-v 6.3.3:

 

As mentioned, there are things to look out for during and after the upgrade from previous builds of NSX-v. There are detailed upgrade notes in the release notes so, as always, make sure to read those as well, but below is a brief walkthrough of the upgrade process I conducted in one of my NestedESXi labs.

Once the NSX Manager has been upgraded you should see the new build in its Summary tab:

With the NSX Manager upgraded you should also restart the vSphere Web Client to ensure any lingering parts of the previous version are removed. Log in to the Web Client and click through to Networking & Security -> Installation and then the Management tab, where you will see Upgrade Available.

IMPORTANT NOTE: The upgrade notes state that you need a minimum of three NSX Controllers, which I’d say is linked to the fact that the underlying OS of the Controllers has been shifted to Photon OS. This is likely to impact anyone running NSX in a NestedESXi or homelab environment where, generally, only one controller was deployed to preserve resources. Once you click on upgrade you will get a special upgrade warning before committing to the upgrade, as shown below:
  • The NSX Controller cluster must contain three controller nodes to upgrade to NSX 6.3.3. If it has fewer than three controllers, you must add controllers before starting the upgrade
  • When you upgrade to NSX-v 6.3.3, instead of an in-place software upgrade, the existing controllers are deleted one at a time, and new Photon OS based controllers are deployed using the same IP addresses

There is also a slight increase in the size of the storage for the controllers, from 20GB to 28GB. Once upgraded, the NSX Controllers will be at version 6.3.6235594.
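
If you want to confirm the controller redeployment from the command line rather than the Web Client, the sketch below is one option…assuming you use the community PowerNSX module (the vCenter name is a placeholder and the exact output fields vary between PowerNSX versions):

    # Hedged sketch using the community PowerNSX module; vCenter address is a placeholder.
    Import-Module PowerNSX
    Connect-NsxServer -vCenterServer vcenter.lab.local
    # Each controller object should report the new controller version and a running status.
    Get-NsxController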

The last major step is to upgrade the host components from the Host Preparation tab. On vSphere 6.0 and above, once you have upgraded to NSX 6.3.x, future NSX VIB changes do not trigger a reboot…only maintenance mode is required to complete the VIB change. In NSX 6.3.3 there is also a change to the NSX VIB names on ESXi 6.0 and later: the esx-vxlan and esx-vsip VIBs have been merged and replaced with esx-nsxv, as shown below.

VIB names on ESXi 5.5 remain the same.
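
If you want to double check which NSX VIBs are present after host preparation, a quick hedged PowerCLI sketch along the lines of the below will list them per host (the vCenter name is a placeholder)…on ESXi 6.0 and later with NSX 6.3.3 you should see esx-nsxv, while 5.5 hosts keep esx-vxlan and esx-vsip:

    # Hedged PowerCLI sketch; vCenter name is a placeholder.
    Connect-VIServer -Server vcenter.lab.local
    foreach ($vmhost in Get-VMHost) {
        $esxcli = Get-EsxCli -VMHost $vmhost -V2
        # List only the NSX related VIBs on each host.
        $esxcli.software.vib.list.Invoke() |
            Where-Object { $_.Name -match 'esx-nsxv|esx-vxlan|esx-vsip' } |
            Select-Object @{N='Host';E={$vmhost.Name}}, Name, Version
    }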

References:

https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/rn/releasenotes_nsx_vsphere_633.html

https://kb.vmware.com/kb/2151267

 

Top vBlog 2017: Notable Representation and Thanks

It feels like this year is moving along at ludicrous speed, so it’s no surprise that the Top vBlog for 2017 has been run and won. This year Eric Siebert changed things up by introducing new voting mechanisms to try and deliver a more palatable outcome for all who were involved…I think it worked well and delivered interesting results for all those active bloggers listed on the vLaunchpad.

Eric introduced a point system based on Google Page Speed and the number of posts in 2016 to help level the playing field and make it less of a perceived popularity contest. Introducing tangible metrics to make up a portion of the total ranking points was an interesting move and seemed to work well. If nothing else it made people (myself included) more aware of the dark art of web page speed optimization…and this has meant a better browsing experience for those visiting Top vBlog sites.

The Results:

As expected, with Duncan Epping bowing out of the race, William Lam deservedly took out the #1 spot, with Vladan Seget, Cormac Hogan, Chris Wahl and Scott Lowe rounding out the top 5. There was lots of movement in the top 25 and I managed to sneak into the top 20 at #19, which is extremely humbling.

Creating content for this community is a pleasure and has become somewhat of a personal obsession, so it’s nice to get some recognition, and I’m happy that what I produce is (for the most part) found useful by people in the community. I’m a passionate guy in most things I’m involved in, so it’s no surprise that I feel so strongly about being able to contribute to this great vCommunity…especially when it comes to my strong passion for Hosting, Cloud, Backup and DR.

Aussie Representation:

As with previous years I like to highlight the Aussie and Kiwi (ANZ) representation in the Top vBlog, and this year is no different. We have a great blogging scene here in the VMware community and that is reflected in the quality of the bloggers listed below. Special mention to Matt Allford, who debuted at #190…watch out for him to climb up the list over the next few years!

| Blog | Rank | Prev | +/- | Total Points | Total Votes | Voting Points | #1 Votes | # 2016 Posts | Post Pts | PS % | PS Pts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Virtualization is Life! (Anthony Spiteri) | 19 | 44 | 25 | 1420 | 165 | 1042 | 15 | 123 | 246 | 66% | 132 |
| Long White Virtual Clouds (Michael Webster) | 20 | 13 | -7 | 1374 | 201 | 1228 | 7 | 17 | 34 | 56% | 112 |
| CloudXC (Josh Odgers) | 29 | 17 | -12 | 1166 | 144 | 930 | 7 | 53 | 106 | 65% | 130 |
| Penguinpunk.net (Dan Frith) | 106 | 78 | -28 | 545 | 48 | 317 | 4 | 46 | 92 | 68% | 136 |
| Virtual 10 (Manny Sidhu) | 115 | 82 | -33 | 520 | 44 | 268 | 0 | 32 | 64 | 94% | 188 |
| Proudest Monkey (Grant Orchard) | 127 | 93 | -34 | 501 | 53 | 341 | 0 | 11 | 22 | 69% | 138 |
| Demitasse (Alastair Cooke) | 140 | 168 | 28 | 467 | 42 | 261 | 0 | 33 | 66 | 70% | 140 |
| Pragmatic IO (Brett Sinclair) | 174 | 153 | -21 | 418 | 29 | 222 | 6 | 12 | 24 | 86% | 172 |
| Virtual Tassie (Matt Allford) | 190 | NEW | NEW | 392 | 29 | 170 | 1 | 23 | 46 | 88% | 176 |
| ukotic.net (Mark Ukotic) | 201 | 179 | -22 | 369 | 24 | 159 | 6 | 14 | 28 | 91% | 182 |
| Virtual Notions (Derek Hennessy) | 266 | 298 | 32 | 253 | 21 | 121 | 0 | 25 | 50 | 41% | 82 |

Veeam Representation:

My fellow Veeam colleagues also made it into the list, and all of those below made the top 50!

| Blog | Rank | Prev | +/- | Total Points | Total Votes | Voting Points | #1 Votes | # 2016 Posts | Post Pts | PS % | PS Pts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Virtualization is Life! (Anthony Spiteri) | 19 | 44 | 25 | 1420 | 165 | 1042 | 15 | 123 | 246 | 66% | 132 |
| Notes from MWhite (Michael White) | 31 | 38 | 7 | 1080 | 108 | 642 | 2 | 132 | 264 | 87% | 174 |
| Virtual To The Core (Luca Dell’Oca) | 38 | 41 | 3 | 927 | 111 | 695 | 2 | 36 | 72 | 80% | 160 |
| vZilla (Michael Cade) | 44 | 120 | 76 | 871 | 109 | 645 | 9 | 27 | 54 | 86% | 172 |
| Tim’s Tech Thoughts (Tim Smith) | 48 | 100 | 52 | 837 | 92 | 583 | 5 | 32 | 64 | 95% | 190 |

The Results Show:

Again, a massive thank you to Eric for putting together the voting and organising the whole thing. It’s a huge undertaking and we should all be grateful to Eric for making it happen.

The whole list and category winners can be viewed here.

 

Updated: VMworld 2017 – #vGolf Las Vegas

To say that #vGolf is back bigger and better than the inaugural #vGolf held last year at VMworld 2016 is an understatement! This year’s event has been very popular and has gotten a great response.

I’d like to give a special mention and thank you to the sponsors of this year’s event:

Special mention goes to Expedient, who has organized special branded #vGolf balls for the day. We are still looking for another couple of sponsors, which will mean we can accommodate more players…we reached the initial maximum capacity four weeks ago and have had a number of enquiries about getting onto a waiting list. Can I ask that those who wish to be put on the wait list please fill out the form below, and from there we will work to try and extend the numbers that can play on the day.

vSphere 6.5 Update 1 – What’s in it for Service Providers

Late last week VMware released vSphere 6.5 Update 1, which includes updated builds of both vCenter and ESXi, and as per usual I will go through some of the key features and fixes in the latest versions of vCenter and ESXi. When looking through the release notes I generally keep an eye out for improvements that relate back to Service Providers who use vSphere as the foundation of their Managed or Infrastructure as a Service offerings. This update also contains an update to vSAN, which is now at 6.6.1, so I’ll spend some time looking at what’s been added there.

 

New Features and Enhancements:

Without question this is a significant patch release for vCenter and ESXi, and the length of the release notes is testament to that point. In terms of new features there isn’t anything groundbreaking, but there are a few nice additions, such as being able to run the VCSA GUI and CLI installers on Windows 2012, 2012 R2 and 2016 as well as macOS Sierra, and Ubuntu 17.04 is now supported for Guest OS Customization. vCenter now supports Microsoft SQL Server 2014 SP2, 2016 and 2016 SP1, as well as increased configuration maximums supporting Linked Mode with 15 vCenter instances, 5,000 ESXi hosts and 50,000 powered-on virtual machines.

Ability to Upgrade or Migrate from vCenter 6.0 Update 3:

This release addresses the previous limitation in the upgrade and migration path for those running vSphere 6.0 Update 3 and wanting to go to vSphere 6.5. I know this will make a lot of providers happy, as I know many that had to go to 6.0 Update 3 to address existing bugs in the platform but were not yet ready or able to move to 6.5 at the time.

HTML5 Client Update:

The HTML5 Web Client has received its own update that brings it up to speed with the 3.15 Fling version, however it’s still only partially functional, which remains somewhat frustrating…The online documentation for supported functionality has been updated to vSphere 6.5 U1 and is available here.

The list below covers the main updates in this release.

  • DRS/HA VM overrides
  • SDRS rules
  • Content Library – further actions
  • Roles and Global Permissions
  • Download multiple files as zip
  • Distributed Switch – further actions
  • Fault Tolerance
  • SPBM
  • VM Hardware – further items
  • Apply Customize Guest OS during Clone
  • VM Migration – further actions (compute+storage, Cross VC, batch)
vSAN Features:

For service providers, vSAN 6.6 was another major release that shored up vSAN’s status as a serious storage platform for service provider offerings.

vSAN 6.6.1 introduces three key new features:

  • VMware vSphere Update Manager (VUM) integration
  • Performance Diagnostics in vSAN Cloud Analytics
  • Storage Device Serviceability enhancement

The ability to upgrade with VUM is a nice touch and continues to improve on the usability and manageability of vSAN. For a full look at what’s new in this release for vSAN 6.6.1 head to this blog post.

Resolved Issues:

There are a bunch of resolved issues in this release and I’ve gone through the rather extensive list to pull out the biggest fixes that relate to my experience in service provider operations, extending that to include fixes that relate to backup operations. The majority of what I’ve picked out relates to storage, networking, host and VM operations…the core of any platform, but even more important in the service provider world. The ones in red are specific fixes for issues that I’ve come across…good to see them addressed!

vCenter:
  • First-boot failure occurs when upgrading from vSphere 5.5 or 6.0 to vSphere 6.5 on Windows. If an older version of the OpenSSL DLLs is installed, the upgrade to vSphere 6.5 fails because the older DLL versions are loaded.
  • Affinity rules configured on vCenter Server 5.5 can cause crashes after upgrading to vCenter Server 6.5. Migrating a VM with affinity rules configured on vCenter Server 5.5 to a cluster that has affinity rules configured on vCenter Server 6.0 or 6.5 can cause vCenter Server to crash.
  • VM Snapshot Size (GB) alarm is not triggered after the VM is powered on. The VM Snapshot Size (GB) alarm is reset if the virtual machine is shut down and fails to trigger after the VM is powered on. This issue occurs in alarms based on VM Snapshot Size (GB) and VM Total Size on Disk because their status is altered when the power state of the VM is changed, even though the disk usage of a VM is the same regardless of the VM power state.
  • When you add ports to a vSphere Distributed Switch you get an error. Because of a race condition, when you add ports to a vSphere Distributed Switch you get the error message: Cannot create a new port because number of ports exceeds 2147483647, maximum number of ports allowed on vDS.
  • A runtime exception “Unable to retrieve data about the distributed switch” might occur while upgrading a vSphere Distributed Switch (vDS) from 5.0 to 6.5. When you try to upgrade an existing distributed switch after the vCenter upgrade is completed, the runtime exception Unable to retrieve data about the distributed switch might occur in the wizard and the distributed switch cannot be upgraded. The exception is the result of an unexpected NULL value for a LACP property of the distributed switch, instead of TRUE or FALSE, as LACP is not supported for the current version of vSphere Distributed Switch.
  • Host configuration might not be available after vCenter Server restarts. After a vCenter Server restart, the host configuration might not be available if vCenter Server cannot communicate with the host. After connectivity is restored, the configuration becomes available.
  • OVF tool fails to upload OVF or OVA files larger than 10 GB. If you use the OVF tool to upload OVF or OVA files larger than 10 GB, the upload might fail.

ESXi:

  • Virtual machine crashes on ESXi 6.5 when multiple users log on to a Windows Terminal Server VM. A Windows 2012 terminal server running VMware Tools 10.1.0 on ESXi 6.5 stops responding when many users are logged in. vmware.log will show messages similar to:
    2017-03-02T02:03:24.921Z| vmx| I125: GuestRpc: Too many RPCI vsocket channels opened.
    2017-03-02T02:03:24.921Z| vmx| E105: PANIC: ASSERT bora/lib/asyncsocket/asyncsocket.c:5217
    2017-03-02T02:03:28.920Z| vmx| W115: A core file is available in "/vmfs/volumes/515c94fa-d9ff4c34-ecd3-001b210c52a3/h8-
    ubuntu12.04x64/vmx-debug-zdump.001"
    2017-03-02T02:03:28.921Z| mks| W115: Panic in progress... ungrabbing 
  • An ESXi host might fail with a purple diagnostic screen when collecting performance snapshots
    An ESXi host might fail with a purple diagnostic screen when collecting performance snapshots with vm-support, due to calls for memory access after the data structure has already been freed.
  • Full duplex configured on physical switch may cause duplex mismatch issue with igb native Linux driver supporting only auto-negotiate mode for nic speed/duplex setting
    If you are using the igb native driver on an ESXi host, it always works in auto-negotiate speed and duplex mode. No matter what configuration you set up on this end of the connection, it is not applied on the ESXi side. The auto-negotiate support causes a duplex mismatch issue if a physical switch is set manually to a full-duplex mode.
  • An ESXi host might fail with a purple screen and a Spin count exceeded (refCount) – possible deadlock with PCPU error An ESXi host might fail with a purple screen and a Spin count exceeded (refCount) - possible deadlock with PCPU error, when you reboot the ESXi host under the following conditions:
    • You use the vSphere Network Appliance (DVFilter) in an NSX environment
    • You migrate a virtual machine with vMotion under DVFilter control
  • A Virtual Machine (VM) with an e1000/e1000e vNIC might have network connectivity issues. For a VM with an e1000/e1000e vNIC, when the e1000/e1000e driver tells the e1000/e1000e vmkernel emulation to skip a descriptor (the transmit descriptor address and length are 0), a loss of network connectivity might occur.
  • An ESXi host might stop responding when you migrate a virtual machine with Storage vMotion between ESXi 6.0 and ESXi 6.5 hosts. The vmxnet3 device tries to access the memory of the guest OS while the guest memory preallocation is in progress during the migration of a virtual machine with Storage vMotion. This results in an invalid memory access and the ESXi 6.5 host failure.
  • Modification of the IOPS limit of virtual disks with Changed Block Tracking (CBT) enabled fails with errors in the log files. To define the storage I/O scheduling policy for a virtual machine, you can configure the I/O throughput for each virtual machine disk by modifying the IOPS limit. When you edit the IOPS limit and CBT is enabled for the virtual machine, the operation fails with the error The scheduling parameter change failed. Due to this problem, the scheduling policies of the virtual machine cannot be altered. The error message appears in the vSphere Recent Tasks pane. You can see the following errors in the /var/log/vmkernel.log file:
    2016-11-30T21:01:56.788Z cpu0:136101)VSCSI: 273: handle 8194(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000
    2016-11-30T21:01:56.788Z cpu0:136101)ScsiSched: 2760: Invalid Bandwidth Cap Configuration
    2016-11-30T21:01:56.788Z cpu0:136101)WARNING: VSCSI: 337: handle 8194(vscsi0:0):Failed to invert policy
  • When you hot-add an existing or new virtual disk to a CBT (Changed Block Tracking) enabled virtual machine (VM) residing on a VVOL datastore, the guest operating system might stop responding. The guest operating system might stop responding until the hot-add process completes. The VM unresponsiveness depends on the size of the virtual disk being added, and the VM automatically recovers once the hot-add completes.
  • When you use vSphere Storage vMotion, the UUID of a virtual disk might change. When you use vSphere Storage vMotion on vSphere Virtual Volumes storage, the UUID of a virtual disk might change. The UUID identifies the virtual disk and a changed UUID makes the virtual disk appear as a new and different disk. The UUID is also visible to the guest OS and might cause drives to be misidentified.
  • An ESXi host might become unresponsive if the VMFS-6 volume has no space for the journal. When opening a VMFS-6 volume, it allocates a journal block. Upon successful allocation, a background thread is started. If there is no space on the volume for the journal, it is opened in read-only mode and no background thread is initiated. Any intent to close the volume results in attempts to wake up a nonexistent thread. This results in the ESXi host failure.
  • SSD congestion might cause multiple virtual machines to become unresponsive. Depending on the workload and the number of virtual machines, diskgroups on the host might go into permanent device loss (PDL) state. This causes the diskgroups to not admit further IOs, rendering them unusable until manual intervention is performed.
  • Unable to collect a vm-support bundle from an ESXi 6.5 host. When generating logs in ESXi 6.5 by using the vSphere Web Client, the select specific logs to export text box is blank. The options: network, storage, fault tolerance, hardware etc. are blank as well. This issue occurs because the rhttpproxy port for /cgi-bin has a value different from 8303.
  • vSphere Storage vMotion might fail with an error message if it takes more than 5 minutes. The destination virtual machine of the vSphere Storage vMotion is incorrectly stopped by a periodic configuration validation for the virtual machine. vSphere Storage vMotion that takes more than 5 minutes fails with the The source detected that the destination failed to resume message.
    The VMkernel log from the ESXi host contains the message D: Migration cleanup initiated, the VMX has exited unexpectedly. Check the VMX log for more details.

vSAN:

  • Hosts in a vSAN cluster have high congestion which leads to host disconnects. When vSAN components with invalid metadata are encountered while an ESXi host is booting, a leak of reference counts to SSD blocks can occur. If these components are removed by policy change, disk decommission, or another method, the leaked reference counts cause the next I/O to the SSD block to get stuck. The log files can build up, which causes high congestion and host disconnects.
  • vSAN cluster becomes partitioned after the member hosts and vCenter Server reboot. If the hosts in a unicast vSAN cluster and the vCenter Server are rebooted at the same time, the cluster might become partitioned. The vCenter Server does not properly handle unstable vpxd property updates during a simultaneous reboot of hosts and vCenter Server.
  • Large File System overhead reported by the vSAN capacity monitor. When deduplication and compression are enabled on a vSAN cluster, the Used Capacity Breakdown (Monitor > vSAN > Capacity) incorrectly displays the percentage of storage capacity used for file system overhead. This number does not reflect the actual capacity being used for file system activities. The display now correctly reflects the File System overhead for a vSAN cluster with deduplication and compression enabled.

It’s also worth reading through the Known Issues section, as there is a fair bit to be aware of in Update 1 as well as issues that remain from the GA release.

Happy upgrading!

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-651-release-notes.html

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-vcenter-server-651-release-notes.html

Second vSphere Client (HTML5) update in vSphere 6.5U1

Introducing vSAN 6.6.1 and New Operational Savings

ESXi 6.5 Storage Performance Issues Resolved in Update 1

I originally came across the issue of slow storage performance with the native vmw_ahci driver that comes bundled with ESXi 6.5 just as I started playing with my SuperMicro SYS-5028D-TN4T in my homelab. After I published a couple of posts about the workaround, the issue became quite prevalent in the community and the post continues to get decent traffic, meaning the issue impacted quite a few people out there.

The good news is that the release of vSphere 6.5 Update 1 includes a fix for the problem in the form of updated drivers for the AHCI module. William Lam has been quick to blog about the fix, and if you had previously disabled the driver you will need to re-enable it.
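
Re-enabling the module can be done with esxcli via PowerCLI…a minimal hedged sketch is below (the host name is a placeholder and a reboot is required for the module change to take effect):

    # Hedged sketch; host name is a placeholder.
    $esxcli = Get-EsxCli -VMHost (Get-VMHost 'esxi01.lab.local') -V2
    $moduleArgs = $esxcli.system.module.set.CreateArgs()
    $moduleArgs.module  = 'vmw_ahci'
    $moduleArgs.enabled = $true
    # Re-enable the native AHCI module, then reboot the host.
    $esxcli.system.module.set.Invoke($moduleArgs)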

This VMwareKB covers the specific patch as listed in the release notes:

There is no confirmation as yet that it actually does the trick, but the release notes look promising, and the assumption is that it will resolve the issues so that homelabbers and people using the driver in production systems can rest easy.

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-651-release-notes.html

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2149910

http://www.virtuallyghetto.com/2017/07/ahci-vmw_ahci-performance-issue-resolved-in-esxi-6-5-update-1.html

Cloud to Cloud to Cloud Networking with Veeam Powered Network

I’ve written a couple of posts on how Veeam Powered Network (Veeam PN) can make accessing your homelab easy with its straightforward approach to creating and connecting site-to-site and point-to-site VPN connections. For a refresher on the use cases I’ve gone through: I had a requirement to access my homelab/office machines while on the road, and to achieve this I went through two scenarios on how you can deploy and configure Veeam PN.

In this blog post I’m going to run through a very real world solution with Veeam PN, using it to easily connect geographically disparate cloud hosting zones. One of the most common questions I used to receive from sales and customers in my previous roles with service providers was how to easily connect two sites so that some form of application high availability could be achieved, or even just to allow access to applications or services across sites.

Taking that further…how is this achieved in the most cost effective and operationally efficient way? There are obviously solutions available today that achieve connectivity between multiple sites, whether that be via some sort of MPLS, IPsec, L2VPN or stretched network solution. What Veeam PN achieves is a simple to configure, cost effective (remember it’s free) way to connect one to one or one to many cloud zones with little to no overhead.

Cloud to Cloud to Cloud Veeam PN Appliance Deployment Model

In this scenario I want each vCloud Director zone to have access to the other zones and be always connected. I also want to be able to connect in via the OpenVPN endpoint client and have access to all zones remotely. All zones will be routed through the Veeam PN Hub Server deployed into Azure via the Azure Marketplace. To go over the Veeam PN deployment process read my first post and also visit this VeeamKB that describes where to get the OVA and how to deploy and configure the appliance for first use.

Components

  • Veeam PN Hub Appliance x 1 (Azure)
  • Veeam PN Site Gateway x 3 (One Per Zettagrid vCD Zone)
  • OpenVPN Client (For remote connectivity)

Networking Overview and Requirements

  • Veeam PN Hub Appliance – Incoming Ports TCP/UDP 1194, 6179 and TCP 443
    • Azure VNET 10.0.0.0/16
    • Azure Veeam PN Endpoint IP and DNS Record
  • Veeam PN Site Gateways – Outgoing access to at least TCP/UDP 1194
    • Perth vCD Zone 192.168.60.0/24
    • Sydney vCD Zone 192.168.70.0/24
    • Melbourne vCD Zone 192.168.80.0/24
  • OpenVPN Client – Outgoing access to at least TCP/UDP 6179

In my setup the Veeam PN Hub Appliance has been deployed into Azure, mainly because that’s where I was able to test out the product initially, but also because in theory it provides a centralised, highly available location for all the site-to-site connections to terminate into. This central Hub can be deployed anywhere, as long as it has HTTPS connectivity configured correctly so you can access the web interface and start to configure your site and standalone clients.

Configuring Site Clients for Cloud Zones (site-to-site):

To configure the Veeam PN Site Gateways you need to register the sites from the Veeam PN Hub Appliance. When you register a client, Veeam PN generates a configuration file that contains VPN connection settings for the client. You must use the configuration file (downloadable as an XML) to set up the Site Gateways. Referencing the diagram at the beginning of the post, I needed to register three separate client configurations as shown below.

Once this has been completed you need to deploy a Veeam PN Site Gateway in each vCloud Hosting Zone…because we are dealing with an OVA, OVFTool will need to be used to upload the Veeam PN Site Gateway appliances. I’ve previously created and blogged about an OVFTool upload script using PowerShell, which can be viewed here. Each Site Gateway needs to be deployed and attached to the vCloud vORG Network that you want to extend…in my case the 192.168.60.0, 192.168.70.0 and 192.168.80.0 vORG Networks.
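
For a rough idea of what that upload looks like when driven from PowerShell, here is a hedged sketch only…the file paths, org/vDC/vApp names and network mapping are placeholders, and the exact vcloud:// locator parameters should be checked against the OVFTool user guide for your vCloud Director version:

    # Hedged sketch; paths, names and the vcloud:// locator values are all placeholders.
    $ovftool = 'C:\Program Files\VMware\VMware OVF Tool\ovftool.exe'
    $ova     = 'C:\Temp\VeeamPN-SiteGateway.ova'
    $target  = 'vcloud://admin@vcd.example.com:443?org=MyOrg&vdc=Perth-vDC&vapp=VeeamPN-GW-Perth'
    # Map the OVA's source network to the vORG Network being extended.
    & $ovftool '--acceptAllEulas' '--net:VM Network=vORG-Network-192.168.60.0' $ova $target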

Once each vCloud zone has had its Site Gateway deployed and the corresponding XML configuration file added, you should see all sites connected in the Veeam PN Dashboard.

At this stage we have connected each vCloud Zone to the central Hub Appliance which is configured now to route to each subnet. If I was to connect up an OpenVPN Client to the HUB Appliance I could access all subnets and be able to connect to systems or services in each location. Shown below is the Tunnelblick OpenVPN Client connected to the HUB Appliance showing the injected routes into the network settings.

You can see above that the 192.168.60.0, 192.168.70.0 and 192.168.80.0 static routes have been added and set to use the tunnel interface’s default gateway, which is on the central Hub Appliance.

Adding Static Routes to Cloud Zones (Cloud to Cloud to Cloud):

To complete the setup and have each vCloud zone talking to the others, we need to configure static routes on each zone’s network gateway/router so that traffic destined for the other subnets is routed through the Site Gateway IP, on to the central Hub Appliance, on to the destination and then back. To achieve this you just need to add static routes to the router. In my example I added the static routes to the vCloud Edge Gateway through the vCD portal, as shown below in the Melbourne Zone.
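
As an illustration only (the next hop is a placeholder for the Melbourne Site Gateway’s IP on that vORG Network), the Melbourne Edge Gateway ends up with static routes along these lines:

    Destination Network   Next Hop (Melbourne Site Gateway)   Remote Zone
    192.168.60.0/24       192.168.80.x                        Perth (via Hub)
    192.168.70.0/24       192.168.80.x                        Sydney (via Hub)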

Conclusion:

To summarise, these are the steps that were taken to set up and configure a cloud to cloud to cloud network using Veeam PN, using its site-to-site connectivity feature to allow cross site connectivity while also allowing access to systems and services via the point-to-site VPN:

  • Deploy and configure Veeam PN Hub Appliance
  • Register Cloud Sites
  • Register Endpoints
  • Deploy and configure Veeam PN Site Gateway in each vCloud Zone
  • Configure static routes in each vCloud Zone

Those five steps took me less than 30 minutes, and that included the OVA deployments. At the end of the day I’ve connected three disparate cloud zones at Zettagrid, which all access each other through a Veeam PN Hub Appliance deployed in Azure. From here there is nothing stopping me from adding more cloud zones situated in AWS, IBM, Google or any other public cloud. I could even connect my home office or a remote site to the central Hub to give full coverage.

The key here is that Veeam Powered Network offers a simple solution to what is traditionally a complex and costly one. Again, this will not suit all use cases, but at its most basic functional level it would have been the answer to the cross cloud connectivity questions I mentioned at the start of the article.

Go give it a try!

NestedESXi – Network Performance Improvements with Learnswitch

I’ve been running my NestedESXi homelab for about eight months now, but in all that time I had not installed or enabled the ESXi MAC Learning dvFilter. As a quick refresher, this VMware Fling addresses the issue with nested ESXi hosts and the impact that promiscuous mode has when enabled on virtual switches. In a nutshell, network traffic hits all the network interfaces attached to the portgroup, which reduces network throughput, increases latency and impacts CPU.

The ESXi MAC Learn dvFilter Fling was released about two years ago and it’s a must have for those running nested ESXi in homelabs or work labs. However, earlier this year a new Fling was released that improves on the dvFilter and addresses some of its limitations. The new native MAC Learning VMkernel module is called Learnswitch.

ESXi Learnswitch is a complete implementation of MAC Learning and Filtering and is designed as a wrapper around the host virtual switch. It supports learning multiple source MAC addresses on virtual network interface cards (vNIC) and filters packets from egressing the wrong port based on destination MAC lookup. This substantially improves overall network throughput and system performance for nested ESX and container use cases.

For a more in-depth look at its functionality, head over to William Lam’s blog post here.

dvFilter vs Learnswitch:

I was interested to see if the new Learnswitch offered any significant performance improvements over the dvFilter in addition to its main benefits. I went about installing and enabling the dvFilter in my lab and ran some basic performance tests using Crystal Disk Mark. Before that, I ran the performance test without either module installed as a baseline.

Firstly to see what the network traffic looks like hitting the nested hosts you can see from the ESXTOP output below that each host is dealing with about the same amount of received packets. Overall throughput is reduced when this happens.

In terms of performance the Crystal Disk Mark test run on a nested VM (right) showed reduced performance across all tests when compared to one run on the parent host (left) directly.

There was also elevated datastore latency and significant CPU usage due to the overheads with the increased traffic hitting all interfaces.

The CPU usage alone shows the value in having the dvFilter or Learnswitch installed when running nested ESXi hosts.

With the baseline testing done I installed and enabled the dvFilter and then ran the same tests. For a detailed look at how to install the dvFilter (just in case you don’t meet the requirements for using the Learnswitch module) check out my initial post on the dvFilter here. Having gone through that, I went about uninstalling the dvFilter and installing and configuring the Learnswitch.

Like the dvFilter you need to download and install an ESXi software bundle but, unlike the dvFilter, you need to reboot the host to enable the Learnswitch module.
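
The install itself is a standard offline bundle VIB install…a hedged sketch is below, with the host name and bundle path/file name as placeholders (grab the actual bundle from the Fling page):

    # Hedged sketch; host name and bundle path/file name are placeholders.
    $vmhost = Get-VMHost 'esxi-parent-01.lab.local'
    $esxcli = Get-EsxCli -VMHost $vmhost -V2
    $installArgs = $esxcli.software.vib.install.CreateArgs()
    $installArgs.depot = '/vmfs/volumes/datastore1/esx-learnswitch-bundle.zip'
    # Depending on the bundle's acceptance level you may need to relax the host acceptance level.
    $esxcli.software.vib.install.Invoke($installArgs)
    # Reboot the host afterwards to load the Learnswitch module.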

As per the instructions on William Lam’s post or the Fling page, you then need to configure and run a Python script to enable the Learnswitch against the NestedESXi portgroups that have promiscuous mode enabled.

From there the impact of the module is immediate and you can see a normalization of network traffic hitting the interfaces of each NestedESXi host. When running the performance test the ESXTOP output is significantly different to what you see if the module is not loaded as shown below.

You also have access to a new command that lists out stats of the Learnswitch, showing packet and port statistics as well as the current MAC address table.

In terms of what it looks like from a performance point of view, below are the results of all Crystal Disk Mark tests. The bottom two represent the dvFilter (left) and the Learnswitch (right).

And finally, to look at the improvement in CPU performance with the modules installed, you can see below a timeline showing the performance tests run at different times across the last 24 hours…again there is a significant improvement when looking at the graphs on the left hand side (the testing without any module), moving across to the dvFilter test and then the Learnswitch test on the right hand side. It does seem like the Learnswitch is a little lighter on CPU, but I can’t be 100% sure with my limited testing.

Conclusion:

As expected there isn’t a huge difference in performance between the two modules, but the features of the Learnswitch certainly make it the new preferred choice of the two if the requirements are met. Again, the main advantages of the Learnswitch over the dvFilter make it a must have addition to any NestedESXi environment. If you haven’t installed either yet…get onto it!

Veeam Vault #7: Nutanix Support?!, Backup for Office365 1.5 BETA, VeeamON Forums plus Vanguard Roundup

It’s been just over two months since my last Veeam Vault went out, and can you believe that was just before VeeamON 2017 in New Orleans. Again, for a recap of what was announced at VeeamON check out my wrap up post here…two months on and we haven’t stopped here at Veeam. As soon as VeeamON was done and dusted, focus turned to EMEA SE training in Warsaw, which my whole team attended and where the group got an extended look at the new features coming in v10. Since then, I’ve had a good stretch at home where I’ve been preparing for a series of webinars, but mainly focused on the upcoming VeeamON Forums happening around the APAC region.

I’ll be presenting sessions at all events and will be on stage with Clint Wyckoff for the Sydney and Auckland keynotes, where our co-CEO Peter McKay and VP of the Global Cloud Group Paul Mattes will be headlining. There are other events happening in Asia, so please register here; if you are able to attend any of those cities it would be great to get you down to learn about all that’s happening with Veeam as we move into the second half of the year and into next year.

Nutanix AHV Announcement:

At Nutanix’s .NEXT conference we announced the intent to support the Acropolis Hypervisor (AHV) by year’s end and also became the Premier Availability solution for supported Nutanix virtualized environments. I’ll be honest and say that this took a lot of us by surprise…and probably most Nutanix employees as well. However it shows our commitment to providing availability for the modern enterprise…which Nutanix is also pushing hard into.

Backup for Office365 1.5 BETA:

Last week we released the first beta of Backup for Office 365 1.5, which is a significant release for our VCSP community as it introduces multi-tenancy and an advanced API for automation. If you are a VCSP, take some time to download the beta and put the new features to work…there is a significant opportunity to offer backup services for Office 365 that can now scale.

Version 1.5 Enhancements:

  • A multi-repository, multi-tenant architecture enabling protection of larger Office 365 deployments with a single installation. Also empowering service providers to deliver Office 365 backup services.
  • Automation possibilities via RESTful API and PowerShell SDK to minimize management overhead, improve recovery times and reduce costs

https://go.veeam.com/beta-backup-office-365

Update 1 for Veeam Agent for Linux 1.0:

Last month we released Update 1 for Veeam Agent for Linux, so the next time you update the software from your Linux update repositories you will get the update. While this is for the most part a bug-fix release, we still included file indexing for 1-Click file recovery through Veeam Enterprise Manager, the ability to add storage and network drivers to the recovery media from the Linux OS, and the addition of an SSH server to the recovery media. There is also added support for ExaGrid and general wizard improvements.

https://www.veeam.com/kb2290

Veeam Vanguard Blog Post Roundup:

Quick Fix – Unable to Upgrade Distributed Switch After vCenter Upgrade

This week I upgraded (and migrated) my SliemaLabs NestedESXi vCenter from a Windows 6.0 server to the 6.5 VCSA…everything went well, but I ran into an issue when I went to upgrade my distributed switch to 6.5.0. Even though everything appeared to be working with regards to the host and VM networking associated with the switch, when I went to upgrade it I got the following error:

A quick Google for Unable to retrieve data about the distributed switch came up with nothing, and clicking on next didn’t do anything actionable. A restart of the Web Client and a reboot of the VCSA didn’t resolve the issue either. The distributed switch in question was still on version 5.5, as I forgot to upgrade it to 6.0 during the upgrade to vCenter 6.0. Whether that condition somehow caused the error I am not sure…regardless, the quick fix, or better said workaround, is pretty simple: use PowerCLI.

Interestingly the Vendor is different…though I’m not sure this caused the issue. In any case the workaround is to upgrade the distributed switch using the Set-VDSwitch command.
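
For reference, this is roughly what that looks like…the switch name is a placeholder for your own distributed switch:

    # Check the current version, then upgrade the distributed switch with PowerCLI.
    Get-VDSwitch -Name 'SliemaLabs-DSwitch' | Select-Object Name, Version
    Get-VDSwitch -Name 'SliemaLabs-DSwitch' | Set-VDSwitch -Version '6.5.0'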

And success!

I’m not sure what caused the error to appear in the Web Client, but the workaround meant that it became a moot point. Suffice to say, if you come across this error in your Web Client when trying to upgrade a distributed switch…head over to PowerCLI.

 

migrate2vcsa – Migrating vCenter 6.0 to 6.5 VCSA

Over the past few years I’ve written a couple of articles on upgrading vCenter from 5.5 to 6.0: first an in-place upgrade of the 5.5 VCSA to 6.0, and more recently an in-place upgrade of a Windows 5.5 vCenter to 6.0. This week I upgraded and migrated my NestedESXi SliemaLab vCenter using the migrate2vcsa tool that’s now bundled into the vCenter 6.5 ISO. The process worked the first time and, even though I held some doubts about the migration working without issue, my Windows vCenter is now in retirement.

The migration tool that’s part of vSphere 6.5 was actually first released as a VMware Fling after it was put forward as an idea in 2013. It then officially went GA with the release of vSphere 6.0 Update 2m…where the m stood for migration. Over its development it has been championed by William Lam, who has written a number of articles on his blog, and more recently Emad Younis has been the technical marketing lead on the product as it was enhanced for vSphere 6.5.

Upgrade Options:

You basically have two options to upgrade a Windows based 6.0 vCenter: an in-place upgrade to a Windows based vCenter 6.5, or a migration to the VCSA 6.5 using the migration tool.

My approach for this particular environment was to ensure a smooth upgrade to vSphere 6.0 Update 2 and then look to upgrade again to 6.5 once it had matured in the market. The cautious approach will still be undertaken by many, and a stepped upgrade to 6.5 with a migration to the VCSA will still be commonplace. For those that wish to move away from their Windows vCenter there is now a very reliable #migrate2vcsa path…as a side note, it is possible to migrate directly from 5.5 to 6.5.

Existing Component Versions:

  • vCenter 6.0 (4541947)
    • NSX Registered
    • vCloud Director Registered
    • vCO Registered
  • ESXi 6.0 (3620759)
  • Windows 2008 (RTM)
  • SQL Server 2008 R2 (10.50.6000.34)

All vCenter components were installed on the Windows vCenter instance, including Update Manager. There were also a number of external services registered against the vCenter, of which the NSX Manager needed to be re-registered with SSO so it would allow/trust the new SSL certificate thumbprint. This is common, and one to look out for after migration.

Migration Process:

I’m not going to go through the whole process as it’s been blogged about a number of times, but in a nutshell you need to:

  • Take a backup of your existing Windows vCenter
  • I took a snapshot as well before I began the process (see the quick PowerCLI example after this list)
  • Download the vCenter Server Appliance 6.5 ISO and mount the ISO
  • Copy the migration-assistant folder to the Windows vCenter
  • Start the migration-assistant tool and work through the pre-checks
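
The backup and snapshot steps are easy to make repeatable…a minimal hedged example where the VM name is a placeholder for the Windows vCenter VM:

    # Hedged sketch; the VM name is a placeholder for the Windows vCenter VM.
    New-Snapshot -VM (Get-VM 'win-vcenter60') -Name 'pre-migrate2vcsa' -Memory:$false -Quiesce:$false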

If all checks complete successfully the migration assistant will finish at waiting for migration to start. From here you start the VCSA 6.5 installer and click on the Migrate menu option.

Work through the wizard which asks you for detail on the source and target servers, lets you select the compute, storage and appliance size as well as the networking settings. Once everything is entered we are ready to start Stage 1 of the process.

When Stage 1 finishes you are taken to Stage 2, where it asks you to select the migration data as shown below. This gives you some idea of how much storage you will need and what the initial footprint will be over and above the actual VCSA VM storage.

There are a couple more steps the migration assistant goes through to complete the process…which for me took about 45 minutes, but this will vary depending on the amount of data you want to transfer across.

If there are any issues, or if the migration fails at any of the steps, you have the option to power down/remove the new VCSA and power the old Windows vCenter back on as-is. The old Windows vCenter is shut down by the migration process just as the copying of the key data finishes and the VCSA is rebooted with the network settings and machine name copied across. There is a proper series of rollback steps listed in this VMwareKB.

The only external service that I needed to re-register against vCenter was NSX. vCloud Director carried on without issue, but it’s worth checking out all registered services just in case.

Conclusion and Thoughts:

As mentioned at the start, I was a bit skeptical that this process would work as flawlessly as it did…and on its first attempt! It’s almost a little disappointing to have this as automated and hands off as it is, but it’s a testament to the engineering effort the team at VMware has put into this tool to make it a very viable and reliable way to remove dependencies on Windows and MSSQL. It also allows those with older versions of Windows that are well past their use-by date to migrate to the VCSA with absolute confidence.

References:

http://www.virtuallyghetto.com/page/2?s=migrate2vcsa

https://github.com/younise/migrate2vcsa-resources
