
NSX Bytes: Important Bug in 6.2.4 to be Aware of

[UPDATE] In light of this post being quoted on The Register I wanted to clarify a couple of things. First off, as mentioned there is a fix for this issue (the KB should be rewritten to clearly state that) and secondly, if you read below, you will see that I did not state that just about anyone running NSX-v 6.2.4 will be impacted. Greenfield deployments are not impacted.

Here we go again…I thought maybe we were over these, but it looks like NSX-v 6.2.4 contains a fairly serious bug impacting VMs after vMotion operations. I had intended to write about this earlier in the week when I first became aware of the issue, however the last couple of days have gotten away from me. That said, please be aware of this issue as it will impact those who have upgraded NSX-v from 6.1.x to 6.2.4.

As the KB states, the issue appears if you have the Distributed Firewall enabled (it’s enabled and inline by default) and you have upgraded NSX-v from 6.1.x to 6.2.3 or above, though for most this will apply to 6.2.4 upgrades given all the issues in 6.2.3. If VMs are migrated between upgraded hosts they will lose network connectivity and require a reboot to bring connectivity back.
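
To make those conditions concrete, here is a minimal sketch of a check you could run against your own upgrade records. It is not from the KB: the function names are my own, it assumes purely numeric version strings (so no "6.2.3a"), and it knows nothing about which later release carries the fix.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.2.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def may_hit_vmotion_bug(upgraded_from, running, dfw_enabled=True):
    """Return True when an environment matches the conditions described above.

    upgraded_from: the NSX-v version the site was upgraded from,
                   or None for a greenfield deployment.
    running:       the NSX-v version running now.
    dfw_enabled:   whether the Distributed Firewall is enabled (the default).
    """
    if upgraded_from is None or not dfw_enabled:
        # Greenfield deployments are not impacted, and the issue
        # only appears with the Distributed Firewall enabled.
        return False
    came_from_61x = parse_version(upgraded_from)[:2] == (6, 1)
    now_623_or_later = parse_version(running) >= (6, 2, 3)
    return came_from_61x and now_623_or_later
```

A site that went from 6.1.4 to 6.2.4 is flagged, while a greenfield 6.2.4 install (upgraded_from=None) is not.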

If you check the vmkernel.log file you will see entries similar to those below.

Cause

This issue occurs when the VSIP module at the kernel level does not handle the export_version deployed in NSX for vSphere 6.1.x correctly during the upgrade process.

There is no current resolution to the issue apart from the VM reboot but there is a workaround in the form of a script that can be obtained via GSS if you reference KB2146171. Hopefully there will be a proper fix in future NSX releases.

<RANT>

I can’t believe something as serious as this was missed by QA for what is VMware’s flagship product. It’s beyond me that this sort of error wasn’t picked up in testing before it was released. It’s simply not good enough that a major release goes out with this sort of bug and I don’t know how it keeps on happening. This one specifically impacted customers, and for service providers or enterprises that upgraded in good faith it puts egg on the faces of those who approve, update and execute the upgrades, which results in unhappy customers or internal users.

Most organisations can’t fully replicate production situations when testing upgrades due to lack of resources or lack of real world situation testing…VMware could and should have the resources to stop these bugs leaking into release builds. For now, if possible I would suggest that people add more stringent vMotion tests as part of NSX-v lab testing before promoting into production moving forward.

VMware customers shouldn’t have to be the ones discovering these bugs!

</RANT>

[UPDATE] While I am obviously not happy about this issue coming in the wake of previous issues, I still believe in NSX and would recommend all shops looking to automate networking still have faith in what the platform offers. Bugs will happen…I get that, but I know in the long run there is huge benefit in running NSX.

References:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2146171

NSX Bytes: Updated – NSX Edge Feature and Performance Matrix

A question came up today around throughput numbers for an NSX Edge Services Gateway and that jogged my memory back to a previous blog post where I compared features and performance metrics between vShield Edges and NSX Edges. In the original post I had left out some key metrics, specifically around firewall and load balance throughput so thought it was time for an update. Thanks to a couple of people in the vExpert NSX Slack Channel I was able to fill some gaps and update the tables below.

A reminder that VMware has announced the End of Availability (“EOA”) of VMware vCloud Networking and Security 5.5.x, which kicked in on September 19, 2016, and that vCloud Director 8.10 no longer supports vShield Edges…hence why I have removed the VSE from the tables.

As a refresher…what is an Edge device?

The Edge Services Gateway (NSX-v) connects isolated, stub networks to shared (uplink) networks by providing common gateway services such as DHCP, VPN, NAT, dynamic routing, and Load Balancing. Common deployments of Edges include in the DMZ, VPN Extranets, and multi-tenant Cloud environments where the Edge creates virtual boundaries for each tenant.

Below is a list of services provided by the NSX Edge.

| Service | Description |
| --- | --- |
| Firewall | Supported rules include IP 5-tuple configuration with IP and port ranges for stateful inspection for all protocols |
| NAT | Separate controls for Source and Destination IP addresses, as well as port translation |
| DHCP | Configuration of IP pools, gateways, DNS servers, and search domains |
| Site to Site VPN | Uses standardized IPsec protocol settings to interoperate with all major VPN vendors |
| SSL VPN | SSL VPN-Plus enables remote users to connect securely to private networks behind a NSX Edge gateway |
| Load Balancing | Simple and dynamically configurable virtual IP addresses and server groups |
| High Availability | High availability ensures an active NSX Edge on the network in case the primary NSX Edge virtual machine is unavailable |
| Syslog | Syslog export for all services to remote servers |
| L2 VPN | Provides the ability to stretch your L2 network. |
| Dynamic Routing | Provides the necessary forwarding information between layer 2 broadcast domains, thereby allowing you to decrease layer 2 broadcast domains and improve network efficiency and scale. Provides North-South connectivity, thereby enabling tenants to access public networks. |

Below is a table that shows the different sizes of each edge appliance and what (if any) impact that has to the performance of each service. As a disclaimer the below numbers have been cherry picked from different sources and are subject to change…I’ll keep them as up to date as possible.

| | NSX Edge (Compact) | NSX Edge (Large) | NSX Edge (Quad-Large) | NSX Edge (X-Large) |
| --- | --- | --- | --- | --- |
| vCPU | 1 | 2 | 4 | 6 |
| Memory | 512MB | 1GB | 1GB | 8GB |
| Disk | 512MB | 512MB | 512MB | 4.5GB |
| Interfaces | 10 | 10 | 10 | 10 |
| Sub Interfaces (Trunk) | 200 | 200 | 200 | 200 |
| NAT Rules | 2000 | 2000 | 2000 | 2000 |
| FW Rules | 2000 | 2000 | 2000 | 2000 |
| FW Performance | 3Gbps | 9.7Gbps | 9.7Gbps | 9.7Gbps |
| DHCP Pools | 25 | 25 | 25 | 25 |
| Static Routes | 2048 | 2048 | 2048 | 2048 |
| LB Pools | 64 | 64 | 64 | 64 |
| LB Virtual Servers | 64 | 64 | 64 | 64 |
| LB Server / Pool | 32 | 32 | 32 | 32 |
| IPSec Tunnels | 512 | 1600 | 4096 | 6000 |
| SSLVPN Tunnels | 50 | 100 | 100 | 1000 |
| Concurrent Sessions | 64,000 | 1,000,000 | 1,000,000 | 1,000,000 |
| Sessions/Second | 8,000 | 50,000 | 50,000 | 50,000 |
| LB Throughput (L7 Proxy) | | 2.2Gbps | 2.2Gbps | 3Gbps |
| LB Throughput (L4 Mode) | | 6Gbps | 6Gbps | 6Gbps |
| LB Connections/s (L7 Proxy) | | 46,000 | 50,000 | 50,000 |
| LB Concurrent Connections (L7 Proxy) | | 8,000 | 60,000 | 60,000 |
| LB Connections/s (L4 Mode) | | 50,000 | 50,000 | 50,000 |
| LB Concurrent Connections (L4 Mode) | | 600,000 | 1,000,000 | 1,000,000 |
| BGP Routes | 20,000 | 50,000 | 250,000 | 250,000 |
| BGP Neighbors | 10 | 20 | 50 | 50 |
| BGP Routes Redistributed | No Limit | No Limit | No Limit | No Limit |
| OSPF Routes | 20,000 | 50,000 | 100,000 | 100,000 |
| OSPF Adjacencies | 10 | 20 | 40 | 40 |
| OSPF Routes Redistributed | 2000 | 5000 | 20,000 | 20,000 |
| Total Routes | 20,000 | 50,000 | 250,000 | 250,000 |

Of interest from the above table it doesn’t list any Load Balancing performance number for the NSX Compact Edge…take that to mean that if you want to do any sort of load balancing you will need NSX Large and above. To finish up, below is a table describing each NSX Edge size use case.

| | Use Case |
| --- | --- |
| NSX Edge (Compact) | Small Deployment, POCs and single service use |
| NSX Edge (Large) | Small/Medium DC or multi-tenant |
| NSX Edge (Quad-Large) | High Throughput ECMP or High Performance Firewall |
| NSX Edge (X-Large) | L7 Load Balancing, Dedicated Core |
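
If you want to turn the matrix into something scriptable, the sketch below shows one way to pick the smallest Edge size for a requirement. It only carries the vCPU count, the IPSec tunnel limits and whether load balancing figures are listed, all copied from the table above; the structure and function name are my own invention, and the numbers are subject to the same disclaimer as the table.

```python
# Values copied from the matrix above, ordered smallest to largest.
# "lb_listed" records whether the matrix lists load balancing figures for that size.
EDGE_SIZES = [
    ("Compact", {"vcpu": 1, "ipsec_tunnels": 512, "lb_listed": False}),
    ("Large", {"vcpu": 2, "ipsec_tunnels": 1600, "lb_listed": True}),
    ("Quad-Large", {"vcpu": 4, "ipsec_tunnels": 4096, "lb_listed": True}),
    ("X-Large", {"vcpu": 6, "ipsec_tunnels": 6000, "lb_listed": True}),
]


def smallest_edge_for(ipsec_tunnels=0, needs_load_balancing=False):
    """Return the name of the smallest Edge size that fits, or None if none does."""
    for name, limits in EDGE_SIZES:
        if ipsec_tunnels > limits["ipsec_tunnels"]:
            continue  # too many tunnels for this size
        if needs_load_balancing and not limits["lb_listed"]:
            continue  # no load balancing figures listed for this size
        return name
    return None
```

For example, smallest_edge_for(needs_load_balancing=True) skips the Compact size and returns "Large".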

References:

https://www.vmware.com/files/pdf/products/nsx/vmw-nsx-network-virtualization-design-guide.pdf

https://pubs.vmware.com/NSX-6/index.jsp#com.vmware.nsx.admin.doc/GUID-3F96DECE-33FB-43EE-88D7-124A730830A4.html

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2042799

NSX Bytes: vCloud Director Can’t Deploy NSX Edges

Over the weekend I was tasked with the recovery of a #NestedESXi lab that had vCloud Director and NSX-v components as part of the lab platform. Rather than being a straightforward restore from the Veeam backup I also needed to downgrade the NSX-v version from 6.2.4 to 6.1.4 for testing purposes. That process was relatively straightforward and involved essentially working backwards in terms of installing and configuring NSX and removing all the components from vCenter and the ESXi hosts.

To complete the NSX-v downgrade I deployed a new 6.1.4 appliance and connected it back up to vCenter, configured the hosts, set up VXLAN and the transport components, and tested NSX Edge deployments through the vCenter Web Client. However, when it came time to test Edge deployments from vCloud Director I kept getting the error shown below.

Checking through the NSX Manager logs there was no reference to any API call hitting the endpoint as is suggested by the error detail above. Moving over to the vCloud Director Cells I was able to trace the error message in the log folder…eventually seeing the error generated below in the vcloud-container-info.log file.

As a test I hit the API endpoint referenced in the error message from a browser and got the same result.

This got me thinking that the error was either DNS related or permission related. After confirming that the vCloud Cells were resolving the NSX Manager host name correctly, I looked at permissions as the cause of the 403 error, as suggested by the error itself. vCloud Director was configured to use the service.vcloud service account to connect to the previous NSX/vShield Manager and it dawned on me that I hadn’t set up user rights in the Web Client under Networking & Security. Under the Users section of the Manage Tab the service account used by vCloud Director wasn’t configured and needed to be added. After adding the user I retried the vCD job and the Edge deployed successfully.

While I was in this menu I thought I’d test what level of NSX User that service account needs in order to execute operations against vCloud Director and NSX. As shown below anything but NSX or Enterprise Administrator triggered a “VSM response error (254). User is not authorized to access object” error.

At the very least to deploy edges, you require the service account to be NSX Administrator…The Auditor and Security Administrator levels are not enough to perform the operations required. More importantly don’t forget to add the service account as configured in vCloud Director to the NSX Manager instance otherwise you won’t be able to have vCloud Director deploy edges using NSX-v.
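
As a quick way to capture that finding in a pre-flight check, here is a tiny sketch. The role names are the ones used in this post and the outcomes are simply what I observed in the test above; the function and constant names are hypothetical, and nothing here talks to NSX Manager.

```python
# Roles that allowed vCloud Director to deploy Edges in the test above.
ROLES_ABLE_TO_DEPLOY_EDGES = {"NSX Administrator", "Enterprise Administrator"}


def can_vcd_deploy_edges(assigned_role):
    """Return True if the service account's NSX role was sufficient in the test.

    assigned_role is None when the service account has not been added
    to the NSX Manager at all, which was the original cause of the 403.
    """
    return assigned_role in ROLES_ABLE_TO_DEPLOY_EDGES
```

Passing "Auditor", "Security Administrator" or None returns False, which lines up with the "User is not authorized to access object" error.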

 

 

NSX Bytes: NSX-v 6.2.4 Released …Important Upgrade!

NSX-v 6.2.4 was released the week before VMworld US so might have gotten somewhat lost in the VMworld noise…For those that were fortunate enough to not upgrade to or deploy a greenfield 6.2.3 site you can now safely do so without the nasty bugs that existed in the 6.2.3 build. In a nutshell this new build delivers all the significant features and enhancements announced in 6.2.3 without the dFW or Edge Gateway bugs that forced the build being pulled from distribution a few weeks back.

In terms of how and when to upgrade from previous versions the following table gives a great overview of the pathways required to get to 6.2.4.

The take away from the table above is that if possible you need to get onto NSX-v 6.2.4 as soon as possible and with good reason:

  • VMware NSX 6.2.4 provides critical bug fixes identified in NSX 6.2.3, and 6.2.4 delivers a security patch for CVE-2016-2079 which is a critical input validation vulnerability for sites that use NSX SSL VPN.
  • For customers who use SSL VPN, VMware strongly recommends a review of CVE-2016-2079 and an upgrade to NSX 6.2.4.
  • For customers who have installed NSX 6.2.3 or 6.2.3a, VMware recommends installing NSX 6.2.4 to address critical bug fixes.

Prior to this release if you had upgraded to NSX-v 6.1.7 you were stuck and not able to upgrade to 6.2.3. The Upgrade matrix is now reporting that you can upgrade 6.1.7 to 6.2.4 as shown below.

I was able to validate this in my lab going from 6.1.7 to 6.2.4 without any issues.

NSX-v 6.1.4 is also fully supported by vCloud Director SP 8.0.1 and 8.10

References:

http://pubs.vmware.com/Release_Notes/en/nsx/6.2.4/releasenotes_nsx_vsphere_624.html

http://www.theregister.co.uk/2016/07/22/please_dont_upgrade_nsx_just_now_says_vmware/

NSX Bytes: 6.1.x General Support Extended and 6.2.3 Edge Upgrade Issues

A while ago VMware announced that NSX-v 6.1.x general support would come to an end this October to pave the way for current 6.1.x users to upgrade to 6.2.x. A problem has arisen in that people who patched NSX-v to the latest patch release, 6.1.7, to cover a security vulnerability are left unable to upgrade to 6.2.3, which covers the same vulnerability in the 6.2.x release.

NSX Bytes: Critical Update for NSX-v and vCNS

As of June 9, 2016 with the release of NSX for vSphere 6.1.7, the EOGS date has been extended by 3 months, to January 15th, 2017. This is to allow customers to have time to upgrade from NSX for vSphere 6.1.7, which contains an important security patch improving input validation of the system, to the latest 6.2.x release. For recommended upgrade paths, refer to the latest NSX for vSphere 6.2.

It’s not the first time that current releases of NSX-v have blocked upgrades to future releases. In this case NSX-v 6.2.3 also includes this security patch and, along with 6.2.2, remains the suggested release for NSX-v. To repeat, upgrades from NSX 6.1.7 to 6.2.3 are not supported. Once VMware releases a patch version beyond 6.1.7, upgrading to 6.2.x will be possible. That said it’s great of VMware to extend the end of support by three months to give themselves time to get the patch out.

6.2.3 ESG Catch-22:

For those that can upgrade to NSX-v 6.2.3 there is a current issue where existing edges can become unmanageable after the NSX upgrade. This issue occurs when the load balancer is configured for serverSsl or clientSsl but the ciphers value is set as NULL in the previous version. NSX-v 6.2.3 introduces a new approved cipher list in NSX Manager and does not allow the ciphers to be NULL when configuring the load balancer…as was the previous default option.

Since the ciphers value defaults to NULL in the earlier version, if it has not been set explicitly NSX Manager 6.2.3 considers the ciphers value invalid and the Edges in turn become unmanageable. There should be a fix coming and there is a workaround as described in the VMware KB.
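
Before upgrading, it can be worth scanning your load balancer configuration for profiles that would trip this check. The sketch below is illustrative only: the serverSsl, clientSsl and ciphers names come from the issue description, but the surrounding XML layout (loadBalancer, applicationProfile, name) and the sample data are my assumptions, so adjust the element names to whatever your Edge configuration export actually looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a load balancer configuration export.
SAMPLE_LB_CONFIG = """
<loadBalancer>
  <applicationProfile>
    <name>web-https</name>
    <clientSsl><ciphers>DEFAULT</ciphers></clientSsl>
  </applicationProfile>
  <applicationProfile>
    <name>app-https</name>
    <serverSsl></serverSsl>
  </applicationProfile>
  <applicationProfile>
    <name>legacy-https</name>
    <clientSsl><ciphers>NULL</ciphers></clientSsl>
  </applicationProfile>
  <applicationProfile>
    <name>plain-http</name>
  </applicationProfile>
</loadBalancer>
"""


def profiles_with_null_ciphers(lb_config_xml):
    """Return (profile name, ssl section) pairs whose ciphers value is unset or NULL."""
    root = ET.fromstring(lb_config_xml)
    flagged = []
    for profile in root.iter("applicationProfile"):
        name = profile.findtext("name", default="(unnamed)")
        for ssl_tag in ("clientSsl", "serverSsl"):
            ssl = profile.find(ssl_tag)
            if ssl is None:
                continue  # SSL is not configured on this side of the profile
            ciphers = (ssl.findtext("ciphers") or "").strip()
            if ciphers == "" or ciphers.upper() == "NULL":
                flagged.append((name, ssl_tag))
    return flagged
```

Running profiles_with_null_ciphers(SAMPLE_LB_CONFIG) flags the app-https and legacy-https profiles and leaves the other two alone.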

 


NSX Bytes: NSX 6.2.3 and vShield Endpoint Clarification

NSX-v 6.2.3 has been out for a couple of weeks now and besides the new features and bug fixes there was a significant change to the licensing structure for NSX. Previously there really wasn’t any concept of NSX editions…however 6.2.3 introduced four new tiers. As was announced in early May, NSX-v comes in Standard, Advanced and Enterprise editions. At the time there was still no public mention of what was to happen to existing vCloud Networking and Security customers utilizing vShield Endpoint…more so given that vCNS is to be end of lifed in September.

Looking through the release notes for NSX-v 6.2.3 there is a section that talks about the licensing and in addition to the three editions there is a default license which allows use of the vShield Endpoint feature…which is called Guest Introspection under NSX.

Change in default license & evaluation key distribution: default license upon install is “NSX for vShield Endpoint”, which enables use of NSX for deploying and managing vShield Endpoint for anti-virus offload capability only. Evaluation license keys can be requested through VMware sales.

Everyone who is entitled to the vSphere or vCloud suites will now download NSX instead of vCNS. Your use case will dictate which license you decide to apply, and therefore which features of NSX are unlocked…People will truly be running NSX everywhere…remembering that as of the current 6.1.x and 6.2.x releases the NSX Manager is a beefed up version of the vShield Manager. The good news is that people who are running vShield Endpoint services for Antivirus and other guest introspection tasks will be able to manage this through the Web Client.

In terms of what NSX parts need installing/upgrading from the vCNS bits, you only need to perform a Host Preparation and Guest Introspection install. There is no need to run NSX Controllers or configure VXLAN in order to run Endpoint services…if you want to be able to run those NSX features you will need to request specific NSX edition keys to suit your requirements.

For a complete rundown on NSX-v Licensing Edition features click here.

References:

http://pubs.vmware.com/Release_Notes/en/nsx/6.2.3/releasenotes_nsx_vsphere_623.html

NSX Bytes: Trend Deep Security 9.6 DSVA Deployment Gotchya

This week I’ve been working with Trend Deep Security 9.6 to get a Proof of Concept up and running to protect some internal management virtual machines with Trend’s agentless protection feature. Trend now integrates with NSX, and in an NSX-enabled environment the Deep Security Virtual Appliance (DSVA) provides Anti-Malware, Integrity Monitoring, Web Reputation Service, Firewall, and Intrusion Prevention for your virtual machines, without requiring an Agent.

After following the Install Guide, I installed the Deep Security Manager and connected the vCenter and NSX Managers through the DSM Web Console, then installed the NSX Guest Introspection ESX Agents under Service Deployments. When I got to the part where you deploy the Trend Micro Deep Security Service from the same location in the Web Client, I got the following error.

Checking the Service Definitions menu under Trend Micro I saw that the Deployment settings looked correct as per the install guide. Heading to the URL provided I got an error from the DSM saying that there was a database error and the file was not found…matching the error above.

After a little digging I checked to see what was listed in the DSM Local Software repository and couldn’t see the ESX Agent in the list. This needs to be imported first before you can use the Service Deployment section to deploy the Trend Micro DSVAs (Download Link). Under > Updates > Software > Local, click Import. Once imported you should see the following.

Once that has been done you can click on the Resolve Button in the System Alarm window of the NSX Service Deployment section and the appliances will be deployed as version 9.5 as shown below.

Important Note:
EDIT: Trend has responded in the comments:

As mentioned on the download page:

If you are implementing Agentless protection, install the 9.5 version of the DSVA and import the Agent Software for Red Hat Enterprise Linux 6 64-bit package. Afterward, the DSVA will be able to upgrade to the version 9.6…but DONT! 

Upgrade Notice: Version 9.6 of the DSVA is limited to providing Anti-Malware and Integrity Monitoring protection for your virtual machines. If you need pure Agentless protection with Anti-Malware, Firewall, Intrusion Prevention and Integrity Monitoring, do not activate the Deep Security Agent on the VMs and do not upgrade your DSVA to 9.6.

So if you want that agentless protection for all Trend Deep Security features as listed above do not upgrade to the 9.6 version of the DSVA. I’m not sure why this is the case, but I will chase this up and update this post when I know more.

References:

http://docs.trendmicro.com/all/ent/ds/v9.6/en-us/Deep_Security_96_Install_Guide_nsx_EN.pdf

http://downloadcenter.trendmicro.com/index.php?regs=NABU&clk=latest&clkval=4856&lang_loc=1

NSX Bytes: Critical Update for NSX-v and vCNS

I generally don’t post about security releases but after going through the notes on CVE-2016-2079 I thought it was important enough to dedicate a post to. Mainly because it could impact those running NSX Edge Services Gateways or vShield Edges with the SSL-VPN service enabled for clients.

Most vCloud Director based instances won’t have the SSL-VPN enabled due to it not being exposed through the vCD UI however some Service Providers may offer this as a managed service as it’s one of the strongest features of the Edge Gateways. The issue detailed in the CVE is summarized below.

VMware NSX and vCNS with SSL-VPN enabled contain a critical input validation vulnerability. This issue may allow a remote attacker to gain access to sensitive information.

In a nutshell you need to upgrade an existing version of NSX-v or vCNS to the version below. As per usual if you have the entitlements go ahead and download the updates from the links below.

  • NSX Edge: 6.2 -> 6.2.3
  • NSX Edge: 6.1 -> 6.1.7
  • vCNS Edge: 5.5 -> 5.5.4.3
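
If you track Edge versions in an inventory, the list above is easy to turn into a check. This is a rough sketch with names I made up; it assumes purely numeric version strings and only knows about the three branches listed above.

```python
# First fixed version per product and branch, copied from the list above.
FIXED_VERSIONS = {
    ("NSX Edge", "6.2"): "6.2.3",
    ("NSX Edge", "6.1"): "6.1.7",
    ("vCNS Edge", "5.5"): "5.5.4.3",
}


def as_tuple(version):
    """Turn '5.5.4.3' into (5, 5, 4, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(product, version):
    """Return True if the version is older than the fixed one for its branch.

    Returns None when the branch is not in the list above, because the
    list says nothing about it.
    """
    branch = ".".join(version.split(".")[:2])
    fixed = FIXED_VERSIONS.get((product, branch))
    if fixed is None:
        return None
    return as_tuple(version) < as_tuple(fixed)
```

So needs_upgrade("NSX Edge", "6.2.2") is True, while the same call with "6.2.3" is False.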

NSX-v  Downloads: https://www.vmware.com/go/download-nsx-vsphere

vCNS Downloads: https://www.vmware.com/go/download-vcd-ns

References:

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-2079

NSX Bytes: Friends Don’t Let Friends Delete The VTEP PortGroup

Last week I posted a tweet saying “Friends don’t let friends delete the NSX-v VTEP PortGroup” and as most of us do in our industry we learn by doing and I found out the hard way that you shouldn’t mess with the PortGroup created during the Host Preparation of the NSX setup and configuration stage. This PortGroup is used by the Hosts in an NSX Enabled Cluster for the VMKernel Interfaces that are the VTEPs or VXLAN Tunnel End Points.

In a production environment this action is actually near on impossible to do because you can’t delete a PortGroup when it’s in use. Where I found myself in this situation was in trying to clone off a lab environment and restore components of the existing lab into a new lab with new hosts. With that the following is something that could be handy in lab environments.

Once the new hosts had been prepared I went to configure VXLAN against the cluster, which creates a new VMKernel Interface on each host and assigns it a VTEP address from DHCP or from a pre-configured IP Pool, but got an error. When I looked at the event logs in vCenter I saw the following error.

DVPortGroup dvportgroup-148806 couldnot be found
 The object or item referred to could not be found

Instantly I remembered that I had “cleaned up” the cloned vCenter configuration and removed any surplus PortGroups…in doing so I deleted the PortGroup NSX was referencing. I tried to recreate the PortGroup with the same name but it was clear that the configuration was referencing the MOID of the PortGroup and asking vCenter to use that to complete the job. Even an export/import of the Distributed Switch configuration from the original vCenter didn’t do the trick as the import increments the MOID already contained in the vCenter Database.

GSS Support Fix:

Thinking back to previous NSX related cases I’ve raised with VMware support I knew that the NSX Manager Database kept a very simple structure of vCenter objects and I guessed that some backend SQL search and replace could do the trick. After raising a case I had the guys in GSS enter the NSX Manager backend, which can only be accessed with a secret VMware password, and search for the table that referenced the MOID of the PortGroup. As can be seen below the fix is simple if you know the MOID of the old and the new PortGroup.

Note: Only VMware Support can action this fix.

With that modification committed I was able to configure the VTEPs for the new hosts and continue to rebuild the cloned instance. So if you ever get yourself in a situation where you have managed to do as I have done…there is a fix that can be done to avoid a complete start from scratch scenario.

NSX Bytes: Controller Deployment Gone Bad?

With NSX becoming more and more widely available there are more NSX home labs being stood up and with that the chances of the NSX Controllers failing due to “Home Lab” nested issues become more prevalent. The NSX Controllers are Ubuntu Linux VMs and like any Linux VM are fairly sensitive to storage latency and other issues that appear in #NestedESXi or lab environments.

In one of my labs I came across an issue where I needed to redeploy all the NSX Controllers due to the VMs effectively breaking due to the storage being ripped out from under them…however when I went to redeploy the latency of the underlying nested storage was still not that great and the deployment got stuck in a loop as shown below.

No matter what I tried…vCenter restart, NSX Manager Reboot or Host Reboot the end result was the status remaining in the spinning state. If I tried to deploy another controller I would get the following error.

Controller IP address allocation failed for reason : cluster already contains controller of IP x.x.x.x

In my case the VM existed with the IP address configured against the VM however I could not access the cli to check NSX Cluster Status due to the fact the VM was in a pretty bad way.

Taking a look at the IP Pool allocations…even though the error said that the IP was in use, it wasn’t listed as such…meaning it was trying to use the first IP in the pool regardless.

Before going into the fix, it should be noted that if this scenario was to happen and you were down to your last controller in production, you would be best served to call up VMware Support and work through the restore options, as without any controllers your VXLAN Unicast traffic isn’t going to be updated via the VTEPs and things will eventually grind to a halt. It’s also worth reading the VMware Docs on what to do if even one Controller is lost in a cluster. If this is in a lab scenario…we can be a little harsher!

While the Controller status is spinning in a Deploying state you can’t interact with it via the Web Client. You need to turn to the API to delete the NSX Controller and start again or deploy a new cluster set. First you will need the CONTROLLER-ID which can be easily seen via the Web Client. To remove the controller you need to call the API below using the Delete method. If the stuck controller is the last one in the cluster you need to add the ?forceRemoval=True option at the end of the call.
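
For anyone scripting this in a lab, here is a minimal sketch of building that call in Python. Treat the details as assumptions: the /api/2.0/vdn/controller/ path is how I understand the NSX-v controller API and should be checked against the API guide for your version, and the host name, controller ID and credentials are placeholders. Only the Delete method and the ?forceRemoval=True option come from the steps above.

```python
import base64
import urllib.request


def build_controller_delete_request(manager_host, controller_id,
                                    username, password, force=False):
    """Build (but do not send) a DELETE request for a stuck NSX Controller."""
    # Assumed endpoint path; verify it against the NSX-v API guide.
    url = f"https://{manager_host}/api/2.0/vdn/controller/{controller_id}"
    if force:
        # Needed when the stuck controller is the last one in the cluster.
        url += "?forceRemoval=True"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Basic {token}"},
    )
```

To actually send it you would pass the returned request to urllib.request.urlopen. In a lab with a self-signed NSX Manager certificate you would also need to supply an SSL context that trusts it.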

Once complete you should get a 200 status and a job data ID. If you check back in at the Web Client you should see the Controller VM being deleted and it being removed from the list under Controller Nodes. We are now free of the Deploying Loop and can rebuild or extend the NSX Controller cluster as is appropriate.

References:

https://pubs.vmware.com/NSX-62/index.jsp?topic=%2Fcom.vmware.nsx.admin.doc%2FGUID-3A84E9D1-CAC0-41B1-B45C-E032B230DB49.html
