vCloud Director 9.0: Digging into the new Standalone VM Feature

vCloud Director 9.0 was released late last month and brought with it a number of big new features and enhancements. If you are interested in an overview of what’s new, head here to my launch post. In this post I want to focus on what I think is a significant change to the way workloads are thought about in vCD…the Standalone VM.

Standalone Virtual machines can be instantiated and viewed along with virtual machines as part of a vApp container. A filter button creates a list based on Virtual machines, virtual applications or both.

The vApp container construct in vCloud Director divides opinion among both service providers and customers of vCD. One side likes the fact that VMs can be grouped into logical vApps and treated as a group of like VMs, such as an Exchange cluster, while others want the ability to deploy standalone VMs that are more like the VM instances you find in public clouds. Historically, from a programmatic point of view, creating a VM within a vApp had its challenges: it was a chicken-and-egg scenario whereby the composition and recomposition of the VM within the vApp required a specific order. This was improved from 8.0 with enhancements to vApp functionality, including the ability to reconfigure virtual machines within a vApp, and network connectivity and virtual machine capability during vApp instantiation.

Standalone Virtual Machines:

In vCloud Director 9.0 you can now create and configure individual Virtual Machines from the new HTML5 Tenant UI. Under the compute menu you now have a Virtual Machines and vApps tab. From here you can view either standalone VMs, VMs in a vApp or both. This is also where you can create a new VM. Note that you can’t create new vApps from the new UI just yet…that still needs to be done in the Flash based UI.

You now have the ability to choose from three pre-canned instance sizes which come with default resources depending on the type of VM selected. However you can still customize the VM as shown below.

When provisioned the VM is available from the new tenant UI with all the normal operations possible. The biggest difference here is that you don’t need to worry about the vApp state and that it’s independent from any other VMs. As a side note as it’s not 100% obvious, to view the console of the VM click on the icon top right of the Virtual Machine box.

Standalone VMs in vCenter and Flash UI:

Taking a look under the covers of the HTML5 UI, the standalone VMs are represented slightly differently in vCenter. In previous versions each VM was created with the VM name plus a UUID…when a standalone VM is created the VM name is just that…the VM name.

However what is interesting is when you look in the Flash UI you will see that in fact the standalone VM is still contained within a vCD vApp construct.

So in effect, that HTML5 UI is presenting the VM as standalone, but in actual fact there is still a one to one relationship with a vApp under the covers. Taking a look back in vCenter under the folder view it’s more representative of what you see in the Flash UI.

Standalone VMs via the API:

Querying the API shows that the Standalone VMs are indeed composed within a traditional vCD vApp.
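
As a rough illustration of that relationship, the sketch below parses a query response and maps each VM to its parent vApp. It assumes the typed query `GET /api/query?type=vm` and the `name`, `containerName` and `isVAppTemplate` attributes on each `VMRecord`; the sample XML, VM names and vApp names are invented for the example and are not output from my lab.

```python
import xml.etree.ElementTree as ET

# Namespace used by vCloud API XML responses.
VCLOUD_NS = "{http://www.vmware.com/vcloud/v1.5}"

# Hypothetical, heavily trimmed response to GET /api/query?type=vm.
SAMPLE_RESPONSE = """<QueryResultRecords xmlns="http://www.vmware.com/vcloud/v1.5" total="2">
  <VMRecord name="standalone-01" containerName="standalone-01-vapp" isVAppTemplate="false"/>
  <VMRecord name="exch-01" containerName="Exchange-Cluster" isVAppTemplate="false"/>
</QueryResultRecords>"""


def vm_to_vapp(xml_text):
    """Return {vm_name: parent_vapp_name} for every non-template VM record."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for record in root.iter(VCLOUD_NS + "VMRecord"):
        if record.get("isVAppTemplate") == "true":
            continue  # skip VMs that live in catalog templates
        mapping[record.get("name")] = record.get("containerName")
    return mapping


if __name__ == "__main__":
    for vm, vapp in vm_to_vapp(SAMPLE_RESPONSE).items():
        print(f"{vm} -> {vapp}")
```

In a response like this, a standalone VM still shows up with a container name of its own, which is the one-to-one vApp relationship described above.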

References:

https://docs.vmware.com/en/vCloud-Director/9.0/rn/rel_notes_vcloud_director_90.html

Released: NSX-v 6.3.4 and Upgrade Notes and Fixes

Last week VMware released NSX-v 6.3.4 (Build 6845891), which contains no specific new features but delivers a couple of bug fixes for issues in previous releases. Going through the release notes there are a lot of known issues to be aware of, and more than a few apply to service providers…specifically there are a lot around NSX Edge functions. The other interesting point to highlight about this release is that for those on NSX-v 6.3.3 there are a couple of scripts to run against the API before upgrading to ensure all controllers are upgradable.

As mentioned, the release notes state that those on NSX-v 6.3.3 should follow this VMware KB before upgrading. In a nutshell, there is a bug in 6.3.3 where the NSX Controllers are reported as disconnected in the Web Client as shown below.

To fix that situation you need to execute a couple of API calls that POST a script to the NSX Manager, as documented in the VMware KB. This needs to be done as the NSX Manager Admin user; I found it didn’t work with an NSX Domain User or an SSO Administrator Account with NSX Org admin level permissions.

Once the second script has been run you should see a similar output to what’s shown above and have all NSX Controllers ready in a connected state which allows you to prepare for the upgrade. Once done, you can go through the normal NSX upgrade steps which will get you to the latest build.

Important Fixes:

  • Fixed Issue 1970527: ARP fails to resolve for VMs when Logical Distributed Router ARP table crosses 5K limit
  • Fixed Issue 1961105: Hardware VTEP connection goes down upon controller reboot. A BufferOverFlow exception is seen when certain hardware VTEP configurations are pushed from the NSX Manager to the NSX Controller. This overflow issue prevents the NSX Controller from getting a complete hardware gateway configuration. Fixed in 6.3.4.
  • Fixed Issue 1955855: Controller API could fail due to cleanup of API server reference files. Upon cleanup of required files, workflows such as traceflow and central CLI will fail. If external events disrupt the persistent TCP connections between NSX Manager and controller, NSX Manager will lose the ability to make API connections to controllers, and the UI will display the controllers as disconnected. There is no datapath impact.

Those with the correct entitlements can download NSX-v 6.3.4 here.

References:

https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/rn/releasenotes_nsx_vsphere_634.html

https://kb.vmware.com/kb/2151719

Enabling, Configuring and Viewing Metrics in vCloud Director 9.0

Last week I released a post on configuring Cassandra for vCloud Director 9.0 metrics. As a refresher, one of the cool features released in vCloud Director SP 5.6.x was the ability for service providers to expose VM metrics to their clients via a set of API calls. With the release of vCloud Director 9.0, the metrics can now be viewed from the new HTML5 tenant UI, meaning that all service providers should be able to offer this to their customers.

With the Cassandra configuration out of the way, the next step is to use the Cell Management Tool to tell the vCD cells to push the VM Metric data. Before this, if you log into the HTML5 UI you will notice no menu for Monitoring…this only gets enabled once the metrics have been enabled by the tool.

The command has changed from previous versions in line with removing the dependency on KairosDB, and we now call a cassandra argument that has the following options:

Those familiar with the previous command to configure the metrics will see a lot more options: they specify the Cassandra nodes, the option to configure the schema, the username and password used to connect to the Cassandra database, and the TTL for the data, meaning that if you wanted you could keep more than two weeks of data.
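
To make those options a little more concrete, here is a minimal sketch that assembles such a command line. The flag names (`--configure`, `--create-schema`, `--cluster-nodes`, `--username`, `--password`, `--ttl`, `--port`) are how I recall them from the vCD 9.0 documentation and should be treated as assumptions; check them against the tool's own help output on your cell before running anything. The helper name, node addresses and credentials are purely illustrative.

```python
def build_cassandra_config_args(nodes, username, password, ttl=15, port=9042):
    """Build an argument list for `cell-management-tool cassandra`.

    Flag names are assumptions recalled from the vCD 9.0 docs; verify them
    with the tool's help output. The ttl unit is assumed to be days.
    """
    if not nodes:
        raise ValueError("at least one Cassandra node is required")
    return [
        "cell-management-tool", "cassandra",
        "--configure", "--create-schema",
        "--cluster-nodes", ",".join(nodes),  # comma-separated node addresses
        "--username", username,
        "--password", password,
        "--ttl", str(ttl),
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Hypothetical lab values.
    args = build_cassandra_config_args(["10.0.0.11"], "cassandra", "example-password", ttl=30)
    print(" ".join(args))
```

Building the arguments as a list keeps the password out of shell quoting trouble if you later hand it to something like `subprocess.run`.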

If you tail the Cassandra system.log while the process is happening you will see a bunch of tables being created and populated with the initial data.

With that done, if you go into the new HTML5 Tenant UI and go to the Virtual Machine view you should now see a Monitoring Chart drop down in the menu in the main window. From here you can choose any of the available metrics across a half hour, hour, day and week timescale.

API Calls to Retrieve Current and Historical Metrics:

If you still want to go old school, the following API calls are used to gather current and historical VM metrics for vCD VMs. The Machine ID required is the VM GUID as seen in vCenter, and it can be sourced from the VM name: the vCD Machine ID shown below in the brackets is what you are after.
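
As a small sketch of how that ID feeds into the calls, the helper below pulls the ID out of the brackets in a vCenter VM name and builds the two metrics URLs. It assumes the `/api/vApp/vm-{id}/metrics/current` and `/api/vApp/vm-{id}/metrics/historic` endpoints; the host name, VM name and ID are made up, and authentication headers are left out entirely.

```python
import re

# Matches a trailing "(<uuid>)" in a vCenter VM name, e.g. "web01 (1234...)".
_ID_IN_BRACKETS = re.compile(r"\(([0-9a-fA-F-]{36})\)\s*$")


def machine_id_from_vcenter_name(vcenter_name):
    """Extract the vCD Machine ID from the brackets of a vCenter VM name."""
    match = _ID_IN_BRACKETS.search(vcenter_name)
    if match is None:
        raise ValueError(f"no bracketed ID found in {vcenter_name!r}")
    return match.group(1)


def metrics_urls(vcd_host, machine_id):
    """Build the current and historic metrics URLs for one VM."""
    base = f"https://{vcd_host}/api/vApp/vm-{machine_id}/metrics"
    return {"current": base + "/current", "historic": base + "/historic"}


if __name__ == "__main__":
    # Hypothetical VM name and vCD host.
    vm_id = machine_id_from_vcenter_name("web01 (12345678-1234-1234-1234-123456789abc)")
    for kind, url in metrics_urls("vcd.example.com", vm_id).items():
        print(kind, url)
```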



Configuring Cassandra for vCloud Director 9.0 Metrics

One of the cool features released in vCloud Director SP 5.6.x was the ability for service providers to expose VM metrics to their clients via a set of API calls. Some service providers took advantage of this and were able to offer basic VM metrics to their tenants through custom-written portals. Zettagrid was one of those service providers and while I was at Zettagrid, I worked with the developers to get VM metrics out to our customers.

Part of the backend configuration to enable the vCloud Director cells to export the metric data was to stand up a Cassandra/KairosDB cluster. This wasn’t a straightforward exercise, but after a bit of tinkering due to a lack of documentation, most service providers were able to have the backend in place to support the metrics.

With the release of vCloud Director 9.0, the requirement to have KairosDB managed by Apache has been removed and metrics can now be accessed natively in Cassandra using the cell management tool. Even cooler is that the metrics can now be viewed from the new HTML5 tenant UI, meaning that all service providers should be able to offer this to their customers.

Cassandra is an open source database that you can use to provide the backing store for a scalable, high-performance solution for collecting time series data like virtual machine metrics. If you want vCloud Director to support retrieval of historic metrics from virtual machines, you must install and configure a Cassandra cluster and use the cell-management-tool to connect the cluster to vCloud Director. Retrieval of current metrics does not require optional database software.

The vCloud Director online docs have a small install guide but it’s not very detailed. It basically says to install and configure the Cassandra cluster with four nodes, two of which are seed nodes, enabling encryption and user authentication with Java Native Access installed. Not overly descriptive. I’ve created a script below that installs and configures a basic single node Cassandra cluster that will suffice for most labs/testing environments.

Setting up Cassandra on Ubuntu 16.04 LTS:

I’ve forked an existing bash script on Github and added modifications that go through the installation and configuration of Cassandra 2.2.6 (as per the vCD 9.0 release notes) on a single node, enabling authentication while disabling encryption in order to keep things simple.

This will obviously work on any distro that supports apt-get. Once configured you can view the Cassandra status by using the nodetool status command as shown below.
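
If you want to check that status from a script rather than by eye, a minimal sketch along these lines would do. It relies on the two-letter prefix that `nodetool status` prints per node (U/D for up/down, then N/L/J/M for normal/leaving/joining/moving); the sample output is a hand-written approximation with a placeholder host ID, not a capture from my lab.

```python
import re

# A node line starts with status (U/D) and state (N/L/J/M), then the address.
_NODE_LINE = re.compile(r"^([UD])([NLJM])\s+(\S+)")

# Hand-written approximation of `nodetool status` output for a single node.
SAMPLE_OUTPUT = """Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  105.5 KB  256     100.0%            00000000-0000-0000-0000-000000000000  rack1
"""


def parse_nodetool_status(output):
    """Return one dict per node line found in the output."""
    nodes = []
    for line in output.splitlines():
        match = _NODE_LINE.match(line)
        if match:
            nodes.append({
                "address": match.group(3),
                "up": match.group(1) == "U",
                "state": match.group(2),
            })
    return nodes


def all_up_and_normal(output):
    """True only if at least one node is listed and every node is 'UN'."""
    nodes = parse_nodetool_status(output)
    return bool(nodes) and all(n["up"] and n["state"] == "N" for n in nodes)


if __name__ == "__main__":
    print(parse_nodetool_status(SAMPLE_OUTPUT))
    print("healthy:", all_up_and_normal(SAMPLE_OUTPUT))
```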

The manual steps for the Cassandra installation are below…note that they don’t include the configuration file changes required to enable authentication and set the seeds.
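
Those configuration file changes could be sketched roughly as follows. This is not the script from my fork; it is a minimal illustration that assumes a stock `cassandra.yaml` containing an `authenticator:` line and a quoted `- seeds:` entry, and it swaps in `PasswordAuthenticator` and your own seed addresses. The function name and the sample addresses are hypothetical.

```python
import re


def enable_auth_and_set_seeds(yaml_text, seeds):
    """Return cassandra.yaml text with password auth enabled and seeds replaced."""
    seed_list = ",".join(seeds)
    # Switch whatever authenticator is configured to PasswordAuthenticator.
    text, n_auth = re.subn(
        r"(?m)^authenticator:[ \t]*\S+",
        "authenticator: PasswordAuthenticator",
        yaml_text,
    )
    # Replace the quoted seed list while keeping the original indentation.
    text, n_seeds = re.subn(
        r'(?m)^([ \t]*-[ \t]*seeds:[ \t]*)"[^"]*"',
        lambda m: f'{m.group(1)}"{seed_list}"',
        text,
    )
    if n_auth != 1 or n_seeds != 1:
        raise ValueError("expected exactly one authenticator line and one seeds line")
    return text


# Trimmed, illustrative fragment of a default cassandra.yaml.
SAMPLE_YAML = '''cluster_name: 'Test Cluster'
authenticator: AllowAllAuthenticator
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
'''

if __name__ == "__main__":
    print(enable_auth_and_set_seeds(SAMPLE_YAML, ["10.0.0.11"]))
```

Cassandra needs a restart after the change, and with `PasswordAuthenticator` enabled the default superuser is `cassandra`/`cassandra` until you change it.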

From here you are ready to configure vCD to push the metrics to the Cassandra database. I’ll cover that in a separate post.

References:

https://docs.vmware.com/en/vCloud-Director/9.0/com.vmware.vcloud.install.doc/GUID-E5B8EE30-5C99-4609-B92A-B7FAEC1035CE.html

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vcloud/vmware-vcloud-director-whats-new-9-0-white-paper.pdf

vCloud Director 9.0: Manual Quick fix for VXLAN Network Pool Error

vCloud Director 9.0, released last week, has a bunch of new enhancements and a lot of those are focused around its integration with NSX. Tom Fojta has a what’s new page on the go with a lot of the new features being explained. One of his first posts just after the GA was around the new feature of being able to manually create VXLAN backed Network Pools.

VXLAN Network Pool is recommended to be used as it scales the best. Until version 9, vCloud Director would create new VXLAN Network Pool automatically for each Provider VDC backed by NSX Transport Zone (again created automatically) scoped to cluster that belong to the particular Provider VDC. This would create multiple VXLAN network pools and potentially confusion which to use for a particular Org VDC.

In vCloud Director 9.0 we now have the option of creating a VXLAN backed network pool manually instead of one being created at the time of setting up a Provider vDC. In many of my environments, for one reason or another, the automatic creation of the VXLAN network pool together with NSX would fail. In fact my current NestedESXi SliemaLabs vCD instance shows the following error:

There is a similar but less serious error that can be fixed by changing the replication mode from within the NSX Web Client as detailed here by Luca; however, like in my lab, I’ve known a few people to run into the more serious error shown above. You can’t delete the pool and a repair operation will continue to error out. Now in vCD 9.0 we can create a new VXLAN Network Pool from the Transport Zones created in NSX.

Once that’s been done you will have the newly created VXLAN Network Pool that’s truly more global and tied to best practice for NSX Transport Zones and one that can be used with the desired replication mode. The old one will remain, but you can now configure Org vDCs to consume the VXLAN backed network pool over the traditional VLAN backed pool.

References:

vCloud Director 9: What’s New

vCloud Director 9: Create VXLAN Network Pool

Released: vCloud Director 9.0 – The Most Significant Update To Date!

Today is a good day! VMware have released to GA vCloud Director 9.0 (build 6681978) and with it come the most significant features and enhancements of any vCD release to date. This is the 9th major release of vCloud Director, now spanning nearly six and a half years since v1.0 was released in February of 2011, and as mentioned, from my point of view it’s the most significant update of vCloud Director to date.

Having been part of the BETA program I’ve been able to test some of the new features and enhancements over the past couple of months. From a Service Provider perspective there is a heap to like about what is functionally under the covers, but the biggest new feature is without doubt the HTML5 Tenant Portal. That said, as you can see below there is a decent list of top enhancements.

Top Enhancements:

  • Multi-Site vCD – Single Access point URL for all vCD instances within same SP federated via SSO
  • On-premises to Cloud Migration – Plugin that enables L2 connectivity, warm and cold migration
  • Expanded NSX Integration – Security Groups, Logical Routing for east-west traffic and audit logging
  • HTML5 Tenant UI – Streamlined workflows for VM deployment, UI Extensibility for 3rd party services/functionality
  • HTML5 Metrics UI – Basic Metrics for VMs shown through tenant portal
  • Extensible Service Framework – Service enablement, SSO Ready
  • Application Extensibility – Plugin Framework
  • PostGres 9.5 Support – In addition to MSSQL and Oracle, Postgres is now supported.
  • …and more under the hood bits

I’m sure there will be a number of other blog posts focusing on the list above, and I’ll look to go through a few myself over the next few weeks, but for this GA post I wanted to touch on the new HTML5 Tenant UI.

There is a What’s New in vCloud Director 9.0 PDF here.

New HTML5 Tenant UI:

The vCD team laid the foundation for this new Tenant UI in the last release of vCD in bringing the NSX Advanced HTML5 UI to version 8.20. While most things have been ported across there may still be a case for tenants to go back to the old Flex UI to do some tasks, however from what I have seen there is close to 100% full functionality.

To get to the new HTML5 Tenant UI you go to: https://<vcd>/tenant/orgname

Once logged in you are greeted with a now familiar looking VMware portal based on the Clarity UI. It’s pretty, it’s functional and it doesn’t need Flash…so haters of the existing flex based vCD portal will have to bite their tongues now 🙂

The Networking menu is inbuilt into this same Tenant portal and you can access it directly from the new UI, or in the same way as was the case with vCD 8.20 from the flex UI. Below is a YouTube video posted by the vCD team that walks through the new UI.

There are also VM Metrics in the UI now, where previously they were only accessible after configuring the vCD Cells to route metric data to a Cassandra database. The metrics were only accessible via the API and some providers managed to tap into that and bring vCD Metrics into their own portals. With the 9.0 release this is now part of the new HTML5 Tenant UI and can be seen in the video below.

As per previous releases this only shows up to two weeks’ worth of basic metrics but it’s still a step in the right direction and gives vCD tenants enough info to do basic monitoring before hitting up a service desk for VM related help.

Conclusion:

vCloud Director 9.0 has delivered on what most members of the VMware Cloud Provider Program had wanted for some time…that is, a continuation of the commitment to the HTML5 UI as well as continuing to add features that help service providers extend their reach across multiple zones and over to hybrid cloud setups. As mentioned, over the next few weeks I am going to expand on the key new features and walk through how to configure elements through the UI and API.

Compatibility with Veeam, vSphere 6.5 and NSX-v 6.3.x:

vCloud Director 9.0 is compatible with vSphere 6.5 Update 1 and NSX 6.3.3 and supports full interoperability with other versions as shown in the VMware Product Interoperability Matrix. With regards to Veeam support, I am sure that our QA department will be testing the 9.0 release against our integration pieces at the first opportunity they get, but as of now, there is no ETA on official support.

A list of known issues can be found in the release notes.

#LongLivevCD

References:

https://docs.vmware.com/en/vCloud-Director/9.0/rn/rel_notes_vcloud_director_90.html

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vcloud/vmware-vcloud-director-whats-new-9-0-white-paper.pdf

VMware Announces New vCloud Director 9.0

The One Problem with the VCSA

Over the past couple of months I noticed a trend in my top blog daily reporting…the Quick fix post on fixing a 503 Service Unavailable error was constantly in the top 5 and getting significant views. The 503 error in various forms has been around since the early days of the VCSA and usually manifests itself with the following.

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x0000559b1531ef80] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)

Looking at the traffic stats for that post it’s clear to see an upward trend in the page views since about the end of June.

This to me is both a good and bad thing. It tells me that more people are deploying or migrating to the VCSA which is what VMware want…but it also tells me that more people are running into this 503 error and looking for ways to fix it online.

The Very Good:

The vCenter Server Appliance is a brilliant initiative from VMware and there has been a huge effort in developing the platform over the past three to four years to get it to a point where it not only became equal to vCenters deployed on Windows (and relying on MSSQL) but surpassed them in a lot of features, especially in the vSphere 6.5 release. Most VMware shops are planning to migrate or have migrated from Windows to the VCSA, and for VMware labs it’s a no brainer for both corporate and homelab instances.

Personally I’ve been running VCSAs in my various labs since the 5.5 release, have deployed key management clusters with the VCSA and more recently have proven that even the most mature Windows vCenter can be upgraded with the excellent migration tool. Being free of Windows and more importantly MSSQL is a huge factor in why the VCSA is an important consideration, and the fact you get extra goodies like HA and API UIs adds to its value.

The One Bad:

Everyone who has dealt with storage issues knows that it can lead to Guest OS file systems errors. I’ve been involved with shared hosting storage platforms all my career so I know how fickle filesystems can be to storage latency or loss of connectivity. Reading through the many forums and blog posts around the 503 error there seems to be a common denominator of something going wrong with the underlying storage before a reboot triggers the 503 error. Clicking here will show the Google results for VCSA + 503 where you can read the various posts mentioned above.

As you may or may not know the 6.5 VCSA has twelve VMDKs, up from 2 in the initial release and 11 in the 6.0 release. There are a couple of great posts from William Lam and Mohammed Raffic that go through what each disk partition does. The big advantage in having these separate partitions is that you can manage storage space a lot more granularly.

The problem as mentioned is that the underlying Linux file system is susceptible to storage issues. No matter what storage platform you are running you are guaranteed to have issues at one point or another. In my experience Linux filesystems don’t deal well with those issues. Windows file systems seem to tolerate storage issues much better than their Linux counterparts and, without starting a religious war, I do know about the various tweaks that can be done to help make Linux filesystems more resilient to underlying storage issues.

With that in mind, the VCSA is very much susceptible to those same storage issues and I believe a lot of people are running into problems mainly triggered by storage related events. Most of the symptoms of the 503 relate back to key vCenter services unable to start after reboot. This usually requires some intervention to fix or a recovery of the VCSA from backup, but hopefully all that’s needed is to run an e2fsck against the filesystem(s) impacted.

The Solution:

VMware are putting a lot of faith into the VCSA and have done a tremendous job to develop it up to this point. It is the only option moving forward for VMware based platforms however there needs to be a little more work done into the resiliency of the services to protect against external issues that can impact the guest OS. PhotonOS is now the OS of choice from 6.5 onwards but that will not stop the legacy of susceptibility that comes with Linux based filesystems leading to issues such as the 503 error. If VMware can protect key services in the event of storage issues that will go a long way to improving that resiliency.

I believe it will get better and just this week VMware announced a monthly security patch program for the VCSA which shows that they are serious (not to say they were not before) about ensuring the appliance is protected, but I’m sure many would agree that it needs to offer reliability as well…this is the one area where the Windows based vCenter has an advantage still.

With all that said, make sure you are doing everything possible to have the VCSA housed on storage that is as reliable as possible, and make sure that you are not only backing up the VCSA and external dependencies correctly but also understand how to restore the appliance, including the inbuilt mechanisms for backing up the config and the PostGres database.

I love and would certainly recommend the VCSA…I just want to love it a little more without having to deal with the possibility of the 503 server error lurking around every storage event.

References:

http://www.vmwarearena.com/understanding-vcsa-6-5-vmdk-partitions-mount-points/

http://www.virtuallyghetto.com/2016/11/updates-to-vmdk-partitions-disk-resizing-in-vcsa-6-5.html

https://www.veeam.com/wp-vmware-vcenter-server-appliance-backup-restore.html

https://kb.vmware.com/kb/2091961

https://kb.vmware.com/kb/2147154

VMworld 2017 Veeam Recap – Breakouts, TechTalks and Final Thoughts.

Both VMworld US and Europe have come and gone in quick time this year, and while I only attended VMworld US, my team and other Veeam staff featured at Europe and both events were extremely successful for Veeam. I felt VMware had a good couple of shows, though the gap between the two was too short, which meant that the Europe event was at best a continuation of the US event in terms of vision and announcements. That said, VMware have made VMworld great again and there was an unmistakable buzz around the conference that I have not felt since at least the 2014 event.

I’d encourage everyone to check out the Top 40 Session YouTube playlist here and make sure you have caught up with all the VMworld announcements. For those interested in what Veeam had going on, I’ve listed the Breakout Sessions and vBrownBag TechTalks below.

Breakout Session Replays:

Across both VMworlds we had four breakout sessions which were all well received and had great attendance. If you have a MyVMworld account, you can view the session replays below by following the link and clicking on the session playback icon, which will take you to a protected YouTube video.

Note: The European session replays haven’t been posted yet, but should be put up this week.

vBrownBag TechTalks:

Veeam was a main sponsor of the vBrownBag TechTalks across both VMworlds and the feedback to the format this year was brilliant. For the first time, the talks were listed in the content catalog, meaning there was a lot more exposure, and attendance was up significantly on previous years. Below are the Veeam related TechTalks covering both events featuring Michael Cade, Clint Wyckoff, Michael White from Veeam, some of our Vanguards and also David Hill from VMware.

Full list here:

Final Thoughts and Wrap Up:

Both VMworlds were extremely successful from a Veeam point of view, with great session attendance and more importantly lots of traffic being driven through our booths. There was great energy in the US and I have been told that it continued in Europe. Both parties went off and a great time was had by all that attended.

The team at Veeam is looking forward to building on the momentum gained at VMworld as we look to release v10 of Backup & Replication, Veeam Availability Console and Orchestrator, updated Windows and Linux Agents, Availability for AWS and Veeam Powered Network.

VMworld Top Session YouTube Playlist:

What’s in a name? VSPP to vCAN to VCPP

Prior to VMworld there were rumours floating around that the vCloud Air Network was going to undergo a name change, and sure enough at VMworld 2017 in the US the vCAN was no more, with the announcement that the VMware Cloud and Service Provider program would be renamed to the VMware Cloud Provider Program. There have been a number of announcements around the VCPP including the upcoming release of vCloud Director 9.0, a new verification program and also, at VMworld Europe, new cross cloud capabilities with VMware HCX.

VMware is continuing to make significant investments to expand and enhance our portfolio of cloud products and services. At the same time, we will continue to grow and refine our program to better address your needs as a partner and, as a result, enable you to provide even better cloud service options to our mutual customers around the globe.

The VMware Cloud Verified program is interesting and I’m still a little unsure what it delivers above and beyond non verified VMware Clouds…however it seems like a good logo opportunity for providers to aspire to.

This name change was expected given the wrapping up of vCloud Air, however from talking with a lot of people within the old vCloud Air Network, the name will be missed. To me it was the best thing to come out of the whole vCloud Air experiment but I understand why it had to be changed. This isn’t so much a fresh start for the program but more of a signal that it’s growing and improving and is looking to remain a key cornerstone of VMware multi/hybrid cloud strategy.

Even though I am out of the program and not working for a partner anymore, I am very much connected by way of my interactions with the Veeam Cloud and Service Provider program (VCSP). The success of both programs is tied back not only to the individual companies remaining innovative and competitive against the large hyper-scalers; it’s also incumbent on VMware and Veeam to continue to offer the tools that make our providers successful.

As a critical component of the Cloud Provider Platform, the recently-announced vCloud Director 9.0 (vCloud Director 9.0 announcement blog) enables simplified cloud consumption for tenants, a fast path to hybrid services, and rapid vSphere-to-cloud migrations for cloud providers worldwide. VMware continues to demonstrate its commitment to investing in the critical products, tools, and solutions that help cloud providers rapidly deploy and monetize highly scalable cloud environments with the least amount of risk.

The name doesn’t matter…but the technology and execution of service sure as hell does!

Note: Visit CloudProviders.VMware.com. Subscribe to the VMware Cloud Provider Blog, follow @vmwarecloudprvd on Twitter or ‘like’ VMware Cloud on Facebook for future updates.

Quick Fix: OVF package with compressed disks is currently not supported

A couple of weeks ago I ran into an issue stopping me from importing an OVA and today I came across another issue relating to the Web Client not being able to import OVF packages with compressed disks.

There seem to be a lot of issues to do with OVF/A operations in vSphere 6.5 Update 1…in fact there are 187 mentions of OVF and 95 mentions of OVA in the release notes. Searching through the release notes I found a specific entry relating to the issue I came across and its workaround.

Deploying an OVF template containing compressed file references might fail
When you deploy an OVF template containing compressed files references (typically compressed using gzip), the operation fails.

The following is an example of an OVF element in the OVF descriptor:
<References>
<File ovf:size="458" ovf:href="valid_disk.vmdk.gz" ovf:compression="gzip" ovf:id="file1"></File>
</References>

The workaround is to download OVFTool and run a simple command to convert the OVF or OVA template to one without the compressed file…which in effect is just a copy of the original.

Seems like a strange fix but it works!
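
If you want to know up front whether a package will trip over this, a minimal sketch like the one below can scan an OVF descriptor for compressed file references. It assumes the standard OVF envelope namespace and treats any `ovf:compression` value other than `identity` as compressed; the sample descriptor is a cut-down illustration built around the release notes example, with a second, uncompressed file entry invented for contrast.

```python
import xml.etree.ElementTree as ET

# Standard DMTF OVF envelope namespace (elements and ovf:-prefixed attributes).
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

# Cut-down, illustrative descriptor; the second File entry is invented.
SAMPLE_DESCRIPTOR = """<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">
  <References>
    <File ovf:size="458" ovf:href="valid_disk.vmdk.gz" ovf:compression="gzip" ovf:id="file1"/>
    <File ovf:size="1024" ovf:href="plain_disk.vmdk" ovf:id="file2"/>
  </References>
</Envelope>"""


def compressed_file_refs(descriptor_xml):
    """Return (href, compression) for every compressed File reference."""
    root = ET.fromstring(descriptor_xml)
    hits = []
    for file_el in root.iter(f"{{{OVF_NS}}}File"):
        compression = file_el.get(f"{{{OVF_NS}}}compression")
        # A missing attribute or "identity" means the file is not compressed.
        if compression and compression != "identity":
            hits.append((file_el.get(f"{{{OVF_NS}}}href"), compression))
    return hits


if __name__ == "__main__":
    for href, compression in compressed_file_refs(SAMPLE_DESCRIPTOR):
        print(f"{href} is compressed with {compression}")
```

Anything this flags is a candidate for the OVFTool conversion described above, which in its simplest form takes a source and a target path.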

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-vcenter-server-65-release-notes.html
