Tag Archives: vCenter

The One Problem with the VCSA

Over the past couple of months I noticed a trend in my daily blog reporting…the Quick Fix post on fixing a 503 Service Unavailable error was constantly in the top 5 and getting significant views. The 503 error in various forms has been around since the early days of the VCSA and usually manifests itself with the following.

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x0000559b1531ef80] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)

Looking at the traffic stats for that post it’s clear to see an upward trend in the page views since about the end of June.

This to me is both a good and bad thing. It tells me that more people are deploying or migrating to the VCSA which is what VMware want…but it also tells me that more people are running into this 503 error and looking for ways to fix it online.

The Very Good:

The vCenter Server Appliance is a brilliant initiative from VMware and there has been a huge effort in developing the platform over the past three to four years to get it to a point where it not only became equal to vCenters deployed on Windows (and relying on MSSQL) but surpassed them in a lot of features, especially in the vSphere 6.5 release. Most VMware shops are planning to migrate or have migrated from Windows to the VCSA, and for VMware labs it’s a no-brainer for both corporate and homelab instances.

Personally I’ve been running VCSAs in my various labs since the 5.5 release, have deployed key management clusters with the VCSA and more recently have proven that even the most mature Windows vCenter can be upgraded with the excellent migration tool. Being free of Windows and more importantly MSSQL is a huge factor in why the VCSA is an important consideration, and the fact you get extra goodies like HA and API UIs adds to its value.

The One Bad:

Everyone who has dealt with storage issues knows that they can lead to guest OS file system errors. I’ve been involved with shared hosting storage platforms all my career so I know how sensitive filesystems can be to storage latency or loss of connectivity. Reading through the many forums and blog posts around the 503 error there seems to be a common denominator of something going wrong with the underlying storage before a reboot triggers the 503 error. Clicking here will show the Google results for VCSA + 503 where you can read the various posts mentioned above.

As you may or may not know the 6.5 VCSA has twelve VMDKs, up from two in the initial release and eleven in the 6.0 release. There are a couple of great posts from William Lam and Mohammed Raffic that go through what each disk partition does. The big advantage in having these separate partitions is that you can manage storage space a lot more granularly.

The problem as mentioned is that the underlying Linux file system is susceptible to storage issues. No matter what storage platform you are running you are guaranteed to have issues at one point or another. In my experience Linux filesystems don’t deal well with those issues. Windows file systems seem to tolerate storage issues much better than their Linux counterparts and, without starting a religious war, I do know about the various tweaks that can be done to help make Linux filesystems more resilient to underlying storage issues.

With that in mind, the VCSA is very much susceptible to those same storage issues and I believe a lot of people are running into problems mainly triggered by storage related events. Most of the symptoms of the 503 relate back to key vCenter services being unable to start after a reboot. This usually requires some intervention to fix or a recovery of the VCSA from backup, but hopefully all that’s needed is to run an e2fsck against the impacted filesystem(s).
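As a rough sketch of that last step, the snippet below picks out the ext3/ext4 filesystems from a mount table in /proc/mounts format and builds (but does not run) an e2fsck command for each. The device names in the sample are hypothetical and not taken from a real VCSA, and e2fsck should only ever be run against a filesystem that is unmounted, for example from a rescue shell.

```python
def ext_filesystems(mounts_text):
    """Return (device, mount_point) pairs for ext3/ext4 entries
    in text laid out like /proc/mounts."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] in ("ext3", "ext4"):
            found.append((fields[0], fields[1]))
    return found


def e2fsck_commands(mounts_text):
    """Build one e2fsck argument list per ext filesystem.
    -y answers "yes" to every repair prompt."""
    return [["e2fsck", "-y", device] for device, _ in ext_filesystems(mounts_text)]


# Hypothetical mount table, for illustration only.
SAMPLE = """\
/dev/sda3 / ext3 rw,relatime 0 0
/dev/mapper/log_vg-log /storage/log ext3 rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
"""

if __name__ == "__main__":
    for command in e2fsck_commands(SAMPLE):
        print(" ".join(command))
```

Printing the commands rather than executing them is deliberate: which filesystem is actually damaged should come from the boot or journal messages, not from a blanket check of every disk.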

The Solution:

VMware are putting a lot of faith into the VCSA and have done a tremendous job to develop it up to this point. It is the only option moving forward for VMware based platforms, however there needs to be a little more work done on the resiliency of the services to protect against external issues that can impact the guest OS. Photon OS is now the OS of choice from 6.5 onwards but that will not stop the legacy of susceptibility that comes with Linux based filesystems leading to issues such as the 503 error. If VMware can protect key services in the event of storage issues, that will go a long way to improving that resiliency.

I believe it will get better, and just this week VMware announced a monthly security patch program for the VCSA which shows that they are serious (not to say they were not before) about ensuring the appliance is protected, but I’m sure many would agree that it needs to offer reliability as well…this is the one area where the Windows based vCenter still has an advantage.

With all that said, make sure you are doing everything possible to have the VCSA housed on storage that is as reliable as possible, and make sure that you are not only backing up the VCSA and external dependencies correctly but also understand how to restore the appliance, including the inbuilt mechanisms for backing up the config and the PostgreSQL database.

I love and would certainly recommend the VCSA…I just want to love it a little more without having to deal with the possibility of the 503 server error lurking around every storage event.

References:

http://www.vmwarearena.com/understanding-vcsa-6-5-vmdk-partitions-mount-points/

http://www.virtuallyghetto.com/2016/11/updates-to-vmdk-partitions-disk-resizing-in-vcsa-6-5.html

https://www.veeam.com/wp-vmware-vcenter-server-appliance-backup-restore.html

https://kb.vmware.com/kb/2091961

https://kb.vmware.com/kb/2147154

migrate2vcsa – Migrating vCenter 6.0 to 6.5 VCSA

Over the past few years I’ve written a couple of articles on upgrading vCenter from 5.5 to 6.0: firstly an in-place upgrade of the 5.5 VCSA to 6.0 and then more recently an in-place upgrade of a Windows 5.5 vCenter to 6.0. This week I upgraded and migrated my NestedESXi SliemaLab vCenter using the migrate2vcsa tool that’s now bundled into the vCenter 6.5 ISO. Even though I held some doubts about the migration working without issue, the process worked first time and my Windows vCenter is now in retirement.

The migration tool that’s part of vSphere 6.5 was actually first released as a VMware Fling after it was put forward as an idea in 2013. It then officially went GA with the release of vSphere 6.0 Update 2m…where m stood for migration. Over its development it has been championed by William Lam, who has written a number of articles on his blog, and more recently Emad Younis has been the technical marketing lead on the product as it was enhanced for vSphere 6.5.

Upgrade Options:

You basically have two options to upgrade a Windows based 6.0 vCenter:

My approach for this particular environment was to ensure a smooth upgrade to vSphere 6.0 Update 2 and then look to upgrade again to 6.5 once it thaws out in the market. The cautious approach will still be undertaken by many, and a stepped upgrade to 6.5 and migration to the VCSA will still be commonplace. For those that wish to move away from their Windows vCenter, there is now a very reliable #migrate2vcsa path…as a side note it is possible to migrate directly from 5.5 to 6.5.

Existing Component Versions:

  • vCenter 6.0 (4541947)
    • NSX Registered
    • vCloud Director Registered
    • vCO Registered
  • ESXi 6.0 (3620759)
  • Windows 2008 (RTM)
  • SQL Server 2008 R2 (10.50.6000.34)

All vCenter components were installed on the Windows vCenter instance including Update Manager. There were also a number of external services registered against the vCenter, of which the NSX Manager needed to be re-registered for the SSO to allow/trust the new SSL certificate thumbprint. This is common, and one to look out for after migration.
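When a registered service complains about the certificate after a migration, it can help to compare the thumbprint it has stored with the one vCenter now presents. The sketch below is one way to compute a SHA-1 thumbprint with the Python standard library; the function names are my own, and any host name you pass in (for example a hypothetical vcsa.example.local) is an assumption about your environment.

```python
import hashlib
import ssl


def sha1_thumbprint(der_bytes):
    """Format the SHA-1 digest of a DER-encoded certificate as
    colon-separated upper-case hex pairs."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def server_thumbprint(host, port=443):
    """Fetch the certificate a server presents (without validating it)
    and return its SHA-1 thumbprint."""
    pem = ssl.get_server_certificate((host, port))
    return sha1_thumbprint(ssl.PEM_cert_to_DER_cert(pem))
```

Calling server_thumbprint against the old and new vCenter addresses before and after the cutover gives two values to compare with whatever the registered service has stored.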

Migration Process:

I’m not going to go through the whole process as it’s been blogged about a number of times, but in a nutshell you need to:

  • Take a backup of your existing Windows vCenter
  • I took a snapshot as well before I began the process
  • Download the vCenter Server Appliance 6.5 ISO and mount the ISO
  • Copy the migration-assistant folder to the Windows vCenter
  • Start the migration-assistant tool and work through the pre-checks

If all checks complete successfully the migration assistant will pause and wait for the migration to start. From here you start the VCSA 6.5 installer and click on the Migrate menu option.

Work through the wizard which asks you for detail on the source and target servers, lets you select the compute, storage and appliance size as well as the networking settings. Once everything is entered we are ready to start Stage 1 of the process.

When Stage 1 finishes you are taken to Stage 2 where it asks you to select the migration data. This will give you some idea as to how much storage you will need and what the initial footprint will be over and above the actual VCSA VM storage.

There are a couple more steps the migration assistant goes through to complete the process…which for me took about 45 minutes, but this will vary depending on the amount of data you want to transfer across.

If there are any issues or if the migration fails at any of the steps you do have the option to power down/remove the new VCSA and power the old Windows vCenter back on as is. The old Windows vCenter would have been shut down by the migration process just as the copying of the key data finished and the VCSA was rebooted with the network settings and machine name copied across. There is a proper series of rollback steps listed in this VMware KB.

The only external service that I needed to re-register against vCenter was NSX. vCloud Director carried on without issue, but it’s worth checking out all registered services just in case.

Conclusion and Thoughts:

As mentioned at the start, I was a bit skeptical that this process would work as flawlessly as it did…and on its first time! It’s almost a little disappointing to have this as automated and hands off as it is, but it’s a testament to the engineering effort the team at VMware has put into this tool to make it a very viable and reliable way to remove dependencies on Windows and MSSQL. It also allows those with older versions of Windows that are well past their use-by date to migrate to the VCSA with absolute confidence.

References:

http://www.virtuallyghetto.com/page/2?s=migrate2vcsa

https://github.com/younise/migrate2vcsa-resources

Released: vCenter and ESXi 6.0 Update 3 – What’s in It for Service Providers

Last month I wrote a blog post on upgrading vCenter 5.5 to 6.0 Update 2 and during the course of writing that post I conducted a survey on which version of vSphere most people were seeing out in the wild…overwhelmingly vSphere 6.0 was the most popular version with 5.5 second and 6.5 lagging in adoption for the moment. It’s safe to assume that vCenter 6.0 and ESXi 6.0 will be common deployments for some time in brownfield sites, and with the release of Update 3 for vCenter and ESXi I thought it would be good to again highlight some of the best features and enhancements as I see them from a Service Provider point of view.

vCenter 6.0 Update 3 (Build 5112506)

This is actually the eighth build release of vCenter 6.0 and includes updated TLS support for v1.0, 1.1 and 1.2, which is worth a look in terms of what it means for other VMware products as it could impact connectivity…I know that vCloud Director SP now expects TLS v1.1 by default as an example. Other things listed in the What’s New include support for MSSQL 2012 SP3, updated M2VCSA support, timezone updates and some changes to the resource allocation for the Platform Services Controller.
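Before changing which TLS versions are enabled, it is worth knowing what each endpoint will actually negotiate. The following is a minimal sketch using Python’s ssl module that pins a client to a single protocol version and reports whether the handshake succeeds; the helper names are hypothetical, and certificate validation is switched off because only the protocol version is being probed. Note that a modern local OpenSSL build may itself refuse TLS 1.0 or 1.1, so a False result can reflect the client side rather than the server.

```python
import socket
import ssl


def pinned_context(version):
    """Client context that will only negotiate the given ssl.TLSVersion."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Only the protocol version is being probed, so skip certificate checks.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts(host, port, version, timeout=5.0):
    """Return True if a handshake pinned to `version` completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with pinned_context(version).wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False
```

Looping accepts over ssl.TLSVersion.TLSv1, TLSv1_1 and TLSv1_2 for each vCenter, ESXi host and dependent product gives a quick picture of what would break if an older version were disabled.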

Looking through the Resolved Issues there are a number of networking related fixes in the release plus a few annoying problems relating to vMotion. The ones below are the main ones that could impact on Service Provider operations.

  • Upgrading vCenter Server from version 6.0.0b to 6.0.x might fail. 
    Attempts to upgrade vCenter Server from version 6.0.0b to 6.0.x might fail. This issue occurs while starting services. An error message similar to the following is displayed in the run-updateboot-scripts.log file.
    “Installation of component VCSServiceManager failed with error code ‘1603’”
  • Managing legacy ESXi from the vCenter Server with TLSv1.0 disabled is impacted.
    vCenter Server with TLSv1.0 disabled supports management of legacy ESXi versions in 5.5.x and 6.0.x. ESXi 5.5 P08 and ESXi 6.0 P02 onwards is supported for 5.5.x and 6.0.x respectively.
  • x-VC operations involving legacy ESXi 5.5 host succeeds.
    x-VC operations involving legacy ESXi 5.5 host succeeds. Cold relocate and clone have been implicitly allowed for ESXi 5.5 host.
  • Unable to use End VMware Tools install option using vSphere Client.
    Unable to use End VMware Tools install option while installing VMware Tools using vSphere Client. This issue occurs after upgrading to vCenter Server 6.0 Update 1.
  • Enhanced vMotion fails to move the vApp.VmConfigInfo property to destination vCenter Server.
    Enhanced vMotion fails to move the vApp.VmConfigInfo property to destination vCenter Server although virtual machine migration is successful.
  • Storage vMotion fails if the VM is connected with a CD ISO file.
    If the VM is connected with a CD ISO file, Storage vMotion fails with an error similar to the following:
  • Unregistering an extension does not delete agencies created by a solution plug-in.
    The agencies or agents created by a solution such as NSX, or any other solution which uses EAM is not deleted from the database when the solution is unregistered as an extension in vCenter Server.

ESXi 6.0 Update 3 (Build 5050593)

The what’s new in ESXi is a lot more exciting than what’s new with vCenter, highlighted by a new Host Client and fairly significant improvements in vSAN performance, along with similar TLS changes to those included in vCenter Update 3. With regards to the Host Client, the version is now 1.14.0, which includes bug fixes and brings it closer to the functionality provided by the vSphere Client. It’s also worth mentioning that new versions of the Host Client continue to be released through the VMware Labs Flings site, but those versions are not officially supported and not recommended for production environments.

For vSAN, multiple fixes have been introduced to optimize the I/O path for improved vSAN performance in All Flash and Hybrid configurations and there is a separate VMware KB that addresses the fixes here.

  • More Logs, Much Less Space: vSAN now has efficient log management strategies that allow more logging to be packed per byte of storage. This prevents the log from reaching its assigned limit too fast and too frequently. It also provides enough time for vSAN to process the log entries before the log reaches its assigned limit, thereby avoiding unnecessary I/O operations.
  • Pre-emptive De-staging: vSAN has built-in algorithms that de-stage data on a periodic basis. The de-staging operations, coupled with efficient log management, significantly improve performance for large file deletes, including performance for write-intensive workloads.
  • Checksum Improvements: vSAN has several enhancements that make the checksum code path more efficient. These changes are expected to be extremely beneficial and make a significant impact on all flash configurations, as there is no additional read cache lookup. These enhancements are expected to provide significant performance benefits for both sequential and random workloads.

As with vCenter, I’ve gone through and picked out the most significant bug fixes as they relate to Service Providers. The first one listed below is important to think about as it should significantly reduce the number of failures that people have been seeing with ESXi installed on SD-Flash Card and not just for VDI environments as the release notes suggest.

  • High read load of VMware Tools ISO images might cause corruption of flash media
    In VDI environment, the high read load of the VMware Tools images can result in corruption of the flash media.
    You can copy all the VMware Tools data into its own ramdisk. As a result, the data can be read from the flash media only once per boot. All other reads will go to the ramdisk. vCenter Server Agent (vpxa) accesses this data through the /vmimages directory which has symlinks that point to productLocker.
  • ESXi 6.x hosts stop responding after running for 85 days
    When this problem occurs, the /var/log/vmkernel log file displays entries similar to the following:
  • ARP request packets might drop
    ARP request packets between two VMs might be dropped if one VM is configured with guest VLAN tagging and the other VM is configured with virtual switch VLAN tagging, and VLAN offload is turned off on the VMs.
  • Physical switch flooded with RARP packets when using Citrix VDI PXE boot
    When you boot a virtual machine for Citrix VDI, the physical switch is flooded with RARP packets (over 1000) which might cause network connections to drop and a momentary outage. This release provides an advanced option /Net/NetSendRARPOnPortEnablement. You need to set the value for /Net/NetSendRARPOnPortEnablement to 0 to resolve this issue.
  • Snapshot creation task cancellation for Virtual Volumes might result in data loss
    Attempts to cancel snapshot creation for a VM whose VMDKs are on Virtual Volumes datastores might result in virtual disks not getting rolled back properly and consequent data loss. This situation occurs when a VM has multiple VMDKs with the same name and these come from different Virtual Volumes datastores.
  • VMDK does not roll back properly when snapshot creation fails for Virtual Volumes VMs
    When snapshot creation attempts for a Virtual Volumes VM fail, the VMDK is tied to an incorrect data Virtual Volume. The issue occurs only when the VMDK for the Virtual Volumes VM comes from multiple Virtual Volumes datastores.
  • ESXi host fails with a purple diagnostic screen due to path claiming conflicts
    An ESXi host displays a purple diagnostic screen when it encounters a device that is registered, but whose paths are claimed by two multipath plugins, for example EMC PowerPath and the Native Multipathing Plugin (NMP). This type of conflict occurs when a plugin claim rule fails to claim the path and NMP claims the path by default. NMP tries to register the device but because the device is already registered by the other plugin, a race condition occurs and triggers an ESXi host failure.
  • ESXi host fails to rejoin VMware Virtual SAN cluster after a reboot
    Attempts to rejoin the VMware Virtual SAN cluster manually after a reboot might fail with the following error:
    Failed to join the host in VSAN cluster (Failed to start vsantraced (return code 2)
  • Virtual SAN Disk Rebalance task halts at 5% for more than 24 hours
    The Virtual SAN Health Service reports Virtual SAN Disk Balance warnings in the vSphere Web Client. When you click Rebalance disks, the task appears to halt at 5% for more than 24 hours.

It’s also worth reading through the Known Issues section as there is a fair bit to be aware of, especially if running NFS 4.1, and it’s worth looking through the general storage issues.

Happy upgrading!

References:

http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-vcenter-server-60u3-release-notes.html

http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-esxi-60u3-release-notes.html

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2149127

Upgrading Windows vCenter 5.5 to 6.0 In-Place: Issues and Fixes

Yes, that’s not a typo…this post is focusing on upgrading Windows vCenter 5.5 to 6.0 via an in-place upgrade. There is the option to use the vSphere 6.0 Update 2m build with the included Migrate to VCSA tool to achieve this and move away from Windows, but I thought it was worth documenting my experiences with taking a mature vCenter that’s at version 5.5 Update 2 and upgrading it to 6.0 Update 2. Eventually this vCenter will need to move off the current Windows 2008 RTM server, which will bring into play the VCSA migration, however for the moment it’s going to be upgraded to 6.0 on the same server.

With VMware releasing vSphere 6.5 in November there should be an increased desire for IT shops to start seriously thinking about moving on from their existing vSphere versions and upgrading to the latest 6.5 release. However, many people I know are still running vSphere 5.5, so the jump to 6.5 directly might not be possible due to internal policies or other business reasons. Interestingly, in rough numbers, I’ve got an active Twitter poll out at the moment which after 100 votes shows that vSphere 5.5 is the most common vCenter version at 53%, followed by 6.0 with 44% and 6.5 with only 3%.

Upgrade Options:

You basically have two options to upgrade a Windows based 5.5 vCenter:

My approach for this particular environment (which is a NestedESXi lab environment) was to ensure a smooth upgrade to vSphere 6.0 Update 2 and then look to upgrade again to 6.5 once it thaws out in the market. That said, I haven’t read about too many issues with vSphere 6.5 and VMware have been excellent in ensuring that the 6.5 release was the most stable for years. The cautious approach will still be undertaken by many, and a stepped upgrade to 6.5 and migration to the VCSA will be commonplace. For those that wish to move away from their Windows vCenter, there is nothing stopping you from going down the Migrate2VCSA path, and it is possible to migrate directly from 5.5 to 6.5.

Existing Component Versions:

  • vCenter 5.5 (2001466)
  • ESXi 5.5 (3116895)

SQL Version Requirements:

vCenter 6.0 Update 2 requires SQL Server 2008 R2 SP1 or higher, so if you are running anything lower than that you will need to upgrade to a later service pack or to a later version of SQL Server. For a list of all compatible databases click here.
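A quick way to check this is to compare the product version string reported by the SQL instance against the minimum build. The sketch below assumes 10.50.2500.0 as the build number for SQL Server 2008 R2 SP1, which is the figure commonly listed for that service pack and should be confirmed against Microsoft’s own build list; the function names are hypothetical.

```python
# Assumed build number for SQL Server 2008 R2 SP1 (verify before relying on it).
MIN_SQL_2008_R2_SP1 = (10, 50, 2500, 0)


def parse_version(text):
    """Turn a dotted version string such as '10.50.6000.34' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(text, minimum=MIN_SQL_2008_R2_SP1):
    """True if the reported version is at or above the minimum build."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    for reported in ("10.50.6000.34", "10.50.1600.1"):
        print(reported, "OK" if meets_minimum(reported) else "needs a service pack")
```

Comparing tuples of integers avoids the trap of comparing the strings directly, where "10.50.900" would sort after "10.50.2500".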

vCenter Upgrade Pre-Upgrade Checks:

First step is to make sure you have a backup of the vCenter environment meaning VM state (Snapshot) and vCenter database backup. Once that’s done there are a few pre-requisites that need to be met and that will be checked by the upgrade process before the actual upgrade occurs. The first thing the installer will do after asking for the SSO and VC service account password is run the Pre-Upgrade Checker.

vCenter SSL and SSO SSL System Name Mismatch Error:

A common issue that may pop up from the pre-upgrade checker is the warning below talking about an issue with the system name of the vCenter Server certificate and the SSO certificate. As shown below it’s a hard stop and tells you to replace one or the other certificate so that the same system name is used.

If you have a publicly signed SSL Certificate you will need to generate a new cert request and submit that through the public authority of choice. The quickest way to achieve this for me was to generate a new self-signed certificate by following the VMware KB article here. Once that’s been generated you can replace the existing certificate by following a previous post I did using the VMware SSL Certificate Updater Tool.

After all that, in any case I got the warning below saying that the 5.5 SSL Certificates do not meet security requirements, and so new SSL certificates will need to be generated for vCenter Server 6.0.0.

With that, my suggestion would be to generate a temporary self signed certificate for the upgrade and then apply a public certificate after that’s completed.

Ephemeral TCP Port Error:

Once the SSL mismatch error has been sorted you can run the pre-upgrade checker again. Once that completes successfully you move onto the Configure Ports window. I ran into the error shown below that states that the range of ports is too large and the system must be reconfigured to use a smaller ephemeral port range before the install can continue.

The fix is presented in the error message so after running netsh.exe int ipv4 set dynamicportrange tcp 49152 16384 you should be ok to hit Next again and continue the upgrade.
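The two numbers in that command are a start port and a port count rather than a start and an end, which is easy to misread. As a small worked example, the arithmetic below shows the range that a start of 49152 and a count of 16384 produces; the function name is my own.

```python
def dynamic_port_range(start, count):
    """Return the (first, last) ports covered by a start port and a port count."""
    end = start + count - 1
    if not (1 <= start <= end <= 65535):
        raise ValueError("range must fit within ports 1-65535")
    return start, end


if __name__ == "__main__":
    first, last = dynamic_port_range(49152, 16384)
    print(f"Ephemeral ports: {first}-{last}")
```

So the command sets the ephemeral range to the top 16384 ports, ending at the highest valid TCP port.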

Export of 5.x Data:

During the upgrade the 5.5 data is stored in a directory and then migrated to 6.0. You need to ensure that you have enough room on the drive location to cater for your vCenter instance. While I haven’t seen any official rules around the storage required, I would suggest having free storage at least equal to the size of your vCenter SQL database data file.
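That rule of thumb is easy to script before kicking off the installer. The sketch below compares free space on the export drive with the size of the database data file; the helper names and the optional headroom multiplier are my own assumptions, not an official sizing rule.

```python
import shutil


def has_room_for_export(free_bytes, db_file_bytes, headroom=1.0):
    """True if free space covers the database data file size times a headroom factor."""
    return free_bytes >= db_file_bytes * headroom


def check_export_drive(path, db_file_bytes, headroom=1.0):
    """Check the free space on the drive holding `path` against the database size."""
    free = shutil.disk_usage(path).free
    return has_room_for_export(free, db_file_bytes, headroom)


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical 20 GiB data file checked against the current drive.
    print(check_export_drive(".", 20 * GIB))
```

Raising the headroom above 1.0 is a cheap way to leave a margin for logs and temporary files written during the export.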

vCenter Upgrade:

Once you have worked through all the upgrade screens you are ready for upgrade. Confirm the settings, take note of the fact that once updated the vCenter will be in evaluation mode, meaning you need to apply a new vCenter 6.x license once completed, check the checkbox that states you have a backup of the vCenter machine and database and you should be good to go.

Depending on the size of your vCenter instance and the speed of your disks the upgrade can take anywhere from 30 to 60 minutes or longer. If at any time the upgrade process fails during the initial export of the 5.5 data a rollback via the installer is possible…however if there is an issue while 6.0 is being installed the likelihood is that you will need to recover from backups.

Post Upgrade Checks:

Apart from making sure that the upgrade has gone through smoothly by ensuring all core vCenter services are up and running, it’s important to check any VMware or third party services that were registered against the vCenter, especially given that the SSL Certificate has been replaced a couple of times. Server applications like NSX-v, vCloud Director and vCO explicitly trust SSL certificates so the registration needs to be actioned again. Also, if you are running Veeam Backup & Replication you will need to go through the setup process again to accept the new SSL Certificate, otherwise your backup jobs will fail.

If everything has gone as expected you will have a functional vCenter 6.0 Update 2 instance and planning can now take place for the 6.5 upgrade and in my case…the migration from Windows to the VCSA.

References:

http://www.vmware.com/resources/compatibility/sim/interop_matrix.php#db&2=998

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1029944

 

vSphere 6.5 – Whats in it for Service Providers Part 1

Last week after an extended period of development and beta testing VMware released vSphere 6.5. This is a lot more than a point release and is a major upgrade from vSphere 6.0. In fact, there is so much packed into this new release that there is an official whitepaper, linked from the release notes, listing all the features and enhancements. I thought I would go through some of the key features and enhancements that are included in the latest versions of vCenter and ESXi, and as per usual I’ll go through those improvements that relate back to the Service Providers that use vSphere as the foundation of their Managed or Infrastructure as a Service offerings.

Generally the “what’s new” would fit into one post, however having gotten through just the vCenter features it became apparent that this would have to be a multi-post series…this is great news for vCloud Air Network Service Providers out there as it means there is a lot packed in for IaaS providers and MSPs to take advantage of.

With that, this post will cover the following:

  • vCenter 6.5 New Features
  • vCD and NSX Compatibility
  • Current Known Issues

vCenter 6.5 New Features:

Without question the enhancements to the VCSA stand out as one of the biggest features of 6.5 and, as mentioned in the whitepaper, the installer process has been overhauled and is a much smoother, more streamlined experience than with previous versions. It’s also supported across more operating systems, and the 6.5 VCSA now surpasses the Windows version by offering the migration tool, native high availability and built-in backup and restore. One interesting side note to the new VCSA is that the HTML5 vSphere Client has shipped, though it’s still very much a work in progress with a lot of unsupported functionality listed in the release notes…there is lots of work to do to bring it up to parity with the Flex Web Client.

In terms of the inbuilt PostgreSQL database I think it’s time that Service Providers feel confident in making the switch away from MSSQL (which was the norm with Windows based vCenters) as the enhanced VCSA Management Interface (found on port 5480) has a new monitoring screen showing information relating to disk space usage and also provides a way to gracefully start and stop the database engine.

Another vCenter enhancement that Service Providers will make use of is the High Availability feature, which is something a lot of people have been asking for for a long time. For me, I always dealt with the no-HA constraint by accepting that vCenter may become unavailable for 5-10 minutes during maintenance, or at worst suffer an extended outage while recovering from a VM or OS level failure. Knowing that hosts and VMs are still working and responding with vCenter down, leaving only core management functionality unavailable, it was a risk that I and others were willing to take. However, in this day of the always-on datacenter it’s expected that management functionality be as available as the IaaS services…so with that, this HA feature is very welcome for Service Providers.

This native HA solution is available exclusively for the VCSA and the solution consists of active, passive, and witness nodes that are cloned from the existing vCenter Server instance. The HA cluster can be enabled, disabled, or destroyed at any time. There is also a maintenance mode that prevents planned maintenance from causing an unwanted failover.

The VCSA Migration Tool that was previously released in 6.0 Update 2m is shipped in the VCSA ISO and can be used to migrate from Windows based 5.5 vCenters to the 6.5 VCSA. Again this is something that more and more service providers will take advantage of as the reliance on Windows based vCenters and MSSQL becomes increasingly unwanted from a manageability and cost point of view. Throw in the enhanced features that have only been released for the VCSA and this is a migration that all service providers should be planning.

To complete the move away from any Windows based dependencies the vSphere Update Manager has also been fully integrated into the VCSA. VUM is now fully integrated into the Web Client UI and is enabled by default. For larger environments with a large number of hosts, AutoDeploy is now fully manageable from the VCSA UI and doesn’t require PowerCLI to manage or configure its options. There is a new image builder included in the UI that can hit local or public repositories to pull images or drivers, and there are performance enhancements during deployments of ESXi images to hosts.

vCD and NSX Compatibility:

Shifting from new features and enhancements to an important subject when talking about service provider platforms…VMware product compatibility. For those vCAN Service Providers running a Hybrid Cloud you should be running a combination of vCloud Director SP and/or NSX-v, and at the moment there is no support for either in vSphere 6.5. No compatible versions of NSX are available for vSphere 6.5. If you attempt to prepare your vSphere 6.5 hosts with NSX 6.2.x, you receive an error message and cannot proceed.

I haven’t tested whether vCloud Director SP will connect and interact with vCenter 6.5 or ESXi 6.5. However, as it’s not supported, I wouldn’t suggest upgrading production IaaS platforms until the interoperability matrices are updated.

At this stage there is no word on when either product will support vSphere 6.5, but I suspect we will see NSX-v come out with a supported build shortly…though I’m expecting vCloud Director SP to not support 6.5 until the next major version release, which is looking like the new year.

Installation and Upgrade Known Issues:

Having read through the release notes, there are also a number of known issues you should be aware of. I’ve gone through those and pulled the ones I consider the most likely to be impactful to IaaS platforms.

  • After upgrading to vCenter Server 6.5, the ESXi hosts in High Availability clusters appear as Not Ready in the VMware NSX UI
    If your vSphere environment includes NSX and clusters configured with vSphere High Availability, after you upgrade to vCenter Server 6.5, both NSX and vSphere High Availability start installing VIBs on all hosts in the clusters. This might cause installation of NSX VIBs on some hosts to fail, and you see the hosts as Not Ready in the NSX UI.
    Workaround: Use the NSX UI to reinstall the VIBs.
  • Error 400 during attempt to log in to vCenter Server from the vSphere Web Client
    You log in to vCenter Server from the vSphere Web Client and log out. If, after 8 hours or more, you attempt to log in from the same browser tab, the following error results.
    400 An Error occurred from SSO. urn:oasis:names:tc:SAML:2.0:status:Requester, sub status:null
    Workaround: Close the browser or the browser tab and log in again.
  • Using storage rescan in environments with the large number of LUNs might cause unpredictable problems
    Storage rescan is an IO intensive operation. If you run it while performing other datastore management operation, such as creating or extending a datastore, you might experience delays and other problems. Problems are likely to occur in environments with the large number of LUNs, up to 1024, that are supported in the vSphere 6.5 release.
    Workaround: Typically, storage rescans that your hosts periodically perform are sufficient. You are not required to rescan storage when you perform the general datastore management tasks. Run storage rescans only when absolutely necessary, especially when your deployments include a large set of LUNs.
  • In vSphere 6.5, the name assigned to the iSCSI software adapter is different from the earlier releases
    After you upgrade to the vSphere 6.5 release, the name of the existing software iSCSI adapter, vmhbaXX, changes. This change affects any scripts that use hard-coded values for the name of the adapter. Because VMware does not guarantee that the adapter name remains the same across releases, you should not hard code the name in the scripts. The name change does not affect the behavior of the iSCSI software adapter.
    Workaround: None.
  • The bnx2x inbox driver that supports the QLogic NetXtreme II Network/iSCSI/FCoE adapter might cause problems in your ESXi environment
    Problems and errors occur when you disable or enable VMkernel ports and change the failover order of NICs for your iSCSI network setup.
    Workaround: Replace the bnx2x driver with an asynchronous driver. For information, see the VMware Web site.
  • When you use the Dell lsi_mr3 driver version 6.903.85.00-1OEM.600.0.0.2768847, you might encounter errors
    If you use the Dell lsi_mr3 asynchronous driver version 6.903.85.00-1OEM.600.0.0.2768847, the VMkernel logs might display the following message:
    ScsiCore: 1806: Invalid sense buffer.
    Workaround: Replace the driver with the vSphere 6.5 inbox driver or an asynchronous driver from Broadcom.
  • Storage I/O Control settings are not honored per VMDK
    Storage I/O Control settings are not honored on a per VMDK basis. The VMDK settings are honored at the virtual machine level.
    Workaround: None.
  • Cannot create or clone a virtual machine on a SDRS-disabled datastore cluster
    This issue occurs when you select a datastore that is part of a SDRS-disabled datastore cluster in any of the New Virtual Machine, Clone Virtual Machine (to virtual machine or to template), or Deploy From Template wizards. When you arrive at the Ready to Complete page and click Finish, the wizard remains open and nothing appears to occur. The Datastore value status for the virtual machine might display “Getting data…” and does not change.
    Workaround: Use the vSphere Web Client for placing virtual machines on SDRS-disabled datastore clusters.
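
On the iSCSI adapter naming issue above, one way to keep scripts from breaking is to look the adapter name up at run time instead of hard-coding vmhbaXX. The snippet below is a minimal sketch of that idea, not anything taken from the release notes. It assumes the script already has the text output of `esxcli iscsi adapter list` from the host, that the output is a whitespace-separated table with the adapter name in the first column and the driver in the second, and that the software iSCSI adapter reports the `iscsi_vmk` driver. The function name and the sample table are made up for illustration, so check the output on your own hosts before relying on it.

```python
def software_iscsi_adapters(esxcli_output: str) -> list[str]:
    """Return the names of adapters whose driver column is iscsi_vmk.

    Assumes a whitespace-separated table: <adapter> <driver> <state> <uid> <description...>
    """
    names = []
    for line in esxcli_output.splitlines():
        fields = line.split()
        # Header and separator rows never have "iscsi_vmk" in the second column.
        if len(fields) >= 2 and fields[1] == "iscsi_vmk":
            names.append(fields[0])
    return names


# Illustrative sample only; adapter numbers and rows on a real host will differ.
SAMPLE = """\
Adapter  Driver     State   UID            Description
-------  ---------  ------  -------------  ----------------------
vmhba2   bnx2i      online  iscsi.vmhba2   Hardware iSCSI Adapter
vmhba65  iscsi_vmk  online  iscsi.vmhba65  iSCSI Software Adapter
"""

print(software_iscsi_adapters(SAMPLE))
```

The rest of a script can then use whatever name comes back, so an upgrade that renames the adapter doesn’t require editing the script.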

These are just a few that I have singled out…it’s worth reading through all the known issues in case any specific ones might impact you.

In the next post in this vSphere 6.5 for Service Providers series I will cover more vCenter features as well as ESXi enhancements and what’s new in Core Storage.

References:

http://pubs.vmware.com/Release_Notes/en/vsphere/65/vsphere-esxi-vcenter-server-65-release-notes.html

http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/whitepaper/vsphere/vmw-white-paper-vsphr-whats-new-6-5.pdf

http://pubs.vmware.com/Release_Notes/en/vsphere/65/vsphere-client-65-html5-functionality-support.html

Quick Post: Web Client vs VI Client Permissions with VCSA

I’ve been using the VCSA for a couple of years now, since the release of vSphere 5.5, and have been happily using the upgraded 6.0 version for a couple of my environments. As with most people I found the adjustment going from the VI Client to the new Web Client to be a little rough, and I do still find myself going between the two while performing different tasks and configuration actions.

I caught this tweet from Luis Ayuso overnight in which he asked whether I had found the answer to a question I had tweeted almost a year ago, meaning my tweet had come up as the best Google hit for the problem.

After Luis’s issues I decided to put together a very quick post outlining, in a basic way, what needs to be configured for like-for-like access in both the Web Client and the VI Client. In this scenario I have a single VM deployment of the 6.0 VCSA with a simple install of the Platform Services Controller, an SSO Domain configured, and the VCSA connected to a local Active Directory.

Let’s start by logging in with a user that’s got no permissions set but is a member of the AD domain. As you can see the Web Client will allow the user to log in but show an empty inventory…the VI Client gives you a “You Shall Not Pass!” response.

I then added the user to the AD Group that had been granted Administrator permissions in the VI Client at the top level.

These match what you see from the Web Client

Logging back into the VI Client the user now has full admin rights

However if you log into the Web Client you still get the Empty Inventory message. To get the user the same access in the Web Client as the VI Client you need to log into the Web Client using the SSO Admin account, head to Administration -> Users and Groups -> Groups and select the Administrators group in the main window. Under Group Members search the AD Domain for the user account or group and add to the membership.

Now when you log into the Web Client with the user account you should see the full inventory and have admin access to perform tasks on vCenter Objects.

This may not be the 100% best-practice way to achieve the goal, but it works, and you should consider permission structures for vCenter relative to your requirements.

Dealing with a Revoked vCenter SSL Certificate

Certificates and VMware don’t go together like a horse and carriage… And while I’ve never really had a major issue with SSL certs in VMware, mainly because on a personal level I am ok with using self-signed or default certificates (cue security nuts), I was forced recently to change a publicly signed vCenter SSL Certificate which also doubled as the Web Client SSL Certificate. This was due to VeriSign revoking the certificate that had been purchased on a per year renewal plan…the vCenter Client doesn’t like revoked certs.

Prior to vSphere 5.5 my usual trick was to simply replace the rui.crt and rui.key files in the vCenter/Web Client SSL folder and restart vCenter. That no longer works…in fact the vCenter Service (5.5 Update 2) won’t start if it’s done that way…mainly due to the reliance on the SSO and Inventory services, which don’t like the SSL thumbprint being changed underneath them.

To resolve this I had to read through and learn how to use the VMware SSL Certificate Automation Tool. Once mastered it’s a great tool and lets you change/update all relevant vSphere SSL Certificates. Below is the quick and easy command line walkthrough to get the job done…note that you need to build up the SSL Certificate Chain correctly and make one small modification to the ssl-environment.bat file:

set ssl_tool_no_cert_san_check=1

A couple of vCenter and Web Client service restarts later and the SSL Certificate has been replaced. While there are a lot more options in the tool, I only needed two steps to replace the original publicly signed certificate, as all other certificates were the internally generated certs…As a specific heads up from the KB, these were the issues I ran into:

  • SSL Certificate Update fails if vCenter Single Sign-On Password contains spaces or special characters such as &, ^, %, <
    If the vCenter Single Sign-On password has a space or any special characters, such as &, ^, %, or <, the configuration of the Inventory service fails.
    To work around this issue, change the vCenter Single Sign-On password so it does not contain a space or any of the special characters &, ^, %, < in it.

  • If the certificate chain file for vCenter Single Sign-On is out-of-order, you see an error similar to:
    Certificate chain is incomplete: the root authority certificate is not present and could not be detected automatically. The presence of the root certificate is required so the other service can establish trust to this service. Try adding the authority certificate manually.
    To resolve this issue, ensure that the certificate chain file for vCenter Single Sign-On is created in the correct order. For more information, see Generating certificates for use with the VMware SSL Certificate Automation Tool (2044696).
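
Both of the issues above can be checked before running the tool. The following is a rough pre-flight sketch of my own, not part of the SSL Certificate Automation Tool. The forbidden character set comes straight from the KB text quoted above. The chain ordering assumes the chain file should list the service certificate first, then any intermediates, with the root CA last; confirm the exact order against KB 2044696. The function names and the subject/issuer records are hypothetical, and in practice you would read those values from the certificates themselves.

```python
# Characters the KB says break the Inventory service configuration.
FORBIDDEN = set(" &^%<")


def bad_password_chars(password):
    """Return the sorted list of problem characters found in the password."""
    return sorted(set(password) & FORBIDDEN)


def order_chain(certs):
    """Order certificate records leaf first, root last.

    Each record is a dict with 'subject' and 'issuer' keys (hypothetical shape).
    Raises ValueError if the chain is ambiguous or an issuer is missing.
    """
    by_subject = {c["subject"]: c for c in certs}
    # Subjects that sign some other certificate (self-signed roots excluded).
    issuers = {c["issuer"] for c in certs if c["issuer"] != c["subject"]}
    leaves = [c for c in certs if c["subject"] not in issuers]
    if len(leaves) != 1:
        raise ValueError("expected exactly one leaf certificate")
    chain = [leaves[0]]
    # Walk issuer links until a self-signed (root) certificate is reached.
    while chain[-1]["issuer"] != chain[-1]["subject"]:
        parent = by_subject.get(chain[-1]["issuer"])
        if parent is None:
            raise ValueError("chain is incomplete: missing " + chain[-1]["issuer"])
        if len(chain) >= len(certs):
            raise ValueError("certificate chain contains a loop")
        chain.append(parent)
    return chain


# Made-up example records, deliberately out of order.
example = [
    {"subject": "Example Root CA", "issuer": "Example Root CA"},
    {"subject": "vcenter.example.local", "issuer": "Example Intermediate CA"},
    {"subject": "Example Intermediate CA", "issuer": "Example Root CA"},
]

print(bad_password_chars("Pass word&1"))
print([c["subject"] for c in order_chain(example)])
```

If the root certificate is left out of the input, `order_chain` raises an error rather than returning a partial chain, which is the same condition the tool complains about.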

References:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2057340

https://my.vmware.com/group/vmware/details?productId=351&downloadGroup=SSLTOOL550

Quick Fix: vCenter 5.5 Update 3x Phone Home Warning and VPXD Service not Starting

This week I’ve been upgrading vCenter in a couple of our labs and came across this issue during and after the upgrade of vCenter from 5.5 Update 2 to 5.5 Update 3a or 3b. During the upgrade of the vCenter the error below is thrown.

It’s an easy one to ignore as it only relates to the Phone Home Service…which to be honest I didn’t think was important at the time. When you click OK the installer finishes and reports success, however the vCenter Service is not brought up automatically, and when you go to start the service you get the following error from the services manager.

Googling this particular error wasn’t straightforward. If you search for Error 1053 or Error 1053 + VMware you get pointed to some generic forum threads and this VMware KB, which is a red herring in relation to this error. With that I went back and searched for the Phone Home Warning 32014 and got a hit against this VMware KB, which contains the exact error and a reference to the deployPkg.dll that you would see in the Windows Application Event Logs when you try to start the vCenter Service.

The KB title is a little misleading in that it states

Updating vCenter Server 5.5 to Update 3 fails with the error: Warning 32014

However the fix is the right fix, and after working through the workaround in the KB the upgrades went through without issue and vCenter was at 5.5 Update 3b.

References:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2134141

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2069296

Quick Fix: VCSA Web Client 6.0 Throws Monitoring Errors

This quick fix post is for those out there who are still using vCenter Operations Manager 5.8.x and are, or are thinking about, deploying or upgrading to vCenter 6.0…I came across this annoying situation all of a sudden while working on a new vCenter instance when the Web Client started to report the error shown below.

This can be ignored by clicking no and you will still be able to operate most areas of the Web Client but you will find that Monitoring and Health pages fail to load and give you a generic Error #2036 as shown below.

It took me a while to realize that the error was related specifically to the monitoring modules, and it finally clicked in my head that the error started happening when I registered the vCenter against my lab vCOPs instance. I was still running vCOPs (not vROps) and the instance hadn’t been upgraded to the latest build. Having a look through the VMware KBs I came across KB 2111224, which explained the cause.

This issue occurs because vRealize Operations Manager versions prior to 5.8.5 are not supported in the vSphere 6.0 environment.

Upgrading the vCOPs Appliances to build 5.8.5-2532416 sorted the issue and I was able to browse through the Web Client without the error and have the integrated Health Monitoring work without issue.
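
The KB rule is simple enough to check in a script before registering vCenter 6.0 against an existing instance. This is a minimal sketch under my own assumptions: the build string is in the `version-build` form shown above, and the only rule applied is the 5.8.5 minimum quoted from the KB. The function names are made up for illustration.

```python
# Minimum version the KB quotes as supported in a vSphere 6.0 environment.
MIN_SUPPORTED = (5, 8, 5)


def parse_version(build_string):
    """Turn '5.8.5-2532416' into (5, 8, 5), ignoring the build number."""
    version = build_string.split("-", 1)[0]
    return tuple(int(part) for part in version.split("."))


def supported_with_vsphere_60(build_string):
    """True if the version is 5.8.5 or later."""
    return parse_version(build_string) >= MIN_SUPPORTED


for build in ("5.8.4-2199700", "5.8.5-2532416"):
    print(build, supported_with_vsphere_60(build))
```

The first build string in the loop is a placeholder rather than a real build number; only the version part matters to the check.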

References:

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2111224&sliceId=1&docTypeID=DT_KB_1_1&dialogID=850159080&stateId=0%200%20850161293

vSphere 5.5 Update 3 Released: Features and Top Fixes

vSphere 5.5 Update 3 was released earlier today and there are a bunch of bug fixes and feature improvements in this update release for both vCenter and ESXi. For most Service Providers updating to vSphere 6.0 is still a while away so it’s good to have continued support and improvement for the 5.5 platform. I’ve scanned through the release notes and picked out what I consider some of the more important bug fixes and resolved issues as they pertain to my deployments of vSphere.

Note: Still appears that there is no resolution to the vMotion errors I reported on earlier in the year or the bugs around the mClock Scheduler and IOPS Limiter on NFS.

ESXi 5.5 Update 3:

  • Status of some disks might be displayed as UNCONFIGURED GOOD instead of ONLINE
    Status of some disks on an ESXi 5.5 host might be displayed as UNCONFIGURED GOOD instead of ONLINE. This issue occurs for LSI controller using the LSI CIM provider.
  • Cloning CBT-enabled virtual machine templates from ESXi hosts might fail
    Attempt to clone CBT-enabled virtual machines templates simultaneously from two different ESXi 5.5 hosts might fail. An error message similar to the following is displayed:
    Failed to open VM_template.vmdk': Could not open/create change tracking file (2108).
  • ESXi hosts with the virtual machines having e1000 or e1000e vNIC driver might fail with a purple screen
    ESXi hosts with the virtual machines having e1000 or e1000e vNIC driver might fail with a purple screen when you enable TCP segmentation Offload (TSO). Error messages similar to the following might be written to the log files:
    cpu7:nnnnnn)Code start: 0xnnnnnnnnnnnn VMK uptime: 9:21:12:17.991
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0x65b stack: 0xnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0x18ab stack: 0xnnnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0xa2 stack: 0xnnnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0xae stack: 0xnnnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0x488 stack: 0xnnnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0x60 stack: 0xnnnnnnnnnnnnnnn
    cpu7:nnnnnn)0xnnnnnnnnnnnn:[0xnnnnnnnnnnnn][email protected]#nover+0x185 stack: 0xnnnnnnnnnnnn
  • Attempts to reboot Windows 8 and Windows 2012 server on ESXi host virtual machines might fail
    After you reboot, the Windows 8 and Windows 2012 Server virtual machines might become unresponsive when the Microsoft Windows boot splash screen appears. For more information, see Knowledge Base article 2092807.
  • Attempts to install or upgrade VMware Tools on a Solaris 10 Update 3 virtual machine might fail
    Attempts to install or upgrade VMware Tools on a Solaris 10 Update 3 virtual machine might fail with the following error message:
    Detected X version 6.9
    Could not read /usr/lib/vmware-tools/configurator/XOrg/7.0/vmwlegacy_drv.so
    Execution aborted.
    This issue occurs if the vmware-config-tools.pl script copies the vmwlegacy_drv.so file, which should not be used in Xorg 6.9.

In going through the remaining Known Issues you come across a lot of Flash Read Cache related problems…maybe VMware should call it a day with this feature…not sure if anyone has the balls to actually use it in production…be interested to hear? There are also a lot of VSAN issues still being reported as known with workarounds in place…all the more reason to start a VSAN journey with vSphere 6.0.

For a look at what’s new and for the release notes in full…click on the links below:

VMware ESXi™ 5.5 Update 3 | 16 SEP 2015 | Build 3029944

vCenter Server 5.5 Update 3 | 16 SEP 2015 | Build 3000241
