Category Archives: VMware

Workaround – VCSA 6.7 Upgrade Fails with CURL Error: Couldn’t resolve host name

It’s never an issue with DNS! Even when DNS looks right…it’s still DNS! I came across an issue today trying to upgrade a 6.5 VCSA to 6.7. The new VCSA appliance deployment was failing with an OVFTool error suggesting that DNS was incorrectly configured.

Initially I used the FQDN for the source and target vCenters and let the installer choose the underlying host to deploy the new VCSA appliance to. Even though everything checked out fine in terms of DNS resolution across all systems, I kept getting the failure. I triple checked name resolution on the machine running the update, both vCenters and the target hosts. I even tried using IP addresses for the source and target vCenter, but the error remained as the installer still tried to connect to the vCenter-controlled host via its FQDN, resulting in the error.

After a quick Google search turned up nothing, I changed the target to be an ESXi host directly and used its IP address instead of its FQDN. This time OVFTool was able to do its thing and deploy the new VCSA appliance.

The one caveat when deploying directly to a host rather than a vCenter is that you need to have the target PortGroup configured as ephemeral…but that’s a general rule of bootstrapping a VCSA in any case, and it’s the only type that will show up in the drop-down list.

While very strange given all DNS checked out as per my testing, the workaround did its thing and allowed me to continue with the upgrade. This didn’t find the root cause…however when you need to motor on with an upgrade, a workaround is just as good!

Adding Let’s Encrypt SSL Certificate to vCloud Director Keystore

For the longest time, configuring vCloud Director’s SSL certificate keystore has been the thing that makes vCD admins shudder. There are lots of posts on the process…some good…some not so good. I even have a post from way back in 2012 about fronting vCD with a Citrix NetScaler and, if I am honest, I cheated by having the load balancer handle the SSL certificate while leaving vCD configured with the self-signed cert. With the changes to the way the HTML5 Tenant Portal deals with certs and DNS, I’m not sure that method would even work today.

I wanted to update the self-signed certs in both my lab environments to assist in resolving the No Datacenters are available issue that cropped up in vCD 9.1. Instead of generating and using self-signed certs I decided to try Let’s Encrypt signed certs. Most of the process below is courtesy of blog posts from Luca Dell’Oca, and it’s worth looking at this blog post from Tom Fojta, who has a PowerShell script to automate Let’s Encrypt SSL certs for use on NSX Edge load balancers.

In my case, I wanted to install the cert directly into the vCD cell keystore. The manual end-to-end process is listed below. I intend to try and automate this process to overcome the one constraint with using Let’s Encrypt…the 90 day lifespan of the certs. I think that is acceptable as it ensures the validity of the SSL cert, and it’s a fair caveat given the main use case here is lab environments.

Generating the Signed SSL Cert from Let’s Encrypt:

To complete this process you need the ACMESharp PowerShell module. There are a couple of steps to follow: register the domain you want to create the SSL cert against, then trigger a verification challenge, which can be satisfied by creating a DNS TXT record as shown in the output of the challenge command. Once submitted, you need to look out for a Valid status response.
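
To give an idea of what that looks like end to end, below is a rough sketch of the ACMESharp flow. Treat it as a guide only…the hostname, alias and contact address are placeholders from my lab, and the exact parameters are worth checking against Luca’s post.

```powershell
# Rough sketch of the ACMESharp dns-01 flow (hostname, alias and contact are placeholders)
Import-Module ACMESharp
Initialize-ACMEVault
New-ACMERegistration -Contacts mailto:admin@pslab.local -AcceptTos

# Register the identifier (the FQDN the cert will be issued for)
New-ACMEIdentifier -Dns vcd.pslab.local -Alias vcd

# Generate the dns-01 challenge...the output shows the TXT record that needs to be created
Complete-ACMEChallenge vcd -ChallengeType dns-01 -Handler manual

# Once the TXT record is in place, submit the challenge and poll until the status comes back valid
Submit-ACMEChallenge vcd -ChallengeType dns-01
(Update-ACMEIdentifier vcd -ChallengeType dns-01).Challenges | Where-Object { $_.Type -eq "dns-01" }
```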

Once complete, there is a script that can be run as shown on Luca’s blog. I’ve added to the script to automatically import the newly created SSL cert into the Local Computer certificate store.

From here, I exported the certificate with the private key so that I was left with a PFX file. I also saved the Root and Intermediate certs that form the whole chain in Base-64 X.509 format. This is required to help resolve the No Datacenters are available error mentioned above. Upload the three files to the vCD cell and continue as shown below.
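
For completeness, the tail end of that process looks something like the sketch below…issue the cert, export the PFX and the issuer chain, and drop the cert into the Local Computer store. Paths, aliases and the password are placeholders of mine, and the Let’s Encrypt root cert can be grabbed separately if the full chain is needed.

```powershell
# Issue the certificate against the validated identifier
New-ACMECertificate vcd -Generate -Alias vcdCert
Submit-ACMECertificate vcdCert
Update-ACMECertificate vcdCert

# Export the cert and private key as a PFX, plus the issuer (intermediate) cert in Base-64 PEM format
$pfxPass = "VMware1!"
Get-ACMECertificate vcdCert -ExportPkcs12 "C:\Certs\vcd.pfx" -CertificatePassword $pfxPass
Get-ACMECertificate vcdCert -ExportIssuerPEM "C:\Certs\intermediate.pem"

# Import the PFX into the Local Computer certificate store
Import-PfxCertificate -FilePath "C:\Certs\vcd.pfx" -CertStoreLocation Cert:\LocalMachine\My `
    -Password (ConvertTo-SecureString $pfxPass -AsPlainText -Force)
```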

Importing Signed SSL from Let’s Encrypt into vCD Keystore:

Next come the steps to take on the vCD cell, which can be the most complex to follow, and this is where I have seen different posts do different things. Below are the commands from start to finish that worked for me…see inline for comments on what each command is doing.

Once that has been done and the vCD service has restarted, the SSL cert has been applied, everything is green, and the Let’s Encrypt SSL cert is in play.

Released: vCloud Director 9.1.0.1 – API Tweaks and Resolved Issues

A point release of vCloud Director 9.1 (9.1.0.1 Build 8825802) came out last week, bringing with it an updated Java Runtime plus new API functions that allow additional configuration of advanced settings for virtual machines. There were also a number of bug fixes from the initial 9.1 release earlier in the year. Some of the issues that are resolved are significant and worth looking into if you have 9.1 GA deployed.

I haven’t been able to find an exact list of the new API functions, however traversing the Org Admin rights API call I did spot something new relating to Latency as shown below.

And when I granted this right through the API mechanism, I was able to allocate the right to the Org Admin via the administrator web interface.
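
For anyone wanting to replicate this, below is a rough, untested sketch of how a right can be added to an organisation through the vCloud API from PowerShell. The host name, org UUID and right name are placeholders, it assumes API version 30.0, and it uses the /api/admin/org/{id}/rights endpoint that has been used for granular rights management since the 8.20 days…certificate validation may also need relaxing if the cell still has a self-signed cert.

```powershell
$vcd  = "https://vcd.pslab.local"
$cred = Get-Credential    # administrator@system
$auth = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(
        "{0}:{1}" -f $cred.UserName, $cred.GetNetworkCredential().Password))

# Log in and grab the session token from the response headers
$login = Invoke-WebRequest -Uri "$vcd/api/sessions" -Method Post -Headers @{
    Accept = "application/*+xml;version=30.0"; Authorization = "Basic $auth" }
$headers = @{ Accept = "application/*+xml;version=30.0"
              "x-vcloud-authorization" = $login.Headers["x-vcloud-authorization"] }

# The admin root lists every right in the system...find the new latency right
[xml]$admin = (Invoke-WebRequest -Uri "$vcd/api/admin" -Headers $headers).Content
$right = $admin.VCloud.RightReferences.RightReference | Where-Object { $_.name -like "*Latency*" }

# Pull the org's current rights, append a RightReference for the new right and PUT the list back
$rightsUri = "$vcd/api/admin/org/<org-uuid>/rights"
[xml]$orgRights = (Invoke-WebRequest -Uri $rightsUri -Headers $headers).Content
$ref = $orgRights.CreateElement("RightReference", $orgRights.DocumentElement.NamespaceURI)
$ref.SetAttribute("href", $right.href); $ref.SetAttribute("name", $right.name); $ref.SetAttribute("type", $right.type)
$orgRights.DocumentElement.AppendChild($ref) | Out-Null
Invoke-WebRequest -Uri $rightsUri -Method Put -Headers $headers -Body $orgRights.OuterXml `
    -ContentType "application/vnd.vmware.admin.org.rights+xml"
```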

I’m trying to get a list of all the new API rights that were added as part of this release and will update this post when I have them.

Some of the bigger issues that were resolved are listed below:

  • In vCloud Director Tenant Portal, the Configure Services tab is disabled for Advanced Edge Gateway. In vCloud Director Tenant Portal, you cannot configure Advanced Edge Gateway settings as an administrator with any of the Gateway Advanced Services rights.
  • When importing a virtual machine from vCenter Server, vCloud Director relocates it to the primary resource pool. When you import a virtual machine created on a non-primary cluster in vCenter Server to vCloud Director, the machine is always relocated to the primary cluster.
  • In the vCloud Director Tenant Portal, the administrator of one organization can see virtual machines that belong to other vCloud Director organizations. When you configure the organizations in vCloud Director to use an LDAP server for authentication, an administrator of one organization, who is logged in vCloud Director Tenant Portal, can see virtual machines that belong to other organizations.
  • Importing a virtual machine from the vCenter Server deletes the original virtual machine after cloning it. When importing a virtual machine from the vCenter Server to vCloud Director involves changing its datastore, the process consists in cloning the source virtual machine and deleting it, while effectively changing its Managed Object Reference (MoRef).
  • Enabling High Availability for existing edge gateways in a data center with installed NSX Edge 6.4.0 fails.  In a data center with installed NSX Edge 6.4.0, you cannot enable High Availability for existing edge gateways that belong to a datastore cluster with enabled Storage Distributed Resource Scheduler (SDRS).
  • vCloud Director Tenant Portal does not display existing organization virtual data centers. When you use a self-signed SSL certificate for vCloud Director and you log in to vCloud Director Tenant Portal, you do not see a list of the existing organization virtual data centers.

The rest can be found here.

Just to finish up, there is still a lingering issue from the GA release that changed the behaviour of the HTML5 Tenant UI in scenarios where self-signed SSL certificates are used, which is covered in this VMware KB. Even though (as shown above) it’s been listed as resolved…I have run into it again in two different installs.

Obviously, if you are using legit SSL certificates you won’t have the issue, however the workaround is not doing its thing for me. Hopefully I can resolve this ASAP as I am about to start some validation testing for Veeam and vCloud Director as well as start to test out our new functionality coming in Update 4 of Backup & Replication for Cloud Connect Replication.

For those with the correct entitlements…download here.

#LongLivevCD

References:

https://docs.vmware.com/en/vCloud-Director/9.1/rn/rel_notes_vcloud_director_9-1-0-1.html

vBrownBag TechTalks at VMworld 2018 – The Power to Catapult!

VMworld 2018 is fast approaching and in the last 24 hours, notifications were sent out to those lucky enough to have their session submissions accepted. Having been on the wrong side of that email multiple times, I understand the disappointment that comes with not having a session accepted. The great news about VMworld is that there is another way to get your session seen and heard…and that is through the vBrownBag TechTalks.

The TechTalks have been a staple at VMworld (and other industry conferences) for a number of years now. Last year saw a stepping up of the vBrownBag game by having the TechTalks listed in the VMworld Content Catalog. I’ve had the pleasure of presenting TechTalks at three VMworlds over the years. The first one was back in 2014, and I remember it being a significant milestone in my career…regardless of the fact it was just a TechTalk, it meant a lot to present at VMworld.

Make no mistake…these talks have the power and potential to catapult careers!

While the TechTalks offer an opportunity for folks that have not had sessions accepted, the real power of the talks is in offering a platform for the community to step up and present relevant, thought-leading content that generally isn’t driven by marketing. In many ways I see more value in these sessions than in the VMworld sessions proper, and there is a lot that can be taken away from them.

That said, it’s great to see a number of vendors sponsoring the TechTalks and, as per usual, Veeam is leading the way in our support of the community at VMworld. As of last week there were around 50 TechTalks submitted and the team expects to have space for over a hundred TechTalks across both VMworld conferences.

There is still plenty of time to submit your session, more information is in this post.

Here is a Playlist of the 2017 VMworld TechTalks. For those interested, there is a blog post by the vBrownBag team on what it takes to get a presentation live streamed and onto YouTube so fast…I found it a fascinating read.

Released: NSX-v 6.4.1 New Features and Fixes

Last week VMware released NSX-v 6.4.1 (Build 8599035), which contains some new features and addresses a number of issues from previous releases. I will go through the new features in more detail below, however a key mention is the fact that vSphere 6.7 is now supported, which also means vCloud Director can now be used with NSX-v 6.4.1 fully supported on vSphere 6.7. Prior to this release NSX-v only supported vSphere 6.5, meaning you couldn’t upgrade to vSphere 6.7 because vCloud Director is dependent on NSX-v.

There is also a small, but cool automatic backup feature introduced that backs up the state of the NSX Manager locally prior to the upgrade. Going through the release notes there are a lot of known issues that should be looked at and there are more than a few that apply to service providers.

The NSX User Interface continues to be enhanced and additional components added to the HTML5 Web Client. As you can see below, there are a lot more options in the HTML5 Web Client compared to the 6.4 base release…to reference that version menu, click here.

NSX User Interface

As you can see, the following VMware NSX features are now available through the HTML5 vSphere Client: Installation, Groups and Tags, Firewall, Service Composer, Application Rule Manager, SpoofGuard, IPFIX and Flow Monitoring. VMware is maintaining a web page that shows the current NSX for vSphere UI Plug-in Functionality.

Other enhancements to the User Interface include:

  • Firewall – UI Enhancements:
    • Improved visibility: status summary, action toolbar, view of group membership details from firewall table
    • Efficient rule creation: in-line editing, clone rules, multi-selection and bulk action support, simplified rule configuration
    • Efficient section management: drag-and-drop, positional insert of sections and rules, section anchors when scrolling
    • Undo operations: revert unpublished rule and section changes on UI client side
    • Firewall Timeout Settings: Protocol values are displayed at-a-glance, without requiring popup dialogs.
  • Application Rule Manager – UI Enhancements:
    • Session Management: View a list of sessions, and their corresponding status (collecting data, analysis complete) and duration.
    • Rule Planning: View summary counts of grouping objects and firewall rules; View recommendations for Universal Firewall Rules
  • Grouping Objects Enhancements:
    • Improved visibility of where the Grouping Objects are used
    • View list of effective group members in terms of VMs, IP, MAC, and vNIC
  • SpoofGuard – UI Enhancements:
    • Bulk action support: Approve or clear multiple IPs at a time

I really like how the HTML5 interface is coming along and I’m now using it as my primary tool over the Flex interface.

Other New Enhancements:

Looking at Security Services, there are improvements in the Firewall by way of additional Layer 7 application context support for Symantec LiveUpdate traffic, MaxDB SQL Server support, and support for web-based Git/version control. There is also extended Identity Firewall support for user sessions on RDP and application servers, which now covers Windows Server 2012 and 2012 R2 with specific VMware Tools versions.

The NSX Load Balancer now scales to 256 pool members, up from 32, which is a significant enhancement to an already strong feature of the NSX Edges. There are also a number of enhancements to the overall operations and troubleshooting pages.

Those with the correct entitlements can download NSX-v 6.4.1 here.

Special Upgrade and Supportability Notes:

  • vSphere 6.7 support: When upgrading to vSphere 6.7, you must first install or upgrade to NSX for vSphere 6.4.1 or later. See Upgrading vSphere in an NSX Environment in the NSX Upgrade Guide and Knowledge Base article 53710 (Update sequence for vSphere 6.7 and its compatible VMware products).
  • NSX for vSphere 6.1.x reached End of Availability (EOA) and End of General Support (EOGS) on January 15, 2017. (See also VMware knowledge base article 2144769.)
  • NSX for vSphere 6.2.x will reach End of General Support (EOGS) on August 20 2018.

References:

https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.4/rn/releasenotes_nsx_vsphere_641.html

 

Quick Fix: vSAN Health Reports iSCSI Target Service Stopped

A few weeks ago I wrote about using iSCSI as a backup repository target. While still running this POC in my environment I came across an error in the vSAN Health Checker stating the vSAN iSCSI target service was in a Failed state. Drilling down into the vSAN Health check tree I could see a Service Runtime status of stopped as shown below against the host.

This host had recently been marked as unreachable in vCenter and required a Management Agent reset to bring it back online. There is a chance that that process stopped the iSCSI Target service but did not start it. In any case there is an easy way to see the status of the services and then get them back online.

Once that’s been done, a re-run of the vSAN Health checker will show that the issue has been resolved and the iSCSI Target Service on the host is now running.
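
As a side note, the same health check can be kicked off from PowerCLI rather than the Web Client. A minimal sketch, assuming the vSAN health cmdlets that shipped with PowerCLI 6.5.1 and later, with the vCenter and cluster names being placeholders from my lab:

```powershell
# Re-run the vSAN health check for the cluster and review the summary (including the iSCSI target service test)
Connect-VIServer vcsa.pslab.local
Test-VsanClusterHealth -Cluster (Get-Cluster "vSAN-Cluster")
```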

References:

https://kb.vmware.com/s/article/2147603

 

vSphere 6.7 – What’s in it for Service Providers Part 1

A few weeks ago, after much anticipation, VMware released vSphere 6.7. Like 6.5 before it, this is a lot more than a point release and represents a major upgrade from vSphere 6.5. There is so much packed into this new release that there is an official page with separate blog posts talking about the features and enhancements. As usual, I will go through some of the key features and enhancements that are included in the latest versions of vCenter and ESXi as they relate back to the Service Providers that use vSphere as the foundation of their Infrastructure as a Service offerings.

There is a lot to get through though, and like the vSphere 6.5 release the what’s new will not fit into one post, so I’ll split the highlights between a couple of posts and cover ESXi specifically in a follow-up. I still feel it’s important to highlight the base hypervisor as well as the management platform. I’ll also talk about current interoperability with vCloud Director and NSX as well as Veeam supportability for vSphere 6.7.

The major features and enhancements as listed in the What’s New PDF are:

  • Scalability Enhancements
  • VMware vCenter Server Appliance Linked Mode
  • VMware vCenter Server Appliance Back Up Scheduler
  • Single Reboot
  • Quick Boot
  • Support for 4K Native Storage
  • Improved HTML 5 based vSphere Client
  • Security-at-Scale
  • Support for Trusted Platform Module (TPM) 2.0 and virtual TPM
  • Cross-vCenter Encrypted vMotion
  • Support for Microsoft’s Virtualization Based Security (VBS)
  • NVIDIA GRID vGPU Enhancements
  • vSphere Persistent Memory
  • Hybrid Linked Mode
  • Per-VM Enhanced vMotion Compatibility (EVC)
  • Cross-vCenter Mixed Version Provisioning – Simplify provisioning across hybrid cloud environments that have different vCenter versions

Below I’ve fleshed out a few of these in the context of Service Providers.

Enhanced vCenter Server Appliance:

The VCSA has been enhanced significantly in this release. Having used the VCSA exclusively for the past year in all my environments, I have a love hate relationship with it. I still feel it’s nowhere near as stable as vCenter running on top of Windows and is prone to more issues than a Windows-based vCenter…however 6.7 will be the last release supporting or offering a Windows-based vCenter, so VMware have had to work hard on making the VCSA more resilient.

Compared to the 6.5 VCSA, 6.7 offers twice the performance in vCenter operations per second, a three times reduction in memory usage and three times faster DRS operations, meaning that power on and other VM operations are performed more quickly. This is great on a service provider platform with potentially lots of those operations happening during the course of a day. Hopefully this improves the overall responsiveness of the VCSA, which I have felt at times to be poor under load or after an extended period of appliance uptime.

There have also been a number of updates to the APIs offered in vSphere, the VCSA and ESXi. William Lam has a great post on what’s new for APIs here, but all Service Providers should have teams looking at the API Explorer as it’s a great way to explore and learn what’s available.
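
As a trivial example of the kind of thing the API Explorer exposes, here is a quick sketch of hitting the vSphere Automation REST API from PowerShell…the VCSA name is a placeholder and certificate validation may need relaxing if the appliance is still running self-signed certs.

```powershell
$vcsa = "vcsa.pslab.local"
$cred = Get-Credential
$auth = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes(
        "{0}:{1}" -f $cred.UserName, $cred.GetNetworkCredential().Password))

# Create an API session...the token comes back in the JSON body
$session = Invoke-RestMethod -Method Post -Uri "https://$vcsa/rest/com/vmware/cis/session" `
           -Headers @{ Authorization = "Basic $auth" }

# Use the session token to list VMs and their power state
(Invoke-RestMethod -Uri "https://$vcsa/rest/vcenter/vm" `
    -Headers @{ "vmware-api-session-id" = $session.value }).value |
    Select-Object name, power_state
```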

Single Reboot and Quick Boot:

For Service Providers who need to upgrade their platforms to maintain optimal compatibility, upgrading hosts can be time consuming at scale. vSphere 6.7 reduces ESXi host upgrade times by eliminating one of the two reboots normally required for major version upgrades. This is the single reboot feature. There is also vSphere Quick Boot, which restarts the ESXi hypervisor without rebooting the physical host, skipping time-consuming server hardware initialization and post-boot operation wait times. Both of these significantly reduce maintenance times.

This blog post covers both features in more detail.

Improved HTML 5 based vSphere Client:

While minor in terms of actual under the hood improvements, the efficiencies gained from a decent user interface are significant. When managing Service Provider platforms at scale, having a reliable client is important, and with the decommissioning of the VI client and the often frustrating performance of the Flex client, a near complete and workable HTML5 vSphere Client is a big plus for those who work day to day in vCenter.

The vSphere 6.7 vSphere Client has support for vSAN as well as having Update Manager fully built in. As per the last NSX 6.4 update, there is also limited management of NSX. There is also a new vROps plugin…this plugin is available out of the box once vROps has been linked with vCenter, and it offers dashboards directly in the vSphere Client covering overview, cluster view and alerts for both vCenter and vSAN. This is extremely handy for Service Providers who use vROps dashboards, as they no longer need to go to two different locations to get the info.

vCD and NSX Supportability:

Shifting from new features and enhancements to an important subject when talking about service provider platforms…VMware product compatibility. VCPP Service Providers running a Hybrid Cloud should be running a combination of vCloud Director SP and/or NSX-v, and at the moment neither is supported on vSphere 6.7.

Looking at vCloud Director, it appears 9.1 is supported, however given you need to be running NSX-v with vCD these days and NSX is not yet supported, it doesn’t make much sense to suggest there is total compatibility.

I suspect we will see NSX-v come out with a supported build shortly…though I’m only expecting vCloud Director SP to support 6.7 from version 9.1, which will mean upgrades.

Veeam Backup & Replication Supportability: 

Veeam commits to supporting major version releases within 90 days or sooner of GA. With that, Service Providers that are also VCSPs using Veeam to back up their infrastructure should not upgrade to vSphere 6.7 until Backup & Replication Update 3a is released. For those that are bleeding edge and have already updated, your only option until then is our Agents for Windows and Linux.

Wrapping up Part 1:

Rounding off this post, there is a fair bit to be aware of in the Known Issues section for 6.7, and it’s worth reading through all the known issues just in case any specific issues might impact you. In upcoming posts in this vSphere 6.7 for Service Providers series I will cover more vCenter features as well as ESXi enhancements and what’s new in Core Storage.

Happy upgrading!

References:

https://docs.vmware.com/en/VMware-vSphere/6.7/rn/vsphere-esxi-vcenter-server-67-release-notes.html

Introducing Faster Lifecycle Management Operations in VMware vSphere 6.7

Released: vSAN 6.7 – HTML5 Goodness, Enhanced Health Checks and More!

VMware has announced the general availability of vSAN 6.7. As vSAN continues to grow, VMware is very buoyant about how it’s performing in the market. With some 10,000 customers at a run rate of over 600 million, they claim to lead the HyperConverged market with a 32% market share. From my point of view it’s great to see vSAN being deployed across 250 cloud providers and have it as the cornerstone storage of the VMware Cloud on AWS solution. vSAN 6.7 focuses on an intuitive operational experience, a consistent application experience and a holistic support experience.

New Features and Enhancements:

  • HTML5 User Interface
  • Embedded vROPs plugin for HTML5 User Interface
  • Support for Windows Failover Cluster using iSCSI
  • Adaptive Resync Performance Improvements
  • Destaging Performance Improvements
  • More Efficient data placement during Host Decommissioning
  • Improved Space Efficiency
  • Faster Failover with Redundant vSAN Networks
  • Optimized Witness Traffic Separation
  • Stretched Cluster Improvements
  • Host Affinity for Next-Gen Applications
  • Health Check Enhancements
  • Enhanced Diagnostics
  • vSAN Support Insight
  • 4Kn Device Support
  • Improved FIPS 140-2 Validation Security

There are a lot of enhancements in this release and while not as ground breaking as the 6.6 release last year, there is still a lot to like about how VMware is improving the platform. From the list above, I’ve taken the key ones from my point of view and expanded on them a little.

HTML5 User Interface:

As has been the trend with all VMware products of late, vSAN is getting the Clarity Framework overhaul and is being included in the HTML5 vSphere Client, with new vSAN tasks and workflows developed from the ground up to simplify the experience. There is also new vSAN functionality that can only be accessed via the HTML5 client.

The legacy Flex client will still be available for use, and it’s also worth noting that this is not a direct port of the Flex interface but was rebuilt from the ground up. This has resulted in a more efficient experience for the user, with fewer clicks and less time to action items. Any new features or enhancements will only be seen in the new HTML5 UI.

Support for Windows Failover Cluster using iSCSI:

A few weeks back I posted about how you could use vSAN as a Veeam repository using the iSCSI feature. With vSAN 6.7 there is official support for Windows Failover Clustering using the vSAN iSCSI service. Lots of people still run MSCS and a lot still use traditional clustering. This supports physical and virtual guest iSCSI initiators and includes transparent failover of clusters with vSAN iSCSI volumes.

I’m not sure if this now means that iSCSI volumes are supported as Veeam Cloud Repositories…but I will confirm either way.

Adaptive Resync Performance Improvements:

vSAN 6.7 introduces a new Adaptive Resync feature that will make sure resources are available for VM IO and resync IO. This ensures that under IO stress certain traffic types are not starved of resources and allows more bandwidth to be used when there are periods of less contention. Under contention, resync IO will be guaranteed at least 20% of the bandwidth, and if no resync traffic exists, VM IO may consume 100%. This effectively regulates reads and writes to ensure an optimal balance for VM and resync IO.

Destaging Performance Improvements:

vSAN 6.7 looks to be more consistent when talking about data optimizations in the data path. With the faster destaging, data drains more quickly from the write buffer to the capacity tier. This allows the buffer tier to be available for newer IO quicker. This is done via improved in-memory handling of IO during destaging that delivers higher throughput and more consistency which in turn improves the overall performance of VM and resync IO.

More Efficient data placement during Host Decommissioning:

When putting a host in maintenance mode or decommissioning a host you need to select the evacuation type for the objects on that host. This can take time depending on the amount of data. vSAN 6.7 builds on improvements introduced in 6.6 that consolidate replicas living across multiple hosts while maintaining FTT compliance. It looks for the smallest component to move, which results in less data being rebuilt and less temporary space usage. vSAN will provide more intelligence behind the data movement to reduce the time and effort it takes to put a host into maintenance mode.

Improved Space Efficiency:

In previous vSAN versions the VM swap object was always thick provisioned even if the VM itself was thin. In vSAN 6.7 this will now be thin by default and will also inherit the policy from the VM, so that the FTT of the swap object is consistent with the VM, which results in more efficient storage. Previous to this, large environments would suffer with a large number of swap files taking up a disproportionate amount of space.

 

Conclusion:

vSAN continues to be improved by VMware, and they have addressed some core usability and efficiency features in this 6.7 release. The move to the HTML5 client was expected, but is still good to see, while the enhancements in resync and destaging all contribute to platform stability. The enhanced health checks add a new dimension to vSAN troubleshooting, and vSAN Support Insight allows users to get a better view of what’s happening on their instances.

References:

Pre release information and images sourced via VMware EABP

https://blogs.vmware.com/virtualblocks/2018/04/17/whats-new-vmware-vsan-6-7/

 

 

Released: NSX-v 6.3.6

Last week VMware released NSX-v 6.3.6 (Build 8085122) that doesn’t contain any new features but addresses a number of bug fixes from previous releases. This has been done independently of any updated release of NSX-v 6.4.0 that went GA in January.

This is good to see, though interesting to also see that people are still not upgrading to 6.4.0 in droves, meaning VMware needs to support both versions. Going through the release notes there are a lot of known issues worth reviewing, and more than a few apply to service providers.

Some key fixes are listed below:

Important Fixes :

  • Network outage of ~40-50 seconds seen on Edge Upgrade – During Edge upgrade, there is an outage of approximately 40-50 seconds
  • After upgrading to 6.3.5, the routing loop between DLR and ESG’s causes connectivity issues in certain BGP configurations –  A routing loop is causing a connectivity issue
  • NSX Manager CPU high due to edge in read-only file system mode – NSX Manager is slow to respond because it keeps 100% CPU and receives a lot of read-only file system events from edge.
  • After upgrade from vCNS edge 5.5.4 to NSX 6.3.6, customers could not configure Health-Check-Monitor port nor make any changes directly from vCD – Customers will not be able to configure Health-Check-Monitor port nor make any changes directly from vCD.
  • Distributed Firewall stays in Publishing state with certain firewall configurations – Distributed Firewall stays in “Publishing” state if you have a security group that contains an IPSet with 0.0.0.0/0 as an EXCLUDE member, an INCLUDE member, or as a part of ‘dynamic membership containing Intersection (AND)’

Those with the correct entitlements can download NSX-v 6.3.6 here.

References:

https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/rn/releasenotes_nsx_vsphere_636.html

Setting up vSAN iSCSI and using it as a Veeam Repository

Probably one of the least talked about features of vSAN is its ability to serve out iSCSI volumes. The feature was released with vSAN 6.5, is primarily focused on physical workloads and is easily configurable via the vSphere Web Client. iSCSI targets on vSAN are managed the same as any other vSAN objects using Storage Policy Based Management (SPBM). Deduplication, compression, mirroring and erasure coding can be utilized with the iSCSI target service, as can CHAP and Mutual CHAP authentication.

Of late, I’ve been asked by service providers about using object storage platforms as Veeam Backup & Replication repositories. There are a lot of options out there, but someone asked specifically about using vSAN. In theory you could just use a VMDK on a vSAN datastore, but I thought it would be interesting to look at using iSCSI to mount a volume and use it as a repository.

Initial iSCSI Configuration for vSAN:

The first thing we need to do is enable the iSCSI Target service from the vSphere Web Client. Under the Cluster Configuration tab, in the iSCSI Target menu, you need to enable the iSCSI service. Select the default iSCSI network kernel interface and then modify the iSCSI port and add security if desired. Take note of the info message around using the Storage Policy for the home object.

From there we set up a new iSCSI Target. Here you will be given the IQN, and we give the target an alias. This window also lets us create the first LUN on the iSCSI Target. The LUN ID can be specified along with the alias and finally the size. Just like creating a new VMDK on a vSAN datastore, we are shown the storage consumption of the object depending on the Storage Policy chosen.

Once completed, under the iSCSI Target pane we see the details of the Target and LUN just created. Take note of the I/O Owner Host, as that is what we will be using later on as the iSCSI target from the Veeam repository server.

Configuring Host access and setting iSCSI Access Permissions:

On the creation of a LUN there is a default policy that allows all initiator sources to connect to it. To create specific permissions for host access, and to also create access groups, you need to first enable the iSCSI initiator on the hosts. For that, I’ve got a Windows VM (note only physical servers are officially supported) with Veeam Backup & Replication installed on it. To connect to the iSCSI network we have to add an additional vNIC that’s hooked into a PortGroup configured with the vSAN iSCSI VLAN.

Below we can see the VMKernel configuration and IP address of the I/O Owner hosts.

I’ve created a new PortGroup for the new vNIC to be attached to and added it to the VM.

From there we need to start the Microsoft iSCSI Initiator service, which will give us the initiator name we need to configure host access in the vSphere Web Client. Note that we should also install and enable MPIO for iSCSI if it’s not already installed as a Windows feature.

Under the iSCSI Initiator Groups menu in the Cluster Configuration tab you can add the initiator to a new group. This can contain one or many hosts as you would expect in any iSCSI initiator group configuration.

Once that’s been done we have to allow that new group access to the target where the LUN is contained. Under the iSCSI Target menu and under Target Details in the lower pane click on the + icon and add the group as an allowed initiator.

From here we can go back to the Windows VM and connect to the iSCSI Target. We are using the IP address of the host that was highlighted above in the initial configuration.
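
If you prefer to script the Windows side, the built-in iSCSI cmdlets will do the same job as the GUI. A rough sketch, with the portal address being the I/O owner host IP from my lab:

```powershell
# Make sure the initiator service is running and MPIO is available
Set-Service msiscsi -StartupType Automatic
Start-Service msiscsi
Install-WindowsFeature Multipath-IO

# Point the initiator at the vSAN iSCSI I/O owner host and connect persistently
New-IscsiTargetPortal -TargetPortalAddress 172.17.50.11
Get-IscsiTarget | Connect-IscsiTarget -IsPersistent $true
```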

Once done we should have a connected disk that’s visible in the Devices configuration of the iSCSI Initiator.

Configuring new iSCSI Volume as Veeam Repository:

From here the process to set up a Veeam repository based on the vSAN iSCSI LUN is straightforward. First we need to bring the volume online and create a partition. As you can see below, the disk is of Bus Type iSCSI and the Name is VMware Virtual SAN.

As for the partition configuration, I’ve set it up as shown, with ReFS being used as the file system.
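
The same disk preparation can be done in a few lines of PowerShell if preferred…a small sketch, with the volume label being my own:

```powershell
# Find the raw iSCSI disk presented by vSAN, bring it online and format it with ReFS
$disk = Get-Disk | Where-Object { $_.BusType -eq "iSCSI" -and $_.PartitionStyle -eq "RAW" }
$disk | Set-Disk -IsOffline $false
$disk | Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem ReFS -NewFileSystemLabel "vSAN-iSCSI-Repo"
```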

From here we can head into the Backup & Replication console and create a new Repository with the new volume selected.

Performance and Limitations:

Once configured, I was interested in seeing how a vSAN iSCSI connected volume performed against a native vSAN disk. The results below show that there is a significant performance hit when going via iSCSI. This seems logical, as in addition to the iSCSI overheads, a native VMDK on vSAN is hooked into the ESXi kernel directly and should get line speed rates when it comes to data transfer.

The configuration maximums for the vSAN iSCSI service are listed below:

  • Maximum 1024 LUNs per vSAN cluster
  • Maximum 128 targets per vSAN cluster
  • Maximum 256 LUNs per target
  • Maximum LUN size of 62TB
  • Maximum 128 iSCSI sessions per host
  • Maximum 4096 iSCSI IO queue depth per host
  • Maximum 128 outstanding writes per LUN
  • Maximum 256 outstanding IOs per LUN
  • Maximum 64 client initiators per LUN

So the max size of an iSCSI LUN matches the max size of a VMDK. Therefore, when considering iSCSI as a possible option for Veeam backups, Scale-out Backup Repositories should be used to enable adding extents once that limit is reached, as sketched below.
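
As a rough illustration of that, using the Veeam PowerShell snap-in (repository and SOBR names are placeholders of mine), rolling a couple of iSCSI-backed repositories into a Scale-out Backup Repository looks something like this:

```powershell
Add-PSSnapin VeeamPSSnapin

# Two simple repositories, each backed by its own vSAN iSCSI LUN, combined into one SOBR
$extents = Get-VBRBackupRepository -Name "vSAN-iSCSI-01", "vSAN-iSCSI-02"
Add-VBRScaleOutBackupRepository -Name "vSAN-iSCSI-SOBR" -Extent $extents -PolicyType DataLocality
```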

There are also limitations on official support for virtual machines and other platforms:

  • Currently not supported for implementation for Microsoft clusters.
  • Currently not supported for use as a target for other vSphere hosts.
  • Currently not supported for use with third party hypervisors.
  • Currently not supported for use with virtual machines

So if this becomes a consideration, physical servers will need to be used in order to gain support.

Conclusion:

So after all is said and done, we have a Veeam repository that is now sitting on vSAN via iSCSI. The question remains whether this is a good application of vSAN or whether it’s worth looking at as an option, however the option is now there. Again, you may be able to look at the native VMDK option, but I like the flexibility of iSCSI for physical repositories at the moment.

Probably the biggest consideration for using vSAN iSCSI as a Veeam repository is the design of the vSAN Cluster. vSAN has not traditionally been considered for storage only purposes, however you could put together some low compute nodes with large disk groups that would present decent storage for repository purposes.

In using vSAN you have the benefit of knowing your data is redundant across multiple nodes as per the vSAN Storage Policies. This is the benefit of using object storage like vSAN as a Veeam Repository.

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.virtualsan.doc/GUID-13ADF2FC-9664-448B-A9F3-31059E8FC80E.html 

https://kb.vmware.com/kb/2148216

 

« Older Entries