I’ve been grappling with vSphere with Tanzu for the past week or so, and while I haven’t completely nailed how things operate end to end in this new world, I’ve been able to get to a point beyond the common “kubectl get nodes”, which is where plenty of people kicking the tyres get to… and then stop. There also isn’t a lot of straightforward content out there on how to quickly fix issues that come up after the deployment of a Tanzu Kubernetes Grid instance into a fresh namespace. There are Kubernetes security and Kubernetes storage volumes to deal with, and for those not familiar with or new to Kubernetes… it can drive you mad!

The Problem:

When looking to install stateful applications with stateful Kubernetes volumes using Helm Charts, the deployments were stuck, with the Pods sitting in a Pending status. Basically, the dynamic Persistent Volume Claims leveraging the Tanzu Storage Class were not able to do their thing. Persistent storage in Kubernetes is one of the things that I have struggled with… it only really clicked more recently. The concept of storage classes, persistent disks and claims can seem a little complex when coming from traditional physical or virtual machines… but without persistence, Kubernetes and containers wouldn’t be gaining traction.

About Kubernetes Statefulness, Persistent Volume Claims and Disks and Dynamic Volumes
Stateful applications save data between sessions and require persistent storage to store the data. The retained data is called the application’s state. You can later retrieve the data and use it in the next session. Kubernetes offers persistent volumes as objects capable of retaining their state and data. In the vSphere environment, the persistent volume objects are backed by virtual disks that reside on datastores. Datastores are represented by storage policies. After the vSphere administrator creates a storage policy, for example gold, and assigns it to a namespace in a Supervisor Cluster, the storage policy appears as a matching Kubernetes storage class in the Supervisor Namespace and any available Tanzu Kubernetes clusters. As a TKG Cluster user, you can use the storage class in your persistent volume claim specifications. You can then deploy an application that uses storage from the persistent volume claim.
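To make the claim side of this concrete, here is a minimal sketch of a persistent volume claim that names a storage class explicitly. It assumes a storage class called gold (the example policy name used above); the claim name and size are made up for illustration. The sketch builds the manifest in Python and prints it as JSON, which kubectl accepts in the same way as YAML.

```python
import json


def build_pvc(name: str, storage_class: str, size: str) -> dict:
    """Return a PersistentVolumeClaim manifest that asks for dynamic provisioning."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # Naming the class explicitly means the claim does not rely on
            # a default storage class being set in the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Assumptions: "gold" is the example policy from the text; the claim name
# and the 1Gi size are hypothetical.
pvc = build_pvc("demo-claim", "gold", "1Gi")
print(json.dumps(pvc, indent=2))
```

Saving the output to a file and running kubectl apply -f against it is one way to try it; the storage class name has to match whatever kubectl get storageclass reports in your cluster.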

Tanzu automatically configures a Storage Class backed by the Storage Policies we created and referenced in the Workload Management setup. This is leveraging the vSphere CSI Provisioner and, as mentioned above, Dynamic Provisioning… so this should have worked?

The Error:

When looking to deploy solutions or individual Pods, either directly from the kubectl command line or using Helm Charts, into the default or a custom namespace, objects were getting created, but I was seeing Pods in a Pending state.

Looking into the events of the namespace, I was seeing entries flagging a problem with the Persistent Volume Claims.

So basically the Pod has unbound immediate Persistent Volume Claims because no persistent volumes are available for the claim, due to some issue with detecting the available storage class. Working with a couple of people and through a lot of trial and error, I came up with a couple of ways to fix this.
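One way to confirm this diagnosis is to list the claims that are not yet Bound and see which storage class, if any, each one asked for. The sketch below does that over the JSON shape returned by kubectl get pvc -o json. The sample data is an illustrative stand-in with hypothetical names, not output captured from my environment.

```python
import json

# Illustrative stand-in for the output of `kubectl get pvc -o json`.
# The claim names and the "gold" class are hypothetical.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "data-demo-0"},
     "spec": {"accessModes": ["ReadWriteOnce"]},
     "status": {"phase": "Pending"}},
    {"metadata": {"name": "data-other-0"},
     "spec": {"storageClassName": "gold"},
     "status": {"phase": "Bound"}}
  ]
}
""")


def unbound_claims(pvc_list: dict) -> list:
    """Return (claim name, requested storage class or None) for every claim that is not Bound."""
    report = []
    for item in pvc_list.get("items", []):
        if item.get("status", {}).get("phase") != "Bound":
            requested = item.get("spec", {}).get("storageClassName")
            report.append((item["metadata"]["name"], requested))
    return report


for name, requested in unbound_claims(SAMPLE):
    print(f"{name}: not bound, requested storage class = {requested}")
```

A claim that shows None here did not name a class, so it depends entirely on a default storage class existing in the cluster.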

The Fix: 

For the Helm Chart install, the quickest and surest way to ensure that the deployment is successful is to specify the Storage Class as part of the Helm install command.
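As a rough sketch of what that looks like, the snippet below assembles a helm install command with the storage class passed through --set. The release name, chart name and the persistence.storageClass key are all assumptions for illustration; the key differs from chart to chart, so check helm show values for the chart you are installing.

```python
import shlex


def helm_install_command(release: str, chart: str, storage_class: str,
                         value_key: str = "persistence.storageClass") -> list:
    """Build a `helm install` argument list that pins the storage class.

    value_key is chart-specific; the default used here is only a common
    convention and is an assumption, not something every chart honours.
    """
    return ["helm", "install", release, chart,
            "--set", f"{value_key}={storage_class}"]


# Hypothetical release and chart names; "gold" is the example class from earlier.
cmd = helm_install_command("my-release", "example-repo/example-chart", "gold")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell as is, once the placeholder names are swapped for real ones.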

By using the specific --set flag, we are forcing Helm to override the default configuration, which would be to pick up and use the default Storage Class. If this was set up correctly as it expects, there would be no issues. What I noticed in my initial troubleshooting was that the configured Storage Class was not marked as default.
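Kubernetes records the default flag as the storageclass.kubernetes.io/is-default-class annotation on the Storage Class object. A small sketch for checking it over the JSON returned by kubectl get storageclass -o json follows; the sample data is an illustrative stand-in, and the class name gold is an assumption.

```python
DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"

# Illustrative stand-in for `kubectl get storageclass -o json`:
# a single class named "gold" (hypothetical) with no default annotation.
SAMPLE = {
    "items": [
        {"metadata": {"name": "gold", "annotations": {}}},
    ]
}


def default_classes(sc_list: dict) -> list:
    """Return the names of all storage classes annotated as default."""
    names = []
    for item in sc_list.get("items", []):
        annotations = item.get("metadata", {}).get("annotations") or {}
        if annotations.get(DEFAULT_ANNOTATION) == "true":
            names.append(item["metadata"]["name"])
    return names


found = default_classes(SAMPLE)
print(found if found else "no default storage class set")
```

An empty result matches the situation described above: the class exists, but nothing is marked as default, so claims that do not name a class have nothing to fall back on.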

Using a kubectl patch operation, we can set the Storage Class to be default.
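For reference, a minimal sketch of that patch is below. It builds the annotation payload and the matching kubectl patch storageclass command; the class name gold is an assumption, so substitute the name reported in your own cluster.

```python
import json
import shlex

DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def patch_default_command(storage_class: str, default: bool = True) -> list:
    """Build a `kubectl patch` argument list that sets or clears the default annotation."""
    patch = {
        "metadata": {
            "annotations": {DEFAULT_ANNOTATION: "true" if default else "false"}
        }
    }
    return ["kubectl", "patch", "storageclass", storage_class,
            "-p", json.dumps(patch)]


# "gold" is a hypothetical class name used for illustration.
cmd = patch_default_command("gold")
print(shlex.join(cmd))
```

Running kubectl get storageclass afterwards should show the class flagged as (default) next to its name.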

After that is done, Helm deployments can be installed with or without the specific flag mentioned above… however, what I found was that a short time after patching the Storage Class to default, the IsDefaultClass flag would set itself back to No. I’m still investigating this, so if anyone has a fix for this, please let me know directly or through the comments below.

References:

https://docs.vmware.com/en/VMware-vSphere/7.0/vmware-vsphere-with-tanzu/GUID-D875DED3-41A1-484F-A1CD-13810D674420.html#GUID-D875DED3-41A1-484F-A1CD-13810D674420

https://cloud.google.com/kubernetes-engine/docs/how-to/pod-security-policies

https://core.vmware.com/resource/vsphere-tanzu-quick-start-guide
