For as long as I can remember I have been benchmarking storage. Knowing how storage would perform under certain IO profiles was a key part of the burn-in process for new storage platforms. For the most part, you got what you paid for, but making sure reality matched expectation was always critical. Benchmarking storage is not a new concept, and as hosted and cloud platforms became more widely used for IaaS and application deployments, the need to know how the underlying storage would perform remained. I know from experience that once workloads are shifted from on-premises SANs to shared platforms, performance will vary, and this impacts applications, which in turn impacts the business.
The uptake of modern platforms for deploying applications, such as Kubernetes services (both cloud-based and on-premises), has meant that benchmarking has been further abstracted away from the end user. In fact, there is no easy way to test storage capabilities in a cloud-native, container-based Kubernetes world. You can’t just run CrystalDiskMark (on Linux, fio has traditionally done the job, and this script was created to mimic a default CrystalDiskMark test) to get a set of storage benchmarks with a single kubectl command.
Stateful Workloads and the Container Storage Interface
We are at this point with Kubernetes and containers because there has been a rise in stateful workloads and in support for persistent storage for applications deployed in Kubernetes. Traditional workloads such as SQL Server, Oracle and SAP are being deployed alongside microservice data stores such as MongoDB, Cassandra, Redis, MySQL and PostgreSQL, often on the same storage system. Each of these stateful applications can have different performance requirements depending on what it serves, and this is where it becomes necessary to benchmark the storage backing the Persistent Volume to ensure the applications will perform as desired.
As adoption of Kubernetes grows, so do the persistent storage offerings available to users. The introduction of CSI (Container Storage Interface) has enabled storage providers to develop drivers with ease. In fact, there are around 100 different CSI drivers available today. Along with the existing in-tree providers, these options can make choosing the right storage difficult. CSI was developed as a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes. With the adoption of the Container Storage Interface, the Kubernetes volume layer becomes truly extensible.
CSI is the standard for creating custom components that work with data storage. This has enabled many more storage vendors to adapt their platforms and offerings to the cloud-native approach. In fact, the ecosystem has flourished of late.
How do we ensure that the right datastore is used to achieve the performance required for our microservices running these stateful workloads?
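
A first step toward answering that question is simply knowing which StorageClasses a cluster offers and what backs them. The sketch below is a minimal illustration, not part of any tool mentioned here: it parses the JSON that `kubectl get storageclass -o json` produces and labels each class as in-tree (provisioner name starting with `kubernetes.io/`) or external, which in practice usually means a CSI driver. The function name, the `SAMPLE` data and the two class names are hypothetical.

```python
import json

# In-tree provisioners are named with the "kubernetes.io/" prefix.
IN_TREE_PREFIX = "kubernetes.io/"
# Annotation Kubernetes uses to mark the default StorageClass.
DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def summarize_storage_classes(raw_json):
    """Summarize the output of `kubectl get storageclass -o json`."""
    doc = json.loads(raw_json)
    summary = []
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        annotations = meta.get("annotations") or {}
        provisioner = item.get("provisioner", "")
        in_tree = provisioner.startswith(IN_TREE_PREFIX)
        summary.append({
            "name": meta.get("name", ""),
            "provisioner": provisioner,
            # Anything not in-tree is external; usually, but not always, CSI.
            "kind": "in-tree" if in_tree else "external (likely CSI)",
            "default": annotations.get(DEFAULT_ANNOTATION) == "true",
        })
    return summary


# Hypothetical sample input, shaped like real kubectl output.
SAMPLE = json.dumps({
    "items": [
        {
            "metadata": {
                "name": "standard",
                "annotations": {DEFAULT_ANNOTATION: "true"},
            },
            "provisioner": "kubernetes.io/aws-ebs",
        },
        {
            "metadata": {"name": "fast-csi"},
            "provisioner": "ebs.csi.aws.com",
        },
    ]
})

if __name__ == "__main__":
    for row in summarize_storage_classes(SAMPLE):
        print(row)
```

Against a live cluster, the same function could be fed the captured output of the kubectl command instead of `SAMPLE`. It only tells you what is present, not how fast it is.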
Introducing Kubestr from Kasten by Veeam
Kubestr is an Open-Source tool released today by Kasten by Veeam. It was created to make it easy to benchmark, test and validate storage across any Kubernetes environment configured with a CSI driver. The challenge is that not all persistent storage is equal and, as described above, choosing the right storage is critical. With Kubestr you can test the speed and profile of a StorageClass and make sure that it is ready for production. It does this by dynamically deploying and managing its own pod-based microservice deployment. Within the deployment is a set of Open-Source tools such as fio (flexible I/O tester), BusyBox and others, all of which are executable from the command line and plug into the Kubernetes cluster the tool is run against.
Kubestr can assist in three ways:
- Identify the various storage options present in a cluster.
- Validate if the storage options are configured correctly.
- Evaluate the storage using common benchmarking tools like fio.
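
To make the idea of an IO profile concrete, here is a minimal sketch that generates a fio job file with four common profiles: sequential read and write at 1 MiB blocks, and random read and write at 4 KiB blocks. The block sizes, queue depths, file size and runtime are illustrative assumptions, not the defaults used by Kubestr or CrystalDiskMark, and `build_fio_job` is a hypothetical helper name.

```python
import configparser
import io

# Illustrative profiles only; tune block size and queue depth per workload.
PROFILES = {
    "seq-read-1m": {"rw": "read", "bs": "1m", "iodepth": "8"},
    "seq-write-1m": {"rw": "write", "bs": "1m", "iodepth": "8"},
    "rand-read-4k": {"rw": "randread", "bs": "4k", "iodepth": "32"},
    "rand-write-4k": {"rw": "randwrite", "bs": "4k", "iodepth": "32"},
}


def build_fio_job(size="1g", runtime_s=30):
    """Return the text of a fio job file covering every entry in PROFILES."""
    # allow_no_value lets flag-style options (value None) be written bare.
    cfg = configparser.ConfigParser(allow_no_value=True)
    cfg["global"] = {
        "ioengine": "libaio",     # Linux asynchronous IO engine
        "direct": "1",            # bypass the page cache
        "size": size,             # size of the test file per job
        "runtime": str(runtime_s),
        "time_based": None,       # run for the full runtime
        "group_reporting": None,  # aggregate per-job statistics
    }
    for name, opts in PROFILES.items():
        section = dict(opts)
        # stonewall makes each job wait for the previous one to finish,
        # so the profiles do not compete for the same volume.
        section["stonewall"] = None
        cfg[name] = section
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_fio_job())
```

The resulting text can be saved and run with plain fio on any Linux host. Kubestr's fio subcommand is also described as accepting a custom job file, though the exact flag is best confirmed with `kubestr fio --help` rather than taken from this sketch.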
Kubestr also has the ability to identify the various storage options present in a cluster, understand their performance and discover wasted resources. From there it will validate whether storage is configured correctly, make sure the storage dependencies are as expected, and run tests to see whether the storage is capable of CSI snapshots, which Kasten K10 leverages. Kubestr evaluates and understands the performance of storage across multiple platforms while being able to simulate IO profiles. As can be seen below, there are a number of options in the tool, and a sample fio test is also shown below.
Next Steps and Download
Kasten has created a tool that is super simple to download and execute, so you can be testing Kubernetes storage within minutes. More information can be found on the official page or on GitHub, both listed below.
Again, it is worth mentioning that this is an Open-Source project from Kasten by Veeam, so for those able to contribute to the project, contributions are welcomed and encouraged to help add more value to the tool.