Tag Archives: Storage

Quick Look – vSphere 6.5 Storage Space Reclamation

One of the cool newly enabled features of vSphere 6.5 is the return of VMFS storage space reclamation. On VMFS5 datastores this was a manual process, triggered after you freed storage space inside a datastore by deleting or migrating a VM…or consolidating a snapshot. At the Guest OS level, storage space is freed when you delete files on a thinly provisioned VMDK and it then exists as dead or stranded space. ESXi 6.5 supports automatic space reclamation (SCSI UNMAP) that originates from a VMFS datastore or a Guest OS…the mechanism reclaims unused space from VM disks that are thin provisioned.

When storage space is deleted without this automated feature, the delete operation leaves blocks of unused space on the datastore. VMFS uses the SCSI UNMAP command to indicate to the array that the storage blocks contain deleted data, so that the array can deallocate those blocks.

On VMFS6 datastores, ESXi supports automatic asynchronous reclamation of free space. VMFS6 generally supports automatic space reclamation requests that generate from the guest operating systems, and passes these requests to the array. Many guest operating systems can send the unmap command and do not require any additional configuration. The guest operating systems that do not support automatic unmaps might require user intervention.
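
Most modern guest operating systems will send UNMAP on their own, but if you want to force a trim from inside a VM to watch the behaviour, something along these lines works (assuming the guest sits on a thin provisioned VMDK; commands vary by OS):

# Linux guest: trim all mounted filesystems that support discard
fstrim -av

# Windows guest (PowerShell): retrim a thin provisioned volume
Optimize-Volume -DriveLetter C -ReTrim -Verbose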

I was interested in seeing if this worked as advertised, so I went about formatting a new VMFS6 datastore with the default options via the Web Client as shown below:

Heading over to the host's command line, I checked the reclamation config using the new esxcli namespace:
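
The get command looks something like this (VMFS6-01 is just the example label of my datastore):

esxcli storage vmfs reclaim config get --volume-label=VMFS6-01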

Through the Web Client you can only set the Reclamation Priority to None or Low. Through esxcli you can also set that value to Medium or High, but as I’ve literally just found out, those esxcli-only settings don’t actually do anything in this release.
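
For completeness, the set command follows the same pattern, again using my example datastore label:

esxcli storage vmfs reclaim config set --volume-label=VMFS6-01 --reclaim-priority=low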

With the reclaim priority set to Low, the expectation is that any blocks that are no longer used will be reclaimed within 12 hours. I kept track of a couple of VMs and the datastore sizes in general and saw that after a day or so there was a difference in the available storage.

You can see that I clawed back about 22GB and 14GB respectively on the two datastores in the first 24 hours. So my initial testing shows that this is a valued and welcome addition to the new vSphere 6.5 release. Service Providers that thin provision but charge based on allocated storage will benefit greatly from this feature, as it automates a mechanism that was complex at best in previous releases.

There is also a great section around UNMAP in the vSphere 6.5 Core Storage White Paper that’s literally just been released as well and can be found here:

References:

http://pubs.vmware.com/vsphere-65/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-65-storage-guide.pdf

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2057513

vSphere 6.5 Core Storage White Paper Now Available

HomeLab – SuperMicro 5028D-TN4T Storage Driver Performance Issues and Fix

Ok, I’ll admit it…I’ve had serious lab withdrawals since having to give up the awesome Zettagrid Labs. Having a lab to tinker with goes hand in hand with being able to generate tech related content…case in point, my new homelab got delivered on Monday and I have been working to get things set up so that I can deploy my new NestedESXi lab environment.

By way of a quick intro (a longer first impression post will follow), I purchased a SuperMicro SYS-5028D-TN4T that I based off this TinkerTry Bundle, which has become a very popular system for vExpert homelabbers. It’s got an Intel Xeon D-1541 CPU and I loaded it up with 128GB of RAM. The system comes with an embedded Lynx Point AHCI controller that allows up to six SATA devices and is listed on the VMware Compatibility Guide for ESXi 6.5.

The issue that I came across was to do with storage performance and the native driver that comes bundled with ESXi 6.5. With the release of vSphere 6.5 yesterday, the timing was perfect to install ESXi 6.5 and start to build my management VMs. I first noticed some issues when uploading the Windows 2016 ISO to the datastore, with the ISO taking about 30 minutes to upload. From there I created a new VM and installed Windows…this took about two hours to complete, which I knew was not right…especially with the datastore being a decent class SSD.

I created a new VM and kicked off a new install, but this time I opened ESXTOP to see what was going on, and as you can see from the screenshots below, the kernel and disk write latencies were off the charts, topping 2000ms and 700-1000ms respectively…in throughput terms I was getting about 10-20MB/s when I should have been getting 400-500MB/s.

ESXTOP was showing the VM with even worse write latency.

I wondered if I had bought a lemon of a storage controller and checked the queue depth of the card. It’s listed with a QD of 31, which isn’t horrible for a homelab, so my attention turned to the driver. Again referencing the VMware Compatibility Guide, the device driver for the controller is listed as ahci version 3.0.22vmw.

I searched through the installed device driver modules and found that the one listed above was present, however there was also a native VMware device driver as well.
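
If you want to check this on your own host, the module and VIB lists show both drivers. Roughly what I ran (output trimmed):

esxcli system module list | grep -i ahci
esxcli software vib list | grep -i ahci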

I confirmed that the storage controller was using the native VMware driver and went about disabling it as per this VMware KB (thanks to @fbuechsel who pointed me in the right direction in the vExpert Slack homelab channel), as shown below.
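
The gist of the fix, assuming the native module on your host is vmw_ahci as it was on mine, is to disable the module and reboot:

esxcli system module set --enabled=false --module=vmw_ahci
reboot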

After the host rebooted I checked to see if the storage controller was using the device driver listed in the Compatibility Guide. As you can see below, not only was it using that driver, but it was now showing six HBA ports as opposed to just the one seen in the first snippet above.
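
A quick way to confirm which driver each storage adapter is bound to from the command line (the vmhba numbering will obviously differ per system):

esxcli storage core adapter list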

I once again created a new VM and installed Windows, and this time the install completed in a little under five minutes! Quite a difference! Upon running CrystalDiskMark I was now getting the expected speeds from the SSDs and things are moving along quite nicely.

Hopefully this post saves anyone else who might buy this, or other SuperMicro SuperServers, some time and stops them getting caught out by poor storage performance caused by the native VMware driver packaged with ESXi 6.5.


References:

http://www.supermicro.com/products/system/midtower/5028/SYS-5028D-TN4T.cfm

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2044993

Free Guide: Building NetApp ONTAP 9 Lab

In computing, there is one thing you shouldn’t compromise on…and that thing is storage. This carries over to lab or NestedESXi environments, as poor lab performance can be just as frustrating as production performance issues. I’ve used a number of nested storage platforms for my lab environments and I’m always on the lookout for alternative solutions.

When Neil Anderson asked me to write a short introductory post on his new how-to guide Build Your Own NetApp ONTAP 9 Lab, I decided to flick through the guide to check it out and see if it could add any value to my future plans for a homelab. The e-book is professionally laid out and has excellent diagrams, notes and step by step instructions…it’s extremely comprehensive.

NetApp Simulator 9 Free eBook – Build Your Own NetApp ONTAP 9 Lab!

While I’ve never been a NetApp guy, there does seem to be a level of complexity in the NetApp VSA setup, but with the step by step instructions in the e-book any ambiguity is removed. If you are looking for lab storage, this is a great end to end example of how to install and configure one based on the ONTAP 9 NetApp Simulator.

Give it a look over here.

VSAN Upgrading from 6.1 to 6.2 Hybrid to All Flash – Part 3

When VSAN 6.2 was released earlier this year it came with new and enhanced features, and with the price of SSDs continuing to fall and an expanding HCL it seems like All Flash instances are becoming more the norm. For those that have already deployed VSAN in a Hybrid configuration, the temptation to upgrade to All Flash is certainly there. Duncan Epping has previously blogged an overview of migrating from Hybrid to All Flash, so I wanted to expand on that post and go through the process in a little more detail. This is the final part of a three part blog series, with the process overview outlined below.

Use the links below to page jump.

In part one I covered upgrading existing hosts, expanding an existing VSAN cluster and upgrading the license and disk format. In part two I covered the actual Hybrid to All Flash migration steps, and in this last part I will finish off by going through the process of creating a new VSAN Policy, migrating existing VMs to the new policy and then enabling deduplication and compression.

Before continuing it’s worth pointing out that after the Hybrid to All Flash migration you are going to be left with an unbalanced VSAN cluster, as the full data evacuation off the last Hybrid host will leave that host without objects. Any new objects created will work to re-balance the cluster, however if you want to initiate a proactive re-balance you can hit the re-balance button from the Health status window. For more on this process check out this post from Cormac Hogan.

Create new Policy and Migrate VMs:

To take advantage of the erasure coding now available in the VSAN 6.2 All Flash cluster we will need to create a new storage policy and apply that policy to any existing VMs. In my case all VMs were on the Default VSAN Policy with FTT=1. The example below shows the creation of a new Storage Policy that uses RAID-5 erasure coding with FTT=1. If you remember from previous posts, the reason for expanding the cluster to four hosts was to cater for this specific policy.

To create the new Storage Policy head to VM Storage Policies from the Home page of the Web Client and click on Create New VM Storage Policy. Give the policy a name, click Next and construct Rule-Set 1, which is based on VSAN. Select the Failure tolerance method and choose RAID-5/6 (Erasure Coding) – Capacity.

In this case, with FTT=1 chosen, RAID-5 will be used. Clicking Next should show that the existing VSAN datastore is compatible with the policy. With that done we can migrate existing VMs off the Default VSAN Policy onto the newly created one.

To get a list of which VMs are going to be migrated, have a look at the PowerCLI commands below to get the VMs on the VSAN datastore and then their storage policy. The last command gets a list of existing policies.
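
A rough PowerCLI sketch of that lookup; the datastore name is just the default from my lab:

# VMs that live on the VSAN datastore
$vms = Get-VM -Datastore (Get-Datastore "vsanDatastore")

# Current storage policy assignment for each of those VMs
$vms | Get-SpbmEntityConfiguration | Select-Object Entity, StoragePolicy

# All storage policies defined in vCenter
Get-SpbmStoragePolicy | Select-Object Name, Description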

To apply the new Erasure Coding Storage Policy it’s handy to get the full name of the policy.

To migrate the VMs to the new policy you can either do it one by one via the Web Client or do it en masse via the following PowerCLI script.
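
A minimal version of that script looks something like the below; the policy name is the example one created earlier and the datastore name is from my lab, so adjust both to suit:

# New erasure coding policy created earlier (example name)
$policy = Get-SpbmStoragePolicy -Name "RAID5-FTT1-EC"

foreach ($vm in Get-VM -Datastore (Get-Datastore "vsanDatastore")) {
    # Re-point the VM home object at the new policy
    $vm | Get-SpbmEntityConfiguration | Set-SpbmEntityConfiguration -StoragePolicy $policy

    # And each of its virtual disks as well
    Get-HardDisk -VM $vm | Get-SpbmEntityConfiguration | Set-SpbmEntityConfiguration -StoragePolicy $policy
}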

Once run, the VMs will have the new policy applied and VSAN will work in the background to get those VM objects compliant. You can see the status of Virtual Disk Placement in the Virtual SAN tab under the Monitor tab of the cluster.

Enable DeDupe and Compression:

Before I go into the details…for a brilliant overview and explanation of deduplication and compression with VSAN 6.2, head to this post from Cormac Hogan. To enable this feature we need to double check that the licensing is correct, as detailed in the first post, and also ensure that all previous steps relating to the Hybrid to All Flash migration have taken place. To turn on the feature, head to the General window under the Virtual SAN Settings menu on the cluster Manage tab and click on the Edit button next to Virtual SAN is Turned ON.

Choose Enabled in the drop down and take note of the checkbox for Allow Reduced Redundancy, making sure you understand what it means by reading the info box as shown above. Once you click OK, the process to enable deduplication and compression will begin…this process will go through and reconfigure all disk groups, similar to the process used to upgrade from Hybrid to All Flash. Again this will take some time depending on the number of hosts, number of disk groups and type of disks in the cluster.

Below I have shown the before and after of the Capacity window under the Virtual SAN tab in the Monitor section of the cluster view. You can see that before enabling the feature, there is a message saying that deduplication and compression are disabled.

And after enabling deduplication and compression you start to get some statistics relating to savings and ratios in that window. Even in my small lab environment I started to see some benefits.

With that complete we have finished this series and have gone through all the steps in order to get to an All Flash VSAN Cluster with the newest features enabled.

References:

VSAN 6.2 Part 1 – Deduplication and Compression

VSAN 6.2 Part 2 – RAID-5 and RAID-6 configurations

 

VSAN Upgrading From 6.1 To 6.2 Hybrid To All Flash – Part 2

When VSAN 6.2 was released earlier this year it came with new and enhanced features, and with the price of SSDs continuing to fall and an expanding HCL it seems like All Flash instances are becoming more the norm. For those that have already deployed VSAN in a Hybrid configuration, the temptation to upgrade to All Flash is certainly there. Duncan Epping has previously blogged an overview of migrating from Hybrid to All Flash, so I wanted to expand on that post and go through the process in a little more detail. This is part two of what is now a three part blog series, with the process overview outlined below.

Use the links below to page jump.

In part one I covered upgrading existing hosts, expanding the existing VSAN cluster and upgrading the license and disk format. In this part I am going to go through the simple task of extending the cluster by adding new All Flash disk groups on the host added in part one, and then go through the actual Hybrid to All Flash migration steps.

The configuration of the VSAN Cluster after the upgrade will be:

  • Four Host Cluster
  • vCenter 6.0.0 Update 2
  • ESXi 6.0.0 Update 2
  • One Disk Group Per Host
  • 1x 480GB SSD Cache and 2x 1000GB SSD Capacity
  • VSAN Erasure Coding Raid 5 FTT=1
  • DeDuplication and Compression On

As mentioned in part one, I added a new host to the cluster to give me some breathing room while doing the Hybrid to All Flash upgrade, as we need to perform rolling maintenance on each host in the cluster in order to get to the All Flash configuration. Each host will be entered into maintenance mode and all data evacuated. Before the process is started on the initial three hosts, let’s go ahead and create a new All Flash disk group on the new host.

To create the new disk group head to Disk Management under the Virtual SAN section of the Manage tab whilst the cluster is selected, and click on the Create New Disk Group button. As you can see below, I have the option of selecting any of the flash devices claimed as being OK for VSAN.
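
For what it’s worth, the same thing can be done from PowerCLI if you prefer. Something like the below should work, with the host name and naa identifiers being placeholders for your own devices:

# Create an All Flash disk group: one cache SSD plus two capacity SSDs
New-VsanDiskGroup -VMHost "esxi-04.lab.local" -SsdCanonicalName "naa.<cache_ssd_id>" -DataDiskCanonicalName "naa.<capacity_ssd_1>","naa.<capacity_ssd_2>"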

After the disk selection is made and the disk group created, you can see below that there is now a mixed mode scenario happening where the All Flash host is participating in the VSAN Cluster and contributing to the capacity.

Upgrade Disk Group from Hybrid to All Flash:

Ok, now that there is some extra headroom, the process to migrate the existing Hybrid hosts over to All Flash can begin. Essentially the process involves placing each host in maintenance mode with a full data migration, deleting any existing Hybrid disk groups, removing the spinning disks, replacing them with flash and then finally creating new All Flash disk groups.

If you are not already aware of how maintenance mode works with VSAN, it’s worth reading over this VMware blog post to understand why using the VI Client is a big no no. In this case I wanted to do a full data migration, which moves all VSAN components onto the remaining hosts active in the cluster.
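
From the Web Client you select Full data migration when entering maintenance mode; the PowerCLI equivalent is roughly the following, with the host name being an example:

# Enter maintenance mode and evacuate all VSAN components to the remaining hosts
Set-VMHost -VMHost "esxi-01.lab.local" -State Maintenance -VsanDataMigrationMode Full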

You can track this process by looking at the Resyncing Components section of the Virtual SAN Monitor Tab to see which objects are being copied to other hosts.

As you can see the new host is actively participating in the Hybrid mixed mode cluster now and taking objects.

Once the evacuation has completed we can delete the existing disk groups on the host by highlighting the disk group and clicking on the Remove Disk Group button. A warning appears telling us that data will be deleted and also lets us know how much data is currently on the disks. The previous step has ensured that there should be no data left on the disk group, so it is safe to (still) select Full data migration and remove the disk group.

Do this for all existing Hybrid disk groups. Once all disk groups have been deleted from the host, you are ready to remove the existing spinning disks and replace them with flash disks. The only thing to check before attempting to claim the new SSDs is that they don’t have any previous partitions on them…if they do, you can use the ESXi Embedded Host Client to remove any existing partitions.

Warning: Again it’s worth mentioning that any full data migration is going to take a fair amount of time, depending on the consumed storage of your disk groups and the types of disks being used.

Repeat this process on all remaining hosts in the cluster with Hybrid disk groups until you have a full All Flash cluster as shown above. From here we are able to take advantage of erasure coding, deduplication and compression…I will finish that off in part three of this series.

 

VSAN Upgrading from 6.1 to 6.2 Hybrid to All Flash – Part 1

When VSAN 6.2 was released earlier this year it came with new and enhanced features, and depending on what version you were running you might not have been able to take advantage of them all right away. Across all versions Software Checksum was added, with Advanced and Enterprise versions getting VSAN’s implementation of Erasure Coding (RAID-5/6), Deduplication and Compression available for the All Flash version, and QoS IOPS Limiting available in Enterprise only.

With the price of SSDs continuing to fall and an expanding HCL it seems like All Flash instances are becoming more the norm. For those that have already deployed VSAN in a Hybrid configuration, the temptation to upgrade to All Flash is certainly there. Duncan Epping has previously blogged an overview of migrating from Hybrid to All Flash, so I wanted to expand on that post and go through the process in a little more detail. This is a two part blog post with a lot of screen shots to complement the process, which is outlined below.

Use the links below to page jump.

Warning: Before I begin it’s worth mentioning that this is not a short process, so make sure you plan it out relative to the existing size of your VSAN cluster. In talking with other people who have gone through the disk format upgrade, the average rate seems to be about 10TB of consumed data per day, depending on the type of disks being used. I’ll reference some posts at the end that relate to the disk upgrade process as it has been troublesome for some, however it’s also worth pointing out that the upgrade process is non disruptive for running workloads.

Existing Configuration:

  • Three Host Cluster
  • vCenter 6.0.0 Update 2
  • ESXi 6.0.0 Update 1
  • Two Disk Groups Per Host
  • 1x 200GB SSD and 2x 600GB HDD
  • VSAN Default Policy FTT=1

Upgrade Existing Hosts to 6.0 Update 2:

At the time of writing, ESXi 6.0.0 Update 2 is the latest release and the build that contains the VSAN 6.2 codebase. From the official VMware upgrade matrix it seems you can’t upgrade from VSAN versions older than 6.1, so if you are on a 5.x or 6.0 release you will need to take note of this VMware KB to get to ESXi 6.0.0 Update 2. A great resource for the latest builds, as well as links to upgrade from, can be found here:

https://esxi-patches.v-front.de/ESXi-6.0.0.html

For a quick upgrade directly from the VMware online host update repository, you can do the following on each host in the cluster after putting it into VSAN maintenance mode. Note that there are also some advanced settings that are recommended as part of the VSAN Health Checks in 6.2.
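
As a rough guide, an online depot upgrade from the ESXi shell looks something like the below. The image profile name shown is an example of the 6.0 Update 2 profile, so check the depot for the current one before running it:

esxcli network firewall ruleset set -e true -r httpClient
esxcli software profile update -p ESXi-6.0.0-20160302001-standard -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml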

After rolling through each host in the cluster, make sure that you have an updated copy of the VSAN HCL database and run a health check to see where you stand. You should see a warning about the disks needing an upgrade, and if any hosts didn’t have the above advanced settings applied you will see a warning about that as well.

Expanding VSAN Cluster:

As part of this upgrade I am also adding an additional host to the existing three to expand to a four host cluster. I am doing this for a couple of reasons: notwithstanding the accepted design position of four hosts being better than three from a data availability point of view, you also need a minimum of four hosts if you want to enable RAID-5 erasure coding (a minimum of six is required for RAID-6). The addition of the fourth host also allowed me to roll through the Hybrid to All Flash upgrade with a lot more headroom.

Before adding the new host to the existing cluster you need to ensure that its build is consistent with the existing hosts in terms of versioning and, more importantly, networking. Ensure that you have configured a VMkernel interface for VSAN traffic and marked it as such through the Web Client. If you don’t do this prior to putting the host into the existing cluster, I found that the management VMkernel interface gets enabled by default for VSAN.
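
If you prefer the command line, tagging an existing VMkernel interface for VSAN traffic can also be done from the host shell, roughly as follows (vmk1 being whichever interface carries VSAN in your setup):

esxcli vsan network ipv4 add -i vmk1
esxcli vsan network list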

If you notice below this cluster is also NSX enabled, hence the events relating to Virtual NICs being added. Most importantly the host can see other hosts in the cluster and is enabled for HA.

Once in the cluster the host can be used for VM placement with data served from the existing hosts with configured disk groups over the VSAN network.

Upgrade License:

At this point I upgraded the licenses to enable the new features in VSAN 6.2. As a refresher on VSAN licensing there are three editions with the biggest change from previous versions being that to get the Deduplication and Compression, Erasure Coding and QoS features you need to be running All Flash and have an Enterprise license key.

To upgrade the license you need to head to Licensing under the Configuration section of the Manage Tab whilst the Cluster is selected. Apply the new license and you should see the following.

Upgrade Disk Format:

If you have read up on upgrading VSAN you will know that a disk format upgrade is required to get the benefits of the newer versions. Once you have upgraded both vCenter and the hosts to 6.0.0 Update 2, if you check the VSAN Health under the Monitor tab of the cluster you should see a failure about v2 disks not working with v3 disks, as shown below.

You can click on the Upgrade On-Disk Format button here to kick off the process. This can also be triggered from the Disk Management section under the Virtual SAN menu in the Manage cluster section of the Web Client. Once triggered you will see some events fire and an update in progress message near the version number.

Borrowing from one of Cormac Hogan’s posts on VSAN 6.2, the following explains what is happening during the disk format upgrade. Also described in the blog post is a way to monitor the progress in more detail using the Ruby vSphere Console (RVC).

There are a few sub-steps involved in the on-disk format upgrade. First, there is the realignment of all objects to a 1MB address space. Next, all vsanSparse objects (typically used by snapshots) are aligned to a 4KB boundary. This will bring all objects to version 2.5 (an interim version) and readies them for the on-disk format upgrade to V3. Finally, there is the evacuation of components from a disk group, then the deletion of said disk group and finally the recreation of the disk group as a V3. This process is then repeated for each disk group in the cluster, until finally all disks are at V3.

As explained above, the upgrade can take a significant amount of time depending on the number of disk groups, the data consumed on your VSAN datastore and the type of disks being used (SAS based vs SATA/NL-SAS). Once complete you should have a green tick and the On-Disk format version reporting 3.0.

With that done we can move ahead to the Hybrid to All Flash conversion. For details on that, look out for Part 2 of this series coming soon.

References:

Hybrid vs All-flash VSAN, are we really getting close?

VSAN 6.2 Part 2 – RAID-5 and RAID-6 configurations

VSAN 6.2 Part 12 – VSAN 6.1 to 6.2 Upgrade Steps

VSAN 6.2 + DELL PERC: Important Certified Driver Updates

As many of us rejoiced at the release of VSAN 6.2 that came with vSphere 6.0 Update 2, those of us running DELL PERC based storage controllers were quickly warned of potential issues and told not to upgrade. VMware KB 2144614 referenced these issues and stated that the PERC H730 and FD332 found in DELL server platforms were not certified for VSAN 6.2 pending ongoing investigations. The impacted storage controllers are listed below.

This impacted me as we have the FD332 Dual ROC in our production FX2s with VSAN 6.1 and a test bed with VSAN 6.2. With the KB initially saying No ETA, I sat and waited like others impacted to have the controllers certified. Late last week, however, DELL and VMware finally released updated firmware and drivers for the PERC, which certify the H730s and FD332s with VSAN 6.2.

Before this update if you looked at the VSAN Health Monitor you would have seen a Warning next to the VMware Certified check and official driver support.

As well as upgrading the controller drivers, it’s also suggested that you make the following changes on each host in the cluster, which add two new VSAN IO timeout settings. No reboot is required after applying the advanced config and the settings are persistent.

esxcfg-advcfg -s 100000 /LSOM/diskIoTimeout
esxcfg-advcfg -s 4 /LSOM/diskIoRetryFactor
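
To double check the values afterwards you can read them back with the -g switch:

esxcfg-advcfg -g /LSOM/diskIoTimeout
esxcfg-advcfg -g /LSOM/diskIoRetryFactor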

Once the driver has been upgraded you should see all green in the VSAN Health Checks as shown below with the up to date driver info.

This is all part of the fun and games of using your own components for VSAN, but I still believe it’s a huge positive to be able to tailor a design for specific use cases with specific hardware. In talking with various people within VMware and DELL (as it relates to this and previous PERC driver issues), it’s apparent that both parties need to communicate better and go through much better QA before releasing driver and firmware updates. However, this isn’t something that affects only VMware and DELL, and it’s not only storage drivers…it’s a common issue throughout the industry, and every vendor has issues at some point.

Better safe than sorry here, and well done to VMware and DELL on getting the PERC certified without too much delay.

References:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144614

http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vsanio&productid=38055&deviceCategory=vsanio&details=1&vsan_type=vsanio&io_partner=23&io_releases=275&page=1&display_interval=10&sortColumn=Partner&sortOrder=Asc

It’s A Good Book! – vSphere Design Pocketbook 3.0

Last week Frank Denneman blogged about the release of the third installment of the vSphere Design Pocketbook. This is a great initiative from PernixData and Frank which gives bloggers the chance to have certain posts published in the form of a book that gets distributed at industry events around the world, including EMC World and VMworld.

Having read through this year’s edition I can tell you that it’s well worth getting your hands on, either in PDF format or in book format if attending events that PernixData is sponsoring. The Social Media Edition is split into seven chapters covering specific areas of vSphere, including Host Configuration, Cluster Design, Storage, Networking and Security, VM Configuration, Management and general Words of Wisdom. If I was to highlight one section, I would make sure you check out Understanding Block Sizes in a Virtualized Environment by Pete Koehler, which is becoming a lot more important in this day and age…it’s something that FVP Architect has made easier to discover and understand.

The contributors to the book include respected community and industry leaders like Chris Wahl, William Lam and Frank himself. The remaining contributors (myself included) all run excellent tech blogs and are active on Twitter so make sure you view the list on the download page and follow them on the social networks.

Again, thanks to Frank and the team at PernixData for taking the time to get this project together. Download the Book from the link below and look out for the Hard Copy at an event near you!

http://www.pernixdata.com/resource/vsphere-design-pocketbook-30-social-media-edition

Preserving VSAN + DELL PERC Critical Drivers after ESXi 6.0 CBT Update

Last week VMware released a patch to fix another issue with Changed Block Tracking (CBT), which takes the ESXi 6.0 Update 1 build to 3247720. The update bundle contains a number of updates to esx-base, including the resolution of the CBT issue.

This patch updates the esx-base VIB to resolve an issue that occurs when you run virtual machine backups which utilize Changed Block Tracking (CBT) in ESXi 6.0: the CBT API call QueryDiskChangedAreas() might return incorrect changed sectors, resulting in inconsistent incremental virtual machine backups. The issue occurs because CBT fails to track changed blocks on VMs that have I/O during snapshot consolidation.

Having just deployed and configured a new Management Cluster consisting of four ESXi 6.0 Update 1 hosts running VSAN, I was keen to get the patch installed so that VDP based backups would work without issue. However, once I had deployed the update (via esxcli) to the first three hosts, I saw that the VSAN Health Checker was raising a warning against the cluster. Digging into the VSAN Health Check Web Client Monitor view I saw the following under HCL Health -> Controller Driver Test.

As I posted in early November, there was an important driver and firmware update released by VMware and DELL that resolved a number of critical issues with VSAN when put under load. The driver package shown above against node-104 is 6.606.12.00-1OEM.600.0.0.2159203, and that shows a Passed driver health state. The others are all in the Warning state with version 6.605.08.00-7vmw.600.1.17.3029758.

What’s happened here is that the ESXi patch has “updated” the controller driver to the latest VMware driver version, overwriting the OEM driver released on the 19th of May and listed on the VMware VSAN HCL page. The simple fix is to reinstall the OEM drivers so that you are left with the VSAN Health status shown below.
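
Reinstalling is just a matter of pointing esxcli back at the OEM driver offline bundle and rebooting; the path below is a placeholder for wherever you have the downloaded bundle sitting, and the grep assumes a megaraid based VIB name:

esxcli software vib install -d /vmfs/volumes/datastore1/<OEM-driver-offline-bundle>.zip
esxcli software vib list | grep -i mega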

Interestingly, the device now shows up as an Avago (LSI) MegaRAID SAS Invader Controller instead of a FD332-PERC (Dual ROC)…I questioned that with a member of the VSAN team and it looks as though that is indeed the OEM name for the FD332 PERCs.

So be aware when updating ESXi builds, and make sure the updated drivers haven’t removed or replaced anything in a way that’s going to give you a really bad time with VSAN…or any other component for that matter.

References:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2137546

VSAN for Service Providers

Since VMworld in San Francisco, VMware have been on a tear, backing up all the VSAN related announcements at the show by starting to push a stronger message around the improvements in the latest VSAN release. Cormac Hogan and Rawlinson Rivera have published articles, while Duncan Epping has also released a number of articles around VSAN since VMworld, including this one on VSAN Licensing and what’s included as part of the different Enterprise packages…there has also been an official post on the vCloud Team Blog on use cases for VSAN Storage Policies in a vCloud Director environment.

Last year I wrote a couple of posts around the time VSAN pricing was being released and also on the specifics of the vCloud Air Network program bundles that allow VSAN to be consumed via the VSPP. At the time there was no All Flash Array option and the pricing through the VSPP was certainly competitive when you compared it to a per socket price.

As a platform, VSAN is maturing as an option for hyper-converged deployments, and VMware Service Providers are starting to deploy it not only for their Management Clusters, but also for their main compute and resource clusters. The wording and messaging from VMware has shifted significantly from the first 5.5 VSAN release, where they mainly talked about Test/Dev and VDI workloads, to talking about mission critical workloads with 6.x.

While doing research into our new Management Clusters that will use VSAN on top of the new Dell FX2 PowerEdge converged platform, I was looking into the costs on the vCAN and how it stacks up next to per socket pricing. Shown below are the different product bundles included in the vCAN program…each one contains a different combination of VMware products, which you get access to depending on the bundle of choice (details here).

While I can’t exactly disclose what level we get at Zettagrid (or other providers for that matter) due to the commercial nature of the programs, it’s safe to assume that service providers at scale can drive that US $1 per point price down by up to 50%…while some could actually pay more.

When you start to look at the cost of running a storage platform for IaaS, you start to get an appreciation for the per month cost that running VSAN via the vCAN program represents. At small to medium scale, VSAN via the vCAN program stacks up…mainly because the program is structured to make the points per month value cheaper the more volume you transact against the program. So an SP consuming large amounts of vRAM will have a lower entry point for VSAN.

Looking at the larger picture, below is an example software (only) cost for 10 hosts (64GB RAM) with 100TB of storage, an expected utilization of 80% and assuming 2 hosts are reserved for HA.

VSAN 80TB Allocated: $6,400
vRAM 410GB (205GB Reserved): $1,433
Per Month: $7,833

If we scale that to 20 hosts with 128GB RAM and 200TB of storage, with an expected utilization of 80% and assuming 4 hosts are reserved for HA:

VSAN 160TB Allocated: $12,800
vRAM 1.6TB (820GB Reserved): $5,734
Per Month: $18,534.40

You start to see that the cost per month becomes somewhat questionable when comparing the OpEx of the vCAN program against the CapEx of a traditional SAN purchase outside the program. As an example, you should be able to source a traditional SAN under finance, with roughly the same usable storage as in the second example, for about US $4,000-6,000 per month over 36 months.

Personally I believe the cost per allocated GB is a little on the high side at scale, and it could start to become cost prohibitive for Service Providers when compared to traditional storage pricing models, or even some of the latest pricing for newer scale out platforms on the market…and that’s not even considering the additional cost of the AFA Add-On.

So, for me, VMware need to look at tweaking the vCAN cost model for VSAN to either allow some form of tiering (i.e. 0-500GB at .08, 500-1000GB at .05, 1TB-5TB at .02 and so on) and/or change the metering from allocated GB to consumed GB, which would allow Service Providers to take advantage of over provisioning and only pay for what’s actually being consumed in the VSAN cluster.

If VMware can push those changes through it will make VSAN even more attractive to vCloud Air Network partners, and help VSAN move from mainly a Management Cluster consideration to full blown production IaaS use.

References:

http://www.yellow-bricks.com/2015/09/14/virtual-san-licensing-packaging/
http://www.yellow-bricks.com/2015/08/31/what-is-new-for-virtual-san-6-1/
http://www.virten.net/2014/03/vmware-vsan-license-calculator/
https://blogs.vmware.com/vcat/2015/09/vcloud-director-and-virtual-san-sample-use-case.html
