Category Archives: VSAN

VSAN 6.2 + DELL PERC: Important Certified Driver Updates

As many of us rejoiced at the release of VSAN 6.2 that came with vSphere 6 Update 2…those of us running DELL PERC based storage controllers were quickly warned of potential issues and were told not to upgrade. VMware KB 2144614 referenced these issues and stated that the PERC H730 and FD332 found in DELL server platforms were not certified for VSAN 6.2 pending ongoing investigations. The storage controllers that were impacted are listed below.

This impacted me as we have the FD332 Dual ROC in our production FX2s with VSAN 6.1 and a test bed with VSAN 6.2. With the KB initially saying No ETA, I sat and waited, like others impacted, for the controllers to be certified. Late last week, however, DELL and VMware finally released an updated firmware and driver package for the PERC which certifies the H730s and FD332s with VSAN 6.2.

Before this update, if you looked at the VSAN Health Monitor you would have seen a Warning next to the VMware Certified check for official driver support.

As well as upgrading the controller drivers, it's also suggested that you make the following changes on each host in the cluster, which add two new VSAN IO timeout settings. No reboot is required after applying the advanced config and the settings are persistent.

esxcfg-advcfg -s 100000 /LSOM/diskIoTimeout
esxcfg-advcfg -s 4 /LSOM/diskIoRetryFactor
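
To confirm the values have taken effect you can read them back with the -g switch of esxcfg-advcfg; a quick sketch of what I'd run on each host:

esxcfg-advcfg -g /LSOM/diskIoTimeout
esxcfg-advcfg -g /LSOM/diskIoRetryFactor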

Once the driver has been upgraded you should see all green in the VSAN Health Checks, as shown below, with the up-to-date driver info.
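
If you'd rather confirm the installed driver from the command line than rely on the Health Check, the VIB list is the quickest way. A rough sketch below; the exact VIB name (lsi_mr3 in my case) will depend on which driver package DELL/VMware publish for your controller:

esxcli software vib list | grep -i lsi    # check the installed driver VIB and version
esxcli storage core adapter list          # confirm which driver each adapter is using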

This is all part of the fun and games of using your own components for VSAN, but I still believe it's a huge positive to be able to tailor a design for specific use cases with specific hardware. In talking with various people within VMware and DELL (as it related to this and previous PERC driver issues) it's apparent that both parties need to communicate better and go through much better QA before releasing driver and firmware updates. However, this is not something that affects only VMware and DELL, and not only for storage drivers…it's a common issue throughout the industry, it doesn't only impact VMware VSAN, and every vendor has issues at some point.

Better safe than sorry here, and well done to VMware and DELL for getting the PERC certified without too much delay.

References:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144614

http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vsanio&productid=38055&deviceCategory=vsanio&details=1&vsan_type=vsanio&io_partner=23&io_releases=275&page=1&display_interval=10&sortColumn=Partner&sortOrder=Asc

vSphere 6 Update 2 – What's In It for Service Providers

It's been just over a week since VMware released vSphere 6 Update 2 and I thought I would go through some of the key features and fixes that are included in the latest versions of vCenter and ESXi. As usual I generally keep an eye out for improvements that relate back to Service Providers who use vSphere as the foundation of their Managed or Infrastructure as a Service offerings.

New Features:

Without question the biggest new feature is the release of VSAN 6.2. I've covered this release in previous blog posts, and when you upgrade to ESXi 6.0 Update 2 the VSAN 6.2 bits are present within the kernel. Some VSAN services are actually in play regardless of whether you have it configured or not…which is interesting. With the new pricing for VSAN through the vCAN program, Service Providers can now seriously think about deploying VSAN for their main IaaS platforms.
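
As a quick aside, if you want to see whether the VSAN service is actually enabled on a host after the upgrade, the esxcli namespace is there regardless of configuration. A simple check (my own sketch, not an official procedure):

esxcli vsan cluster get    # shows cluster membership, or reports the host is not in a VSAN cluster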

The addition of support for High Speed Ethernet links is significant. The new 25G and 50G link speeds mean increased throughput for converged network cards, allowing more network traffic to flow through hosts and switches for Fault Tolerance, vMotion, Storage vMotion and storage traffic. It also allows SPs to think about building Edge Clusters for networking services such as NSX, with line speeds that can take advantage of even faster backends.

From a manageability point of view the Host Client HTML5 user interface is a welcome addition and hopefully paves the way for more HTML5 management goodness from VMware, not only for hosts…but also for vCenter itself. There is a fair bit of power already in the Host Client and I can bet that admins will start to use it more and more as it continues to evolve.

For vCenter, the addition of Two-Factor Authentication using RSA or Smartcard technology is an important feature for SPs to use if they are considering any sort of certification for their services. For example, many government certifications such as IRAP require it.

Resolved Issues:

There are a bunch of resolved issues in this build and I’ve gone through the rather extensive list to pull out the biggest fixes that relate to my experience in service provider operations.

vCenter:

  • Upgrading vCenter Server from 5.5 Update 3b to 6.0 Update 1b might fail if SSLv3 is disabled on port 7444 of vCenter Server 5.5 Update 3b. An upgrade from vCenter Server 5.5 Update 3b to 6.0 Update 2 works fine if SSLv3 is disabled by default on 7444 port of vCenter Server 5.5 Update 3b.
  • Deploying a vApp on vCloud Director through the vApp template fails with a Profile-Driven storage error. When you refresh the storage policy, an error message similar to the following is displayed: The entity vCenter Server is busy completing an operation.
  • Delta disk names of the source VM are retained in the disk names of the cloned VM. When you create a hot clone of a VM that has one or more snapshots, the delta disk names of the source VM are retained in the cloned VM.
  • vCenter Server service (vpxd) might fail during a virtual machine power on operation in a Distributed Resource Scheduler (DRS) cluster.

ESXi:

  • Hostd might stop responding when you execute esxcli commands using PowerCLI resulting in memory leaks and memory consumption exceeding the hard limit.
  • ESXi mClock I/O scheduler does not work as expected. The ESXi mClock I/O scheduler does not limit the I/Os with a lesser load even after you change the IOPS of the VM using the vSphere Client.
  • After you upgrade Virtual SAN environment to ESXi 6.0 Update 1b, the vCenter Server reports a false warning similar to the following in the Summary tab in the vSphere Web Client and the ESXi host shows a notification triangle
  • Attempts to perform vMotion might fail after you upgrade from ESXi 5.0 or 5.1 to 6.0 Update 1 or later releases. An error message similar to the following is written to the vmware.log file.
  • Virtual machine performance metrics are not displayed correctly as the performance counter cpu.system.summation for a VM is always displayed as 0
  • Attempts to perform vMotion with ESXi 6.0 virtual machines that have two 2 TB virtual disks created on ESXi 5.0 fail with an error message similar to the following logged in the vpxd.log file: 2015-09-28T10:00:28.721+09:00 info vpxd[xxxxx] [Originator@6876 sub=vpxLro opID=xxxxxxxx-xxxxxxxx-xx] [VpxLRO] -- BEGIN task-919 -- vm-281 -- vim.VirtualMachine.relocate -- xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx(xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

The mClock fix highlighted above is a significant one for those that were looking to use IOPS limiting. It has basically been broken since 5.5 Update 2 and also impacts/changes the way in which IOPS are interpreted through the VM to storage stack. For service providers looking to introduce IOPS limits to control the impact of noisy neighbours, the fix is welcome.

As usual there are still a lot of known issues, and some have been added or updated in the release notes since the release date. Overall, the early noise coming out of the community is that this Update 2 release is relatively solid, with improvements in network performance and general overall stability. Hopefully we don't see a repeat of the 5.5 Update 2 issues or the bug problems that have plagued more recent releases…and hopefully no more CBT issues!

vSphere 6.0 Update 2 has a lot of goodness for Service Providers and continues to offer the number one virtualization platform from which to build managed and hosted services. Go grab it now and put it through its paces before pushing to production!

References:

http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-esxi-60u2-release-notes.html

http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-vcenter-server-60u2-release-notes.html

http://pubs.vmware.com/Release_Notes/en/vsan/62/vmware-virtual-san-62-release-notes.html

http://pubs.vmware.com/Release_Notes/en/vsphere/60/vmware-host-client-10-release-notes.html

VSAN 6.2 – Price Changes for Service Providers!

Who said big corporations don't listen to their clients! VMware have come to the party in a huge way with the release of VSAN 6.2…and not only from a technical point of view. Ever since the release of VSAN, the pricing structure for vCloud Air Network Service Provider partners has been off the mark in terms of the commercial viability of having VSAN deployed at scale. The existing model was hurting any potential uptake of the HCI platform beyond deployments for Management Clusters and the like.

I have been on VMware's back since March 2014 when VSPP pricing was first revealed, and I wrote a detailed blog post back in October where I compared the different vCAN bundle options and showed some examples of how it did not scale.

For me, VMware need to look at slightly tweaking the vCAN cost model for VSAN to either allow some form of tiering (i.e. 0-500GB .08, 500-1000GB .05, 1TB-5TB .02 and so on) and/or change the metering from allocated GB to consumed GB, which would allow Service Providers to take advantage of over provisioning and only pay for what's actually being consumed in the VSAN Cluster.

Since that post (obviously not only off the back of the noise I was making) the VSAN Product and Marketing teams have gone out to vCAN Partners and spent time going over possible tweaks to the billing structure for VSAN by surveying partners and trying to achieve the best balance going forward to help increase VSAN uptake.

With the release of VSAN 6.2 in ESXi 6.0 Update 2 this week, VMware have announced new pricing for vCAN Partners…the changes are significant and represent a complete rethink of VSAN at scale for IaaS Providers. Furthermore, the changes are also strategically important for VMware in an attempt to secure the storage market for existing vCAN partners.

The changes are indeed significant: not only is the billing metric now based on used or consumed GB of storage, but in somewhat of a surprise to me the VSPP Points Per Month component has been slashed. Further to that, Enterprise Plus was rumored to be listed at .18 VSPP points per allocated GB, which was going to price out AF even more…now that AF Enterprise costs as much in VSPP points per used GB as Standard previously did, that whole conversation has changed.

Below is an example software-only cost for 10 hosts (64GB RAM) with 100TB of storage (60% used capacity average), an expected utilization of 80% and 2 hosts reserved for HA. The old numbers are in brackets to the right and are based on VSAN Standard. It must be noted that these are rough numbers based on the new pricing; for the specifics of the new costings you will need to engage with your local vCAN Partner Account Manager.

VSAN 80TB Allocated (48TB Used): $1,966 ($6,400)
vRAM 410GB (205GB Reserved): $1,433
Total Per Month: $3,399 ($7,833)

If we scale that to 20 hosts with 128GB RAM and 200TB of storage (60% used capacity average), with an expected utilization of 80% and 4 hosts reserved for HA:

VSAN 160TB Allocated (96TB Used): $3,932 ($12,800)
vRAM 1.6TB (820GB Reserved): $5,734
Total Per Month: $9,666 ($18,534)

In a real world example based on figures I've seen, taking into account just VSAN: if you have 500TB of storage provisioned, of which 200TB is consumed, with Advanced plus the Enterprise Add-On the approximate cost of running VSAN comes down from ~$30K to ~$6K per month.

The fact that Service Providers can now take advantage of thin provisioning, plus the change in metric to used or consumed storage, makes VSAN a lot more attractive at scale…and while there are still no break points in terms of total storage blocks, the conversation around VSAN being too expensive has now, for the most part, disappeared.

Well done to the VSAN and vCAN product and marketing teams!

Disclaimer:

These figures are my own calculations and are based on a VSPP point value of US$1. This value will be different for vCAN partners depending on the bundle and points level they are on through the program. I have tried to be accurate with my figures, but errors and omissions may exist.

VSAN 6.2 – Things Just Got Interesting!

There is a saying in our industry that Microsoft always get their products right on the third attempt…and while this has been less and less the case of late (Hyper-V 2012 didn’t exactly deliver) it is more or less an accurate statement. Having been part of the beta and early access blogger sessions for VSAN 6.2 I can say with confidence that VMware have hit the nail on the head with this 6.2 release.

The hyper-converged storage platform, which is built into the world's leading hypervisor (VMware ESXi), has reached a level of maturity and feature set that should and will make the more established HCI vendors take note, and certainly act towards reducing the competitive attack surface that existed with previous releases of VSAN.

The table below shows you the new features of 6.2 together with the existing features of 6.1. As you can see by the number of green dots there are not a lot of new features…but they certainly pack a punch and fill in the gaps that had stopped VSAN being adopted for higher end workloads in comparison with existing market leaders.

Across all versions, Software Checksum has been added; the Advanced and Enterprise versions get VSAN's implementation of Erasure Coding (RAID 5/6), with Deduplication and Compression available for the All Flash configuration and QoS IOPS Limiting available in Enterprise only.

With the initial 5.x releases of VSAN, VMware were very reluctant to state that it was suitable for "enterprise" workloads and only mentioned VDI, Test and Development workloads…the language changed to extend to more enterprise workloads in VSAN 6.x, but as you can see below the 6.2 release now targets all workloads…and more importantly VMware are openly confident of backing the claim.

VMware have achieved this mostly through the efficiencies that come with the deduplication and compression feature, along with erasure coding, which in effect adds RAID 5/6 support with an FTT level of 1 or 2, in addition to the RAID 1 implementation of previous versions. Software checksum has been used as a huge point of difference when comparing other HCI platforms to previous VSAN releases, so it's great to see this tick box added to further ensure data consistency across VSAN disk group and datastore objects.

The QoS feature that applies IOPS limits on a per-VM basis is also significant for extending VSAN's workload reach. It allows the segmentation of noisy neighbours and lets operators apply limits, something that has had a flaky history on vSphere platforms up to this point. This is probably my favourite new feature.

As with previous 6.x releases of VSAN there is an AFA option, available in the Enterprise and Enterprise Plus editions, though you will pay a premium compared to the hybrid version. While I'm still not convinced VMware have the pricing right, I do know that there is ongoing work to make it more attractive for enterprises and service providers alike.

One of the great things about VSAN is the ability to build your own platform from whatever combination of HCL approved hardware you want. This flexibility is only really comparable to EMC's ScaleIO, but it also means that some extra thought needs to go into a VSAN build if you don't want to go down the Ready Node path. In my testing, if sized correctly, the only limitation in terms of performance is the speed of your network cards, and I've been able to push VSAN (Hybrid) to impressive throughput numbers with, importantly, low latency.

Finally, the 6.2 version of VSAN expands on the Health and Monitoring components that existed in previous versions. VMware have baked new performance and capacity monitoring into the vCenter Web Client that gives insight into VM storage consumption and how that capacity is taken up by the various VSAN components.

There is also a new Cluster Performance menu that gives greater detail on VSAN cluster throughput, IOPS and latency, so there should be no need to get into the Ruby vSphere Console, which is a blessing. The UI is limited by the Web Client and not as sexy and modern as others out there, but it's come a long way and now means you don't need to hook in external systems to get VSAN related metrics.

As suggested by the post's title, I believe that this VSAN release represents VMware's official coming of age in the HCI market and will make the other players take note, which will no doubt spark the odd bout of Twitter-fuelled banter and Slack channel discussion about what's missing or what's been copied…but at the end of the day competition in tech is great, and better products are born out of competition.

Things just got Interesting!

For a more detailed look at the new features check out Duncan Epping‘s post here:

vCloud Air Rumours – vCAN in Focus…Again | VMware Rethinks Strategy

Well…the news filtering out over the internets about the VMware job cuts and the apparent clipping of vCloud Air's wings isn't great. This is yet to be 100% confirmed, and there are no specifics about what it actually means for the vCloud Air Network. If what I am reading is true and no more CapEx will be spent on existing vCloud Air zones, then hopefully VMware has realised that the best way to fight the fight in terms of IaaS is to let its key partners deliver VMware based IaaS using core platform technologies such as vCenter, ESXi, NSX and vCloud Director, possibly throwing in VSAN.

Originally positioned as VMware's public cloud service and a vehicle for customers to manage hybrid clouds, vCloud Air now offers specialty cloud services and software with characteristics unique to VMware, Gelsinger said. The vCloud Air service will still exist, but it sounds like the business' main focus will be to provide technology for partner-run clouds.

UPDATE: Having just listened to the Earnings Call and reading through the transcript, I’ve included a key Pat Gelsinger quote below:

I’d like to take a moment to clarify our strategy for vCloud Air; the service will have narrower focus providing specialized cloud software and services unique to VMware and distinct from other public cloud providers. We will aggressively provide these innovations to our vCloud Air Network partners helping them to accelerate their growth.

VMware is creating cloud software and cloud services for cloud providers. It's important to note that, given that narrower focus, we believe the capital expenses we've already invested in vCloud Air will be adequate for our needs and that we expect our vCloud Air service to be accretive by the end of 2017.

There is a massive opportunity here for the vCAN. Together with the news in December of the renewed vCloud Director push (which I assume is still happening), the time is now to fully exploit the power of the APIs that are offered and exposed as part of the VMware Cloud stack. vCAN Service Providers should be a little more relaxed this morning on the news of vCloud Air's apparent scaling back, and the worry that was front and centre, VMware's perceived reluctance to drive business to partners and the prospect of VMware competing against vCAN partners in deals, should go away.

Again…time will tell!

#LongLivevCD

References:

https://www.sdxcentral.com/articles/news/vmware-cuts-800-rethinks-cloud-vcloud-air/2016/01/

http://seekingalpha.com/article/3836736-vmware-vmw-ceo-pat-gelsinger-q4-2015-results-earnings-call-transcript?page=2

http://www.crn.com/news/cloud/300079456/microsoft-partners-fed-up-vmware-customers-are-switching-to-azure-cloud.htm

https://rcpmag.com/articles/2016/01/26/vmware-layoffs-begin.aspx

Sidenote:

I have to eat a little humble pie and give credit to CRN journalist Kevin McLaughlin who has been hot on vCloud Air for a while now and has put together a couple of articles on vCloud Air struggling…I still don’t agree with the claims that current VMware customers are flocking to AWS and Azure, but certainly if the news of today is correct I acknowledge the reporting 🙂

Preserving VSAN + DELL PERC Critical Drivers after ESXi 6.0 CBT Update

Last week VMware released a patch to fix another issue with Change Block Tracking (CBT) which took the ESXi 6.0 Update 1 Build to 3247720. The update bundle contains a number of updates to the esx-base including the resolution of the CBT issue.

This patch updates the esx-base VIB to resolve an issue that occurs when you run virtual machine backups which utilize Changed Block Tracking (CBT) in ESXi 6.0: the CBT API call QueryDiskChangedAreas() might return incorrect changed sectors, resulting in inconsistent incremental virtual machine backups. The issue occurs because CBT fails to track changed blocks on VMs with I/O during snapshot consolidation.

Having just deployed and configured a new Management Cluster consisting of four ESXi 6.0 Update 1 hosts running VSAN, I was keen to get the patch installed so that VDP based backups would work without issue. However, once I had deployed the update (via esxcli) to the first three hosts I saw that the VSAN Health Checker was raising a warning against the cluster. Digging into the VSAN Health Check Web Client Monitor view I saw the following under HCL Health -> Controller Driver Test.
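
For reference, the patch itself went on with the standard esxcli workflow against the offline bundle; a rough sketch below, where the datastore path and bundle filename are placeholders for wherever you upload the patch zip:

esxcli software vib update -d /vmfs/volumes/datastore1/ESXi600-update-bundle.zip    # path and filename are placeholders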

As I posted in early November, there was an important driver and firmware update released by VMware and DELL that resolved a number of critical issues with VSAN when put under load. The driver package is shown above against node-104 as 6.606.12.00-1OEM.600.0.0.2159203, and that shows a Passed driver health state. The others are all in the Warning state with version 6.605.08.00-7vmw.600.1.17.3029758.

What's happened here is that the ESXi patch has "updated" the controller driver to the latest VMware inbox driver, overwriting the OEM driver released on the 19th of May, the one listed on the VMware VSAN HCL page. The simple fix is to reinstall the OEM driver so that you are left with the VSAN Health Status shown below.
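
Reinstalling the OEM driver is the same esxcli dance in reverse, followed by a host reboot. A hedged example below; the VIB filename is a placeholder based on the driver version above, so use whatever DELL/VMware publish on the HCL page:

esxcli software vib install -v /vmfs/volumes/datastore1/lsi-mr3-6.606.12.00-1OEM.600.0.0.2159203.vib    # filename is a placeholder; reboot the host afterwards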

Interestingly, the device now shows up as an Avago (LSI) MegaRAID SAS Invader Controller instead of an FD332-PERC (Dual ROC)…I questioned that with a member of the VSAN team and it looks as though that is indeed the OEM name for the FD332 PERCs.

So be aware when updating ESXi builds: make sure the update hasn't removed or replaced drivers with anything that's going to potentially give you a really bad time with VSAN…or any other component for that matter.

References:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2137546

Dell PowerEdge FX2: VSAN Disk Configuration Steps

When you get your new DELL FX2s out of the box and powered on for the first time you will notice that the disk configuration has not been set up with VSAN in mind…if you were to log into ESXi on the blades in SLOT1a and 1c you would see that each host has each SAS disk configured as a datastore. There is a little pre-configuration you need to do in order to get the drives presented correctly to the blade servers, as well as removing and reconfiguring the datastores and disks from within ESXi.

With my build I had four FC430 blades and two FD332 storage sleds containing 4x200GB SSDs and 8x600GB SAS drives each. By default the storage mode is configured in Split Single Host mode, which results in all the disks being assigned to the hosts in SLOT1a and SLOT1c, with both controllers also assigned to a single host.

You can configure individual storage sleds containing two RAID controllers to operate in the following modes:

  • Split-single – Two RAID controllers are mapped to a single compute sled. Both the controllers are enabled and each controller is connected to eight disk drives
  • Split-dual – Both RAID controllers in a storage sled are connected to two compute sleds.
  • Joined – The RAID controllers are mapped to a single compute sled. However, only one controller is enabled and all the disk drives are connected to it.

To take advantage of the FD332-PERC (Dual ROC) controller you need to configure Split-Dual mode. All hosts need to be powered off to change the default configuration to Split Dual Host for the VSAN configuration.

Head to Server Overview -> Power and from there gracefully shut down all four servers.

Once the servers have been powered down, click on the Storage Sleds in SLOT-03 and SLOT-04 and go to the Setup Tab. Change the Storage Mode to Split Dual Host and Click Apply.

To check the distribution of the disks you can launch the iDRAC for each blade and go to Storage -> Enclosures to check that each blade now has 2xSSD and 4xHDD drives assigned. With the FD332 there are 16 slots in total, with 0-7 belonging to the first blade and 8-15 belonging to the second blade. As shown below we are looking at the config of SLOT1a.

The next step is to reconfigure the disks within ESXi to make sure VSAN can claim them when configuring the Disk Groups. Part of the process below is to delete any datastores that exist and clear the partition table…by far the easiest way to achieve this is via the new Embedded Host Client.

Install the Embedded Host Client on each Host

Log into the hosts via the Embedded Host Client at https://HOST_IP/ui, go to the Storage menu and delete any datastores that were preconfigured by DELL.

Click on the Devices tab in the Storage menu and clear the partition table so that VSAN can claim the disks backing the datastores you just deleted.

From here all disks should be available to be claimed by VSAN to create your disk groups.
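
A quick way to sanity check this from the command line is vdq, which reports each disk's eligibility for VSAN; a simple sketch to run on each host:

vdq -q    # lists each device and whether it is eligible for use by VSAN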

As a side note it’s important to update to the latest driver for the PERC.

References:

http://www.dell.com/support/manuals/au/en/aubsd1/dell-cmc-v1.20-fx2/CMCFX2FX2s12UG-v1/Notes-cautions-and-warnings?guid=GUID-5B8DE7B7-879F-45A4-88E0-732155904029&lang=en-us

Dell PowerEdge FX2: CMC Configuration Gotchya

Back in September I wrote an introductory post (if you haven't read that post click here) on the DELL PowerEdge FX2 HCI hardware and why we had selected it for our VSAN Management platform. After a busy two months consisting of VMworld, vForumAU and VeeamON it's finally time to start working towards putting these babies into production.

I'm hoping to do a series of posts around the FX2s and VSAN, and thought I would kick things off with a short but very important public service announcement around the default configuration behaviour of the Chassis Management Controller network port settings, and how, if you don't RTFM, you could be left with an angry network guy beating down your door!

CAUTION: Connecting the STK/Gb2 port to the management network will have unpredictable results if the CMC setting is not changed from default Stacking to Redundant, to implement NIC failover. In the default Stacking mode, cabling the Gb1 and STK/Gb2 ports to the same network (broadcast domain) can cause a broadcast storm. A broadcast storm can also occur if the CMC setting is changed to Redundant mode, but the cabling is daisy chained between chassis in the Stacking mode. Ensure that the cabling model matches the CMC setting for the intended usage.

That warning should be one of the first things you read as you go through the CMC for PowerEdge FX2 User Guide, but just in case you don't, and you are looking to take advantage of the redundant NIC feature the CMC offers (similar to that found in the DELL M1000e chassis), you need to go to Network -> General Settings and change the default radio option shown below from Stacking to Redundant.

If this isn't done and you attempt to set up redundant management ports while still in the Stacking option, you will more than likely, as the caution suggests, impact your network as the switches grind to a halt under the stress of the broadcast storm…and in turn have some not too happy networking admins coming after you once they work out what's going on.

The diagram above, pulled from the online documentation, shows you what not to do if Management Port 2 is configured in Stacking mode. Stacking mode is used to daisy chain a number of FX2 chassis for single access management if required. I would have thought that having the least dangerous option set as the default was the way to go, but it is certainly a case of being aware that assumptions can lead to major headaches…so a final reminder to RTFM, and be aware of this default behaviour in the FX2 CMCs.

http://www.dell.com/support/manuals/au/en/aubsd1/dell-cmc-v1.20-fx2/CMCFX2FX2s12UG-v1/Checklist-to-set-up-chassis?guid=GUID-767EC114-FE22-477E-AD20-E3356DD53395&lang=en-us

 

VSAN for Service Providers

Since VMworld in San Francisco, VMware have been on a tear, backing up all the VSAN related announcements at the show by starting to push a stronger message around the improvements in the latest VSAN release. Cormac Hogan and Rawlinson Rivera have published articles, while Duncan Epping has also released a number of articles around VSAN since VMworld, including this one on VSAN licensing and what's included as part of the different Enterprise packages…there has also been an official post on the vCloud Team Blog on use cases for VSAN Storage Policies in a vCloud Director environment.

Last year I wrote a couple of posts around the time VSAN pricing was being released and also on the specifics of the vCloud Air Network program bundles that allow VSAN to be consumed via the VSPP. At the time there was no All Flash Array option and the pricing through the VSPP was certainly competitive when you compared it to a per socket price.

As a platform, VSAN is maturing as an option for hyper converged deployments, and VMware Service Providers are starting to deploy it not only for their Management Clusters, but also for their main compute and resource clusters. The wording and messaging from VMware has shifted significantly from the first 5.5 VSAN release, where they mainly talked about Test/Dev and VDI workloads, to now talking about mission critical workloads with 6.x.

While doing research into our new Management Clusters that will use VSAN on top of the new Dell PowerEdge FX2 converged platform, I was looking into the costs on the vCAN and how it stacks up next to per socket pricing…shown below are the different Product Bundles included in the vCAN Program…each one contains a different combination of VMware products which you get access to depending on the bundle of choice (details here).

While I can't exactly disclose what level we get at Zettagrid (or other providers for that matter) due to the commercial nature of the programs, it's safe to assume that service providers at scale can drill that US $1 per point price down by up to 50%…while some could actually pay more.

When you start to look at the cost of running a storage platform for IaaS you start to get an appreciation for the cost per month on the vCAN program that running VSAN offers. At a small to medium scale VSAN via the vCAN Program stacks up…mainly because the program is structured to make the Points Per Month value cheaper the more volume you transact against the program. So an SP consuming large amounts of vRAM will have a lower entry point for VSAN.

Looking at the larger picture, below is an example software-only cost for 10 hosts (64GB RAM) with 100TB of storage, an expected utilization of 80% and 2 hosts reserved for HA.

VSAN 80TB Allocated: $6,400
vRAM 410GB (205GB Reserved): $1,433
Total Per Month: $7,833

If we scale that to 20 hosts with 128GB RAM and 200TB of storage, with an expected utilization of 80% and 4 hosts reserved for HA:

VSAN 160TB Allocated: $12,800
vRAM 1.6TB (820GB Reserved): $5,734
Total Per Month: $18,534.40

You start to see that the cost per month gets somewhat questionable when comparing the OpEx vs CapEx costs of a traditional SAN purchase outside of the vCAN Program. As an example, you should be able to source a traditional SAN under finance, with roughly the same usable storage as what's in the second example, for about US $4,000-6,000 per month on a finance plan over 36 months.

Personally I believe the cost per allocated GB is a little on the high side at scale, and it could start to become cost prohibitive for Service Providers when compared to traditional storage pricing models or even some of the latest pricing for newer scale out platforms on the market…and that's not even thinking about the additional cost of the AFA Add-On.

So, for me, VMware need to look at slightly tweaking the vCAN cost model for VSAN to either allow some form of tiering (i.e. 0-500GB .08, 500-1000GB .05, 1TB-5TB .02 and so on) and/or change the metering from allocated GB to consumed GB, which would allow Service Providers to take advantage of over provisioning and only pay for what's actually being consumed in the VSAN Cluster.

If VMware can push those changes through it will make VSAN even more attractive to vCloud Air Network Partners and have VSAN move from mainly Management Cluster consideration to full blown production IaaS use.

References:

http://www.yellow-bricks.com/2015/09/14/virtual-san-licensing-packaging/
http://www.yellow-bricks.com/2015/08/31/what-is-new-for-virtual-san-6-1/
http://www.virten.net/2014/03/vmware-vsan-license-calculator/
https://blogs.vmware.com/vcat/2015/09/vcloud-director-and-virtual-san-sample-use-case.html

First Look: Dell PowerEdge FX2 Converged Platform

For the last six months or so I've been on the lookout for server and storage hardware to satisfy the requirement for new Management Clusters across our Zettagrid vCloud Zones…after a fairly exhaustive discovery and research stage, the Dell PowerEdge FX2 dropped at the right time to make the newly updated converged architecture hardware platform a standout choice for an HCI based solution.

I plan on doing a couple of posts on the specifics of the hardware chosen as part of the build that will end up as a VMware VSAN configuration but for the moment there is a little more info on the FX2 PowerEdge (below) as well as a Virtual Unboxing video that goes through the initial familiarization with the CMC and then walks through the FC430 System and Storage Configuration as well as what the new BIOS menu looks like:

 

Below are some specs from the Dell site going through the compute and storage hardware…as you saw in the video above, we went for the quarter-width FC430 blades with two FD332 storage sleds.

Server blocks at the heart of the FX converged architecture are powered by the latest Intel® Xeon® processors. They include:

  • FC430: 2-socket, quarter-width 1U high-density server block with optional InfiniBand configuration
  • FC630: 2-socket, half-width 1U workhorse server block ideal for a wide variety of business applications
  • FC830: Powerful 4-socket, full-width 1U server block for mid-size and enterprise data centers
  • FM120x4: Half-width 1U sled housing up to four separate Intel® Atom® powered single-socket microservers offers up to 16 microservers per 2U.

The FC430 features:

  • Two multi-core Intel® Xeon® E5-2600 v3 processors or one multi-core Intel® Xeon® E5-1600 v3 processor (up to 224 cores per FX2)
  • Up to 8 memory DIMMs (up to 64 DIMMs per FX2)
  • Two 1.8″ SATA SSDs or one 1.8″ SATA SSD (w/front IB Mezzanine port)
  • Dual-port 10Gb LOM
  • Access to one PCIe expansion slot in the FX2 chassis

The FD332 provides massive direct attached storage (DAS) capacity in easily scalable, modular half-width, 1U blocks. Each block can house up to 16 direct-attached small form factor (SFF) storage devices. Combined with FX servers, the FD332 drives highly flexible, scale out computing solutions and is an excellent option for dense VSAN environments using optimized ratios of HDD/SSD storage (including all flash).

  • Up to 16 SFF 2.5″ SSDs/HDDs, both SATA and SAS
  • Up to three FD332 blocks per chassis (with one FC630 for processing). Other storage options include one or two blocks with different combinations of server blocks
  • 12Gbps SAS 3.0 and 6Gbps SATA 3.0
  • PowerEdge RAID Controller (PERC9), single or dual controllers, RAID or HBA modes, or mix and match modes with dual controllers

My first impressions are that this is a very very sexy bit of kit! I am looking forward to getting it up and firing and putting it to use as the basis for a solid Management Cluster platform.

http://www.dell.com/us/business/p/poweredge-fx/pd?oc=pe_fc430_1085&model_id=poweredge-fx&l=en&s=bsd 
