CBT Bugs – VMware Can’t Keep Letting This Happen!

[UPDATE] – VMware have released an official KB for the CBT issue.

Sadly, if you recognize the title of this post, it’s because this isn’t the first time I’ve felt compelled to write about the continued industry frustration with repeat ESXi bugs. In February I wrote more generally about the recent history of bugs slipping through VMware QA. Four months later, another CBT bug has slipped through the net. To reaffirm the core message of my last post, this is what I said then:

There are a number of competing vendors (and industry watchers) waiting to capitalize on any weakness shown in the VMware stack, and with the recent run of QA issues leading to significant bugs not abating, I wonder how much longer VMware can afford to continue to slip up before it genuinely hurts its standing.

The one area of absolute concern is the number of Changed Block Tracking (CBT) bugs that seem to slip into new builds of ESXi. This time it’s Express Patch 6 for ESXi 6 (Build 3825889) that contains an apparently new symptom of our old friend the CBT bug. The patch itself is a fairly critical one for those running VSAN and VMXNET3 NICs, as it addresses some core issues around them, but if you use quiesced snapshots during a VM backup you may have issues with CBT. The vmware.log of a VM being backed up will contain:

vcpu-0| xxxx: SNAPSHOT:SnapshotBranchDisk: Failed to acquire current epoch for disk /vmfs/volumes/
vmdk : Change tracking is not active for this disk xxx.
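
If you want to check whether a VM has hit this, one option is to scan its vmware.log for the two phrases in the excerpt above. The following is only a minimal sketch: the function names are hypothetical, and matching on these two substrings is an assumption based purely on the log lines quoted in this post, not on anything VMware documents.

```python
from typing import Iterable, List

# Phrases copied from the log excerpt above. Treating them as the
# failure signature is an assumption, not an official VMware definition.
CBT_FAILURE_PHRASES = (
    "Failed to acquire current epoch for disk",
    "Change tracking is not active for this disk",
)


def find_cbt_failures(lines: Iterable[str]) -> List[str]:
    """Return every line that contains one of the CBT failure phrases."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(phrase in line for phrase in CBT_FAILURE_PHRASES)
    ]


def scan_log(path: str) -> List[str]:
    """Scan a vmware.log file (path supplied by the caller) for CBT failures."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_cbt_failures(handle)
```

Calling scan_log() with the path to a copy of a VM's vmware.log returns the matching lines; an empty list only means these phrases were not found, not that CBT is healthy.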

For a detailed explanation of the issue go to: http://www.running-system.com/take-care-express-patch-6-esxi-6-can-break-backup-cbt-bug/ 

[UPDATE]

VMware Support is aware of this issue and are currently working on it.
This KB article will be updated once the fix for this issue is released.

To work around this issue, apply one of these options:

Again, as a Service Provider, the CBT bugs are the most worrying because they fundamentally threaten the integrity of backup data, which is not something that IT operations staff, or the end users whose data is put at risk, should have to worry about. Most backup vendors use CBT to make backups more efficient. In this case, specifically if you use Veeam, the lack of CBT will extend backup windows and increase the chances of VMs not being backed up as expected.

VMware need to continue to nail ESXi (and vCenter) while keeping focus on the new products. VSAN, NSX and everything else that VMware offers runs on or off ESXi, and though hypervisors are not as front of mind anymore, VMware partners who create products to work with ESXi need it to be stable…especially around backups. Everyone needs to back up with absolute confidence…the more these CBT bugs appear, the less confident pundits become. I already hear of people not wanting to go to ESXi 6.0 because of issues such as this latest one.

That’s not a good place for VMware to be.

Note: I had sat on this post since Friday, but this morning I read through Anton’s Veeam Community Forums Digest, where he lamented the lack of QC and the repeat issues. He suggests that this is the new normal…and that maybe the thing to do is wait and hope for vSphere 6.5…not a good situation. However, like me, he also believes that this can be fixed…but it needs to happen before the next release.

References:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144685

 

 

6 comments

  • Hey Anthony,

    This really does feel like the norm these days, and it’s so frustrating. I have found and logged a number of vSphere 6.0 bugs in the past few weeks and have been left completely disappointed with the support experience. In one recent case, I was asked to log this “bug” as a feature request – I kid you not. Why should we log poorly implemented features, that worked before, as new features? Sadly, in my opinion, the quality of releases has gone significantly downhill in the past year or so. Let’s hope this gets fixed.

    Cheers,
    Jon

  • FYI you’re shadowbanned on reddit. I just went to submit this to /r/vmware: it didn’t show up in search as previously submitted, but after submission it showed as already submitted by anthonyspiteri2. If you go to that user page while not logged in, you’ll see a page not found error, which indicates a shadowban.

  • Same case for us. If not patched, it’s a VMXNET3 PSOD; if patched, it’s full of CBT and backup issues.

  • Electric Steve

    Due to the repeated CBT bugs, and not being able to trust CBT-based backups as a consequence, we completely stopped using VMware’s CBT mechanism in our Veeam backups and use its own internal mechanism to work with incrementals.
