For a sneak peek at the highly anticipated Cloud Tier Copy mode… head over to veeam.com
Last week at VeeamON 2019, Dustin Albertson and I delivered a two-part deep dive session on Cloud Tier, which was released in Update 4 of Veeam Backup & Replication 9.5 in January. I’ve blogged about how Cloud Tier is one of the most innovative features I’ve seen in recent times, and I have been able to dig under the covers of the technology from early in the development cycle. I have presented everything from basic overviews to more complex deep dives over the past six or so months; however, at VeeamON 2019, Dustin and I took it a step further and went even deeper.
The first part of the deep dive was the first session of the event, just after the opening keynote. It was on the main stage and was all slide-driven content that introduced the Cloud Tier, talked about the architecture and then dove deeper into its inner workings, as well as covering some of the caveats.
From the first session to the last session slot of the event…to finish up, Dustin and I presented a demo-only super session which, I have to admit, was one of the best sessions I’ve ever been a part of in terms of flow, audience participation and what we were able to actually show. We were even able to show off some of the new COPY functionality coming in v10.
There are a few scripts that we used in that session that I will look to release on GitHub over the next week or so…stay tuned for those! But for now, enjoy the session recordings embedded above.
At the recent Cloud Field Day 5 (CFD#5) I presented a deep dive on the Veeam Cloud Tier, which was released as a feature extension of our Scale Out Backup Repository (SOBR) in Update 4 of Veeam Backup & Replication. Since we went GA, we have been able to track the success of this feature by looking at public cloud Object Storage consumption by Veeam customers using it. As of last week, Veeam customers had offloaded petabytes of backup data into Azure Blob and Amazon S3…not counting the data being offloaded to other Object Storage repositories.
During the Cloud Field Day 5 presentation, Michael Cade talked about the portability of Veeam’s data format and how we do not lock our customers into any specific hardware or format that requires a specific underlying file system. We offer complete flexibility in where your data is stored, and the same is true when choosing which Object Storage platform to use for offloading data with the Cloud Tier.
I recently had a need to set up a Capacity Tier extent backed by an Object Storage Repository on Azure Blob. I wanted to use the same backup data that I had in an existing Amazon S3 backed Capacity Tier while still keeping things clean in my Backup & Replication console…luckily we have built in a way to migrate to a new Object Storage Repository, taking advantage of the innovative tech we have built into the Cloud Tier.
Cloud Tier Data Migration:
During the offload process, data is tiered from the Performance Tier to the Capacity Tier, effectively dehydrating the VBK files of all backup data and leaving only the metadata with an index that points to where the data blocks have been offloaded in the Object Storage.
In this small example, as you can see below, the SOBR was configured with a Capacity Tier backed by Amazon S3 and using about 15GB of Object Storage.
There are two ways to achieve the rehydration or download operation.
- Via the Backup & Replication Console
- Via a PowerShell Cmdlet
Rehydration via the Console:
From the Home menu, under Backups, right-click on the job name and select Backup Properties. From here there is a list of the files contained within the job, along with the objects they contain. Depending on where the data is stored (remembering that the data blocks are only ever in one location…the Performance Tier or the Capacity Tier), the icon against the file name will be slightly different, with offloaded files represented by a cloud.
Right-clicking on any of these files gives you the option to copy the data back to the Performance Tier. You have the choice to copy back just the backup file, or the backup file and all its dependencies.
The one caveat to this method is that you can’t select multiple files or multiple backup jobs, so the process of rehydrating everything from the Capacity Tier can be tedious.
Rehydration via PowerShell:
To solve that problem we can use PowerShell and the Start-VBRDownloadBackupFile cmdlet to do the bulk of the work for us. Below are the steps I used to get the backup job details, feed them into a variable that contains all the file names, and then kick off the Download Job.
PS C:\> $backup = Get-VBRBackup -Name LOCAL-02
PS C:\> $files = Get-VBRBackupFile -Backup $backup
PS C:\> $files
PS C:\> Start-VBRDownloadBackupFile -BackupFile $files -ThisBackupAndIncrements
CreationTime : 5/6/2019 2:42:26 PM
EndTime : 5/6/2019 2:42:32 PM
JobId : 3f86363c-5821-4d48-8764-a6192fc794a2
Result : Success
State : Stopped
Id : 6d1c4c11-549c-4452-85ad-252fde1d6181
PS C:\> Start-VBROffloadBackupFile -BackupFile $files -ThisBackupAndIncrements
CreationTime : 5/6/2019 2:52:09 PM
EndTime : 5/6/2019 3:22:02 PM
JobId : f656c2fb-3783-4504-9803-3ed196440b31
Result : Success
State : Stopped
Id : f3dea5d2-e10e-4843-a2c8-b49554c19557
The PowerShell window will then show the Download Job running.
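To work around the console’s one-file-at-a-time limitation mentioned earlier, the same cmdlets can be wrapped in a simple loop to rehydrate everything in one pass. This is a minimal sketch on my part, assuming the Veeam snap-in is registered on the backup server and that every backup returned by Get-VBRBackup should be downloaded:

```powershell
# Sketch: rehydrate all offloaded backup files back to the Performance Tier.
# Assumes the Veeam PowerShell snap-in is available on the backup server.
Add-PSSnapin VeeamPSSnapin -ErrorAction SilentlyContinue

foreach ($backup in Get-VBRBackup) {
    $files = Get-VBRBackupFile -Backup $backup
    if ($files) {
        # Download each file and its increments from the Capacity Tier
        Start-VBRDownloadBackupFile -BackupFile $files -ThisBackupAndIncrements
    }
}
```

Each call returns a session object like the ones shown above, so you can check the Result property to confirm each download succeeded.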
Completing the Migration:
No matter which way the Download Job is initiated, we can see the progress from the Backup & Replication console under the Jobs section.
And looking at the Disk and Network sections of Windows Resource Monitor we can see connections to Amazon S3 pulling the required blocks of data down.
Once the Download job has been completed and all VBKs have been rehydrated, the next step is to change the configuration of the SOBR Capacity Tier to point at the Object Storage Repository backed by Azure Blob.
The final step is to initiate an offload to the new Capacity Tier via an Offload Job…this can be triggered via the console or via PowerShell (as shown in the last command of the PowerShell code above), and because we already have a set of data that satisfies the conditions for offload (sealed chains and backups outside the operational restore window), the data will be dehydrated once again…but this time up to Azure Blob.
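Putting the migration together in PowerShell, a sketch might look like the following. Note that the Set-VBRScaleOutBackupRepository parameters used to repoint the Capacity Tier are assumptions on my part (check the Update 4 PowerShell reference for the exact syntax), and the Azure repository name is a placeholder:

```powershell
# Sketch: repoint the SOBR Capacity Tier at the Azure Blob backed
# Object Storage Repository, then force an offload to it.
# "Azure-Blob-Repo" is a placeholder name; the -EnableCapacityTier and
# -ObjectStorageRepository parameters are assumed, not verified.
$sobr  = Get-VBRBackupRepository -ScaleOut -Name "SOBR-01"
$azure = Get-VBRObjectStorageRepository -Name "Azure-Blob-Repo"

Set-VBRScaleOutBackupRepository -Repository $sobr -EnableCapacityTier -ObjectStorageRepository $azure

# Trigger the Offload Job immediately rather than waiting for the 4 hour timer
Start-VBRCapacityTierSync -Repository $sobr
```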
As mentioned in the intro, the ability for Veeam customers to have control of their data is an important principle revolving around data portability. With the Cloud Tier we have extended that by allowing you to choose the Object Storage Repository of your choice for cloud-based storage of Veeam backup data…but we have also given you the option to pull that data out and shift it when and where desired. Migrating data between AWS, Azure or any platform is easily achieved and can be done without too much hassle.
Yesterday at Cloud Field Day 5, I presented a deep dive on our Cloud Tier feature, which was released as a feature of the Scale Out Backup Repository (SOBR) in Veeam Backup & Replication Update 4. The session went through an overview of its value proposition as well as a deep dive into how we tier backup data into Object Storage repositories via the Capacity Tier extent of a SOBR. I also covered the space-saving and cost-saving efficiencies we have built into the feature, as well as the full suite of recoverability options still available with data sitting in an Object Storage Repository.
This included a live demo of a situation where a local Backup infrastructure had been lost and what the steps would be to leverage the Cloud Tier to bring that data back at a recovery site.
Quick Overview of Offload Job and VBK Dehydration:
Once a Capacity Tier Extent has been configured, the SOBR Offload Job is enabled. This job is responsible for validating what data is marked to move from the Performance Tier to the Capacity Tier based on two conditions.
- The Policy defining the Operational Restore Window
- If the backup data is part of a sealed backup chain
The first condition is all about setting a policy for how many days you want to keep data locally on the SOBR Performance Tier extents, which effectively become your landing zone. This is often dictated by customer requirements and can now be used to design a more efficient approach to local storage, with the understanding that the majority of older data will be tiered to Object Storage.
The second is around the sealing of backup chains, which means they are no longer under transformation. This is explained in this Veeam Help Document, and I also go through it in the CFD#5 session video here.
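Conceptually, the eligibility check performed against each restore point can be sketched like this (illustrative PowerShell only; these are not actual Veeam cmdlets or object properties):

```powershell
# Illustrative only: the two conditions evaluated per restore point.
$windowDays = 14   # operational restore window set in the SOBR policy

foreach ($point in $restorePoints) {
    $outsideWindow = $point.CreationTime -lt (Get-Date).AddDays(-$windowDays)
    $chainIsSealed = $point.Chain.IsSealed   # no longer under transformation

    if ($outsideWindow -and $chainIsSealed) {
        # Eligible: dehydrate the VBK and offload its data blocks
    }
}
```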
Once those conditions are met, the job starts to dehydrate the local backup files and offload the data into Object Storage, leaving a dehydrated shell with only the metadata.
The importance of this process is that, because we leave the shell locally with all the metadata contained, we are still able to perform every Veeam recovery option, including Instant VM Recovery and Restore to Azure or AWS.
Resiliency and Disaster Recovery with Cloud Tier:
Looking at the above image of the offload process, you can see that the metadata is replicated to the Object Storage, as is the Archive Index, which keeps track of which blocks are mapped to which backup file. In fact, for every extent we keep a resilient copy of the Archive Index, meaning that if an extent is lost, there is still a reference.
This is relevant because it gives us disaster recovery options in the case of the loss of a whole backup site or the loss of an extent. During the synchronization, we download the backup files with metadata located in the Object Storage Repository to the extents and rebuild the data locally before making it available in the backup console.
After the synchronization is complete, all the backups located in Object Storage become available as imported jobs and are displayed under Backups > Imported in the inventory pane. But what better way to see this in action than a live demo…below, I have embedded the Cloud Field Day video, which will start at the point where I show the demo. If the auto-start doesn’t kick in correctly, the demo starts at the 31:30 mark.
When Veeam Backup & Replication 9.5 Update 4 went generally available in late January, I posted a What’s in it for Service Providers blog. In that post I briefly outlined all the new features and enhancements in Update 4 as they relate to our Veeam Cloud and Service Providers. As mentioned, each new major feature deserves its own separate post. I’ve covered the majority of the new features so far, and today I’m covering what I believe is Veeam’s most innovative feature released of late…the Cloud Tier.
As a reminder here are the top new features and enhancements in Update 4 for VCSPs.
- Cloud Tier
- Cloud Mobility
- vCloud Director Support for Cloud Connect Replication
- Gateway Pools for Cloud Connect
- Tape as a Service for Cloud Connect Backup
- vSphere RBAC Self Service Portal
- External Repository for N2WS
When I was in charge of the architecture and design of service provider backup platforms, without question the hardest and most challenging aspect of designing the backend storage was how to facilitate storage consumption and growth. The thirst to back up workloads into the cloud continues to grow, and with it comes the growth of that data and the desire to store it for longer. Just yesterday I was talking to a large Veeam Cloud & Service Provider who was experiencing similar challenges managing their Cloud Connect and IaaS backup repositories.
Cloud Tier in Update 4 fundamentally changes the way the initial landing zone for backups is designed. With the ability to offload backup data to cheaper storage, the Cloud Tier, which is part of the Scale-Out Backup Repository, allows for a more streamlined and efficient Performance Tier of backup repository while leveraging scalable Object Storage for the Capacity Tier.
How it Works:
The innovative technology we have built into this feature allows data to be stripped out of Veeam backup files (which are part of a sealed chain) and offloaded as blocks of data to Object Storage, leaving a dehydrated Veeam backup file on the local extents with just the metadata remaining in place. This is done based on a policy set against the Scale-Out Backup Repository that dictates the operational restore window during which local storage is used as the primary landing zone for backup data, and it is processed as a Tiering Job every four hours. The result is a space-saving, smaller footprint on the local storage without sacrificing any of Veeam’s industry-leading recovery operations. This is what truly sets this feature apart and means that even with data residing in the Capacity Tier, you can still perform:
- Instant VM Recoveries
- Entire computer and disk-level restores
- File-level and item-level restores
- Direct Restore to Amazon EC2, Azure and Azure Stack
What this Means for VCSPs:
Put simply, it means that for providers who want to offload backup data to cheaper storage while maintaining a high-performance landing zone for more recent backup data to live in, the Cloud Tier is highly recommended. If there are existing space issues on the local SOBR repositories, implementing Cloud Tier will relieve pressure and, in reality, allow VCSPs to avoid further hardware purchases to expand the storage platforms backing those repositories.
When it comes to Cloud Connect Backup, the fact that Backup Copy Jobs are statistically the most used form of offsite backup sent to VCSPs means the potential for savings is significant. Self-contained GFS backup files are prime candidates for the Cloud Tier offload, and given that they are generally kept for extended periods of time, they also represent a large percentage of the data stored on repositories.
Below you can see an example of a Cloud Connect Backup Copy job from the VCSP side when browsing from Explorer.
With the small example shown above, VCSPs should be starting to understand the potential impact Cloud Tier can have on the way they design and manage their backup repositories. The ability to leverage Amazon S3, Azure Blob and any S3 Compatible Object Storage platform means that VCSPs have a choice in what storage they use for the Capacity Tier. If you are a VCSP and haven’t looked at how Cloud Tier can work for your service offering…what are you waiting for?
Object Storage Repository -> Name given to a repository backed by Amazon S3, S3 Compatible, Azure Blob or IBM Cloud Object Storage
Capacity Tier -> Name given to the extent on a SOBR that uses an Object Storage Repository
Cloud Tier -> Marketing name given to the feature in Update 4
With the release of Update 4 for Veeam Backup & Replication 9.5 we introduced the Cloud Tier, which is an extension of the Scale Out Backup Repository (SOBR). The Cloud Tier allows data to be stripped out of Veeam backup files and offloaded as blocks of data to Object Storage, leaving a dehydrated Veeam backup file on the local extents with just the metadata remaining in place. This is done based on a policy set against the SOBR that dictates the operational restore window during which local storage is used as the primary landing zone for backup data. The result is a space-saving, smaller footprint on the local storage.
Overview of Offload Job:
By default, the Offload Job runs against the data located on the Performance Tier extents of the SOBR every 4 hours. This is a set value that cannot be changed. To offload the backup data to the Capacity Tier, the Offload Job does the following:
- Verifies whether backup chains located on the Performance Tier extents satisfy validation criteria and can be offloaded to object storage.
- Collects verified backup chains from each Performance Tier extent and sends them directly to object storage in the form of data blocks.
- Saves each session’s results to the configuration database so that you can review them upon request.
The job and job details can be viewed from the History menu under System, or the Home menu under Last 24 Hours.
The details of the job show how much data was offloaded to the Capacity Tier per VM residing on the SOBR, with statistics on how much data was processed, read and transferred. Once the job has completed, the local backup files contain only job metadata, with the data residing on the Object Storage.
Forcing The Offload Job:
As mentioned, the Offload Job by default is set to run every 4 hours from the initial configuration of the Capacity Tier extent on the SOBR. The default value of 4 hours cannot be modified; however, if you want to force the job to run, you have two options.
The first option is through the UI: under the Backup Infrastructure menu, under Scale-Out Repositories, CONTROL+Click the SOBR and select the Run Tiering Job Now option. This option is hidden by default and is only shown with the CONTROL+Click.
The second option is to run the following PowerShell command:
PS C:\> Start-VBRCapacityTierSync -Repository SOBR-01
CreationTime : 2/21/2019 2:13:22 PM
EndTime : 2/21/2019 2:13:29 PM
JobId : 1e76d6cb-2192-4b25-bc59-adf725523318
Result : Success
State : Stopped
Id : 1f07476c-8419-42dd-b42a-7e221fa19f14
This triggers the Offload Job to run.
Note that once the Offload Job has been forced, the 4-hour counter is reset to when the job was run…i.e. the next job will run 4 hours from the time the job was forced.
It’s important to understand that running the job on demand doesn’t necessarily mean you will offload data to the Capacity Tier any quicker. The conditions around the operational restore window and sealed backup chains still need to be in place for the job to do its thing. Having the job run six times a day (every 4 hours) is generally going to be more than enough for most instances.
If no data has been offloaded, you will see the following in the job details:
Wrap Up and More Cloud Tier:
To learn more about the Cloud Tier, head to my veeam.com post here and also check out Rhys Hammond’s post here. Also look out for a new Veeam white paper being released in the next month or so, which will dive into the Cloud Tier in more detail. I will publish a few more posts on the Cloud Tier over the next few weeks, looking at some more use cases and features.