Monthly Archives: February 2019

Update 4 for Service Providers – Tenant Connectivity with Cloud Connect Gateway Pools

When Veeam Backup & Replication 9.5 Update 4 went Generally Available a couple of weeks ago I posted a What’s in it for Service Providers blog. In that post I briefly outlined all the new features and enhancements in Update 4 as they relate to our Veeam Cloud and Service Providers. As mentioned, each new major feature deserves its own separate post. I’ve covered Tape as a Service and RBAC Self Service, and today I’m focusing on a much-requested feature…Cloud Connect Gateway Pools.

As a reminder, here are the top new features and enhancements in Update 4 for VCSPs.

Gateway Pools for Cloud Connect

Cloud Connect has become the central mechanism for connectivity and communication between multiple Veeam services. When first launched with Cloud Connect Backup in v8 of Backup & Replication, the Cloud Connect Gateways were used for all secure communications between tenant backup server instances and the Veeam Cloud and Service Provider (VCSP) Cloud Connect backup infrastructure. This expanded to support Cloud Connect Replication in v9, and from there we have added multiple products that rely on communications brokered by Cloud Connect Gateways.

As of today Cloud Connect Gateways facilitate:

  • Cloud Connect Backup
  • Cloud Connect Replication
  • Full and Partial Failovers for Cloud Connect Replication
  • Remote Console Access
  • Veeam Availability Console Tenant and Agent Management
  • Veeam Backup for Microsoft Office 365 Self Service

With regard to acting as the broker for Cloud Connect Backup or Replication, prior to Update 4 the only way in which a VCSP could design and deploy the Gateways was an all-or-nothing approach when it came to configuring the IP address and DNS for the service endpoint. For VCSPs that also provide connectivity such as MPLS to their customers, this meant that to leverage direct connections that might be private, the options were either to use the public address or to set up a whole new Cloud Connect environment for the customer.

Now with Update 4 and Gateway Pools a VCSP can configure one or many Gateway Pools and allocate one or more Cloud Connect Gateways to those pools. From there, tenants can be assigned to Gateway Pools.

Cloud Gateways in a Gateway Pool operate no differently to regular Cloud Gateways. As with previous Cloud Gateways, if the primary gateway is unavailable, the logic built into Veeam Backup & Replication will fail over to another Cloud Gateway in the same pool.

If tenants are not assigned a Cloud Gateway Pool, they can use only gateways that are not part of any cloud gateway pool. The UI displays a warning to this effect when configuring the gateways.

Wrap Up:

The introduction of Cloud Connect Gateway Pools in Update 4 was driven by direct feedback from our VCSPs, who wanted more flexibility in the way Cloud Gateways are deployed and configured for customers. Not only can they be used to separate tenants connecting from public and private networks, but they can also be used for Quality of Service by assigning a Gateway Pool to specific tenants. They can also be used to control access into a VCSP’s Cloud Connect infrastructure where gateways are located in different geographic locations.

For a great overview and design considerations of Cloud Connect Gateway Pools and Gateways themselves, check out Luca’s Cloud Connect Book here.

References:

https://helpcenter.veeam.com/docs/backup/cloud/cloud_gateway_pool.html?ver=95u4

Quick Look: Cloud Tier SOBR Offload Job

With the release of Update 4 for Veeam Backup & Replication 9.5 we introduced the Cloud Tier, an extension of the Scale Out Backup Repository (SOBR). The Cloud Tier allows data to be stripped out of Veeam backup files and offloaded as blocks to Object Storage, leaving a dehydrated Veeam backup file on the local extents with just the metadata remaining in place. This is done based on a policy set against the SOBR that dictates the operational restore window for which local storage is used as the primary landing zone for backup data. The result is a space saving and a smaller footprint on the local storage.

Overview of Offload Job:

By default the offload job runs against the data located on the Performance Tier extents of the SOBR every 4 hours. This is a set value that cannot be changed. To offload the backup data to the Capacity Tier, the Offload job does the following:

  • Verifies whether backup chains located on the Performance Tier extents satisfy validation criteria and can be offloaded to object storage.
  • Collects verified backup chains from each Performance Tier extent and sends them directly to object storage in the form of data blocks.
  • Saves each session’s results to the configuration database so that you can review them upon request.

The job and its details can be viewed from the History menu under System, or from the Home menu under Last 24 Hours.

The details of the job show how much data was offloaded to the Capacity Tier per VM residing on the SOBR, with statistics on how much data was processed, read and transferred. Once the job has completed, the local backup files contain only job metadata, with the data residing on the Object Storage.

Forcing The Offload Job:

As mentioned, the Offload Job by default is set to run every 4 hours from the initial configuration of the Capacity Tier extent on the SOBR. The default value of 4 hours cannot be modified; however, if you want to force the job to run you have two options.

The first option is through the UI: under the Backup Infrastructure menu, under Scale-Out Repositories, CONTROL+Click the SOBR and select the Run Tiering Job Now option. This option is hidden by default and is only shown with the CONTROL+Click.

The second option is to run a PowerShell command.
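
A minimal sketch, assuming a placeholder SOBR name and using the Start-VBROffloadBackupData cmdlet from the Update 4 PowerShell reference:

    # Force the SOBR Offload Job to run immediately.
    # "Cloud Connect SOBR" is a placeholder; substitute your repository name.
    $sobr = Get-VBRBackupRepository -ScaleOut -Name "Cloud Connect SOBR"
    Start-VBROffloadBackupData -Repository $sobr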

This triggers the Offload Job to run.

Note that once the Offload Job has been forced, the 4-hour counter is reset…i.e. the next job will run 4 hours from the time the job was forced.

It’s important to understand that running the job on demand doesn’t necessarily mean you will offload data to the Capacity Tier any quicker. The conditions around the operational restore window and sealed backup chains still need to be in place for the job to do its thing. Having the job run six times a day (every 4 hours) is generally going to be more than enough for most instances.

If no data is eligible to be offloaded, the job details will reflect that.

Wrap Up and More Cloud Tier:

To learn more about the Cloud Tier head to my veeam.com post here, and also check out Rhys Hammond’s post here. Also look out for a new Veeam White Paper being released in the next month or so, which will dive deeper into the Cloud Tier. I will publish a few more posts on the Cloud Tier over the next few weeks as well, looking at more use cases and features.

References:

https://helpcenter.veeam.com/docs/backup/vsphere/capacity_tier.html?ver=95u4


Update 4 for Service Providers – Self Service Backup through RBAC for vSphere

When Veeam Backup & Replication 9.5 Update 4 went Generally Available a couple of weeks ago I posted a What’s in it for Service Providers blog. In that post I briefly outlined all the new features and enhancements in Update 4 as they relate to our Veeam Cloud and Service Providers. As mentioned, each new major feature deserves its own separate post. I started last week with a look at Tape as a Service, and today I’m looking at another underrated feature…the vSphere RBAC Self Service Portal.

As a reminder, here are the top new features and enhancements in Update 4 for VCSPs.

vSphere RBAC Self Service Portal:

When Veeam Backup & Replication 9.5 was released, one of the top new features was the vCloud Director Self Service Portal. This was aimed at our Veeam Cloud & Service Providers that leverage vCloud Director as their Cloud Management Platform to offer self service capabilities. The portal is part of Veeam Enterprise Manager, uses vCloud Director Organizations and leverages vCloud Director authentication.

For Update 4, we used this feature as the base for the vSphere RBAC Self Service Portal. This has been primarily marketed as a non service provider feature that enterprises can use to drive self service backup internally.

My fellow Product Strategy Technologist, Melissa Wright (@vmiss), has written a great overview of the vSphere RBAC Self Service Portal here. She goes through the setup and configuration, looks at how to configure users and permissions, and shows the power of the feature as it pertains to enterprise customers.

RBAC for vSphere IaaS:

The great thing about this new portal is that it can be used either in conjunction with the vCloud Director Self Service Portal or standalone where a service provider is not running vCloud Director. That is where this portal comes into play…while there are a number of VCSPs that do run vCloud Director, the large majority of service providers and managed service providers do not. If they are running IaaS off native vSphere, the portal can be used to offer self service backup and recovery to their tenants.

The self service permissions can be retrofitted to existing vCenter permissions, or you can start fresh using vSphere Tags. Personally, I believe vSphere Tags are the best way to configure the multi-tenancy aspect of the setup. During configuration, tags are matched to users, which dictates what tenants will be able to see and select when they log in.
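
As an illustration only, here is a hedged PowerCLI sketch of creating a per-tenant tag and assigning it to that tenant’s VMs (the category, tag and VM names are all hypothetical):

    # Create a single-cardinality tag category for tenancy, plus one tenant tag.
    New-TagCategory -Name "TenantID" -Cardinality Single -EntityType VirtualMachine
    New-Tag -Name "Tenant-A" -Category "TenantID"

    # Tag the tenant's VMs so the portal scope can be matched to the user.
    Get-VM -Name "tenant-a-*" | New-TagAssignment -Tag (Get-Tag -Name "Tenant-A")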

Tenant Functions:

Tenants get access to the self service web portal, which the VCSP makes available externally. Depending on the user roles and permissions that have been configured, they can select virtual machines and manage backup jobs, as well as restore VMs, files and application items within the bounds of their permissions. Tenants can also manage retention, schedule and notification settings, as well as guest OS processing options.

To simplify job management for the tenants, advanced job parameters (like backup mode and repository settings) are automatically populated from the job templates if desired.

Wrap Up:

Once again, the vSphere RBAC Self Service Portal is one of the sleeper hits of Update 4 for Veeam Backup & Replication 9.5 and should be considered by all VCSPs looking to offer a level of self service capability to their tenants. The way this has been implemented on the back of Enterprise Manager, with a one-to-many portal, makes it the best self service portal for IaaS and/or vCloud Director…and because no specialised appliance is needed per tenant, it is a massive upside in how Veeam differentiates itself in this space.

References:

https://vmiss.net/2019/02/14/veeam-enterprise-manager-self-service-vsphere/amp/

https://helpcenter.veeam.com/docs/backup/em/em_working_with_vsphere_portal.html?ver=95u4

Update 4 for Service Providers – Tape as a Service

When Veeam Backup & Replication 9.5 Update 4 went Generally Available a couple of weeks ago I posted a What’s in it for Service Providers blog. In that post I briefly outlined all the new features and enhancements in Update 4 that pertain to our Veeam Cloud and Service Providers. As mentioned, each new major feature deserves its own separate post, and today I’m kicking off the series with what I feel was probably the least talked about new feature in Update 4…Tape as a Service for Cloud Connect Backup.

As a reminder, here are the top new features and enhancements in Update 4 for VCSPs.

Tape as a Service for Cloud Connect Backup:

When we introduced Cloud Connect Backup in version 8 of Backup & Replication, we offered VCSPs the ability to provide a secure, remote offsite repository for their tenants. When thinking about air-gapped backups, though…while protected at the VCSP end, ultimate control over what is backed up to the Cloud Repository sits in the hands of the tenant. From the tenant’s server, the stored backups could be manipulated via policy, or a malicious user could gain access to the server and delete the offsite copies.

In Update 3 of Backup & Replication 9.5 we added Insider Protection to Cloud Connect Backup, which allowed the VCSP to put a policy on the tenant’s Cloud Repository to protect backups from a malicious attack. With this option enabled, when a backup or a specific restore point in the backup chain is deleted or aged out from the cloud repository, the actual backup files are not deleted immediately; instead, they are moved to a _RecycleBin folder on the repositories.

In Update 4 we have taken that a step further with the Tenant to Tape feature, adding true air-gapped backup options that VCSPs can build longer term retention services around. This allows a VCSP to offer an additional level of data protection for their tenants: the tenant sends a copy of the backup data to their cloud repository, and the VCSP then configures backup to tape to send another copy to the tape media. If data in the cloud repository becomes unavailable, the VCSP can initiate a restore from tape.

VCSPs can also offer a tape-out service to help their tenants achieve compliance and meet internal policies without maintaining their own tape infrastructure. Tapes can be stored by the service provider or shipped back to the tenant.

To take advantage of this new Update 4 feature, VCSPs will need to configure Tape Infrastructure on the Cloud Connect server. What’s great about Veeam is that we have the option to use traditional tape infrastructure or take advantage of Virtual Tape Libraries (VTLs), which can be backed by Object Storage such as Amazon S3. I am not going to walk through that process in this post; there are a number of blogs and White Papers available that guide you through the setup of an Amazon Storage Gateway for use as a VTL.

Once the Tape Infrastructure is in place, as a VCSP with a Cloud Connect license, upgrading to Update 4 will reveal a new option under Tape Infrastructure called Tenant to Tape.

A tenant backup to tape job is a variant of the backup to tape job targeted at a GFS Media Pool that is available to Veeam customers with regular licensing. What’s interesting about this feature is that a number of options allow flexibility in how the jobs are created, which also changes the use case for the feature depending on which option is chosen.

Choosing Backup Jobs allows VCSPs to add any jobs that may be registered on the Cloud Connect server…though in reality there shouldn’t be any configured, due to licensing constraints. The other two options provide the different use cases.

Backup Repositories:

This allows the VCSP to back up to tape one or more cloud repositories that can contain one or multiple tenants. It lets the VCSP back up the Cloud Connect repository in whole to an offsite location for longer term retention.

The ability to archive tenant Cloud Connect Backups to tape can help VCSPs protect their own infrastructure against disasters that may result in loss of tenant data. It can also underpin another revenue generating service. As an example, there could be two service offerings for Cloud Connect Backup…one with a basic SLA where only one copy of the backup data is stored…and another with an advanced SLA where data is saved in two locations…the Cloud Connect Repository and the tape media.

Tenants:

This option offers a lot more granularity and gives the VCSP the ability to offer an additional level of protection at a per tenant level. In fact, you can also drill down to the tenant repository level and select individual repositories if tenants have more than one configured.

Again, this can be done per tenant, or there can be one master job for all tenants.

It’s important to understand that all tasks within the tenant backup to tape feature are performed by the VCSP. Unless the VCSP has built a portal that surfaces information about the jobs, the tenant is generally unaware of the tape infrastructure and can’t view or manage backup to tape jobs or perform operations with backups created by them. There is scope for VCSPs to integrate such jobs and actions into their automation portals for self service.

Restores:

VCSPs can restore tenant data from tape for one or more tenants at the same time. The restore can go to the original location, to a new location, or be exported as backup files on local disk.

Wrap Up:

Tenant to Tape, or Tape as a Service for Cloud Connect Backup, was a feature that didn’t get much airplay in the lead-up to the Update 4 launch; however, it gives VCSPs more options to protect tenant data and truly offer an air-gapped solution to better protect that data.

References:

https://www.veeam.com/wp-using-aws-vtl-gateway-deployment-guide.html

https://aws.amazon.com/about-aws/whats-new/2016/08/backup-and-archive-to-aws-storage-gateway-vtl-with-veeam-backup-and-replication-v9/

Configuring Amazon S3 Access from VMware Cloud on AWS through an S3 Endpoint

When looking at how to configure networking for interactions between a VMware Cloud on AWS SDDC and an Amazon VPC there is a little bit to grasp in terms of what needs to be done to achieve traffic flow between the SDDC and the rest of the world.

As an example, if you want to connect to S3, the default configuration is to go through the Amazon ENI (Elastic Network Interface), which means that unless it is configured correctly, connectivity to Amazon S3 will fail. Brian Gaff has a really good series of posts on Networking and Security Groups when working with VMware Cloud on AWS, which are worth a read to get a deeper understanding of VMC to AWS networking.

There is a way to change this behaviour to make connectivity to Amazon S3 connect via the SDDCs Internet Gateway. This is done through the VMware Cloud Portal by going to the Networking section of the relevant SDDC.

Doing this, while easy enough, means that you lose a lot of the benefits that passing traffic through the ENI provides: a high-bandwidth, low latency connection between the VPC and the SDDC that also provides free egress. In the case of S3 and the Veeam Cloud Tier, it means more optimal connectivity between a Veeam Backup & Replication instance hosted in the SDDC and Amazon S3.

To allow communication between the SDDC and Amazon S3 over the ENI, the following needs to be actioned.

Create Endpoint:

The first step is to go into the AWS Console, go to the VPC that’s connected to the VMC service and create a new Endpoint for S3, making sure you select the correct Route Table.
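
For those who prefer the CLI, the same endpoint can be sketched with the AWS CLI (the VPC ID, route table ID and region are placeholders):

    # Create a gateway-type VPC endpoint for S3 on the VMC-connected VPC.
    aws ec2 create-vpc-endpoint \
        --vpc-id vpc-0123456789abcdef0 \
        --service-name com.amazonaws.us-west-2.s3 \
        --route-table-ids rtb-0123456789abcdef0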

Configure Security Group:

Next, configure the Security Group associated with your VPC to allow traffic to the logical network or networks. It’s a basic HTTPS inbound rule where the source is the SDDC network or networks you want access from.
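
Again as a sketch, the equivalent rule via the AWS CLI (the security group ID and CIDR are placeholders for your SDDC logical network):

    # Allow inbound HTTPS (TCP 443) from the SDDC logical network.
    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp \
        --port 443 \
        --cidr 192.168.1.0/24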

Create Compute Gateway Firewall Rule:

The final step is to configure a firewall rule on the SDDC Compute Gateway to allow HTTPS traffic to the Amazon VPC from the network or networks you want access to Amazon S3 from.

That’s pretty much it! After that, you should be able to access Amazon S3 over the ENI and get all the benefits that delivers.

References:

https://docs.vmware.com/en/VMware-Cloud-on-AWS/services/com.vmware.vmc-aws-operations/GUID-B501FA3C-EAF9-4005-AC72-155C3F592281.html

How to Copy Amazon S3 Buckets with AWS CLI

I am doing some work on validated restore scenarios using the new Veeam Cloud Tier, which is backed by an Object Storage Repository pointing at an Amazon S3 Bucket. So that I am not messing with the live data, I wanted a way to copy and access the objects from another bucket or folder. There is no option at the moment to achieve this via the AWS Console; however, it can be done via the AWS CLI.

The first step was to ensure I had the AWS CLI installed on my MBP and that it was at the latest version:
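
Something along these lines does the trick (the pip-based upgrade follows the macOS install guide referenced below):

    # Check the installed version, then upgrade via pip if required.
    aws --version
    pip3 install awscli --upgrade --user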

For the first part of the copy process, I cheated and created a new Bucket from the AWS Console that was based on the one I wanted to copy.

The next step is to make sure the AWS CLI is configured with the correct AWS Access and Secret Keys. Once done, the command to copy/sync buckets is a simple one.
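
A sketch with placeholder bucket names:

    # Copy/sync all objects from the source bucket to the destination bucket.
    aws s3 sync s3://source-bucket s3://destination-bucket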

Obviously the time to complete the operation will depend on the number of objects in the bucket and whether it’s cross-region or local. It took about 4 hours to copy ~50GB of data from US-EAST-2 to US-WEST-2 at about 4MB/s. By default, progress is shown on the screen.

Once the first pass was complete I ran the same command again, which this time looks for differences between the source and destination and only syncs the differences. You can run the command below to view the Total Objects and Total Size of both buckets for comparison.
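
Again with a placeholder bucket name; the --summarize flag appends the totals to the end of the listing:

    # List the bucket recursively and print Total Objects / Total Size at the end.
    aws s3 ls s3://destination-bucket --recursive --summarize --human-readable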

That is it! A pretty simple process. I’ll blog about the actual reason behind the Veeam Cloud Tier requirement and put this into action at a later date!

References:

https://docs.aws.amazon.com/cli/latest/userguide/install-macos.html

https://aws.amazon.com/premiumsupport/knowledge-center/move-objects-s3-bucket

More Than Meets the Eye… Veeam Backup Performance

Recently I was sent a link to a video in which an end user compared Veeam to a competitor’s offering, covering backup performance, restore capabilities and UI. It mainly focused on comparing incremental backup jobs and their completion times, and showed the Veeam job taking a significantly longer time to complete for the same dataset. The comparison was chalk and cheese and didn’t paint Veeam in a very good light.

Now, without knowing 100% the backend configuration the user was testing against, or the configuration of the Veeam components, storage platforms and backup jobs versus the competitor’s setup…the discrepancy between the job completion times was too great and something had to be amiss. This was not an apples to apples comparison.

TL;DR – Starting from the default configuration settings and server setup, I was able to cut the time to complete an incremental backup job from 24 minutes to under 4 minutes by scaling out Veeam infrastructure components and tweaking transport mode options to suit the dataset. The lesson: don’t take inferred performance at face value; a lot of factors go into backup speed.

Before I continue, it’s important for me to state that I have seen Veeam perform exceptionally well under a number of different scenarios, and I know from my own experience in previous roles at large service providers that it can handle 1000s of VMs and scale up to handle larger environments. That said, like any environment, you need to understand how to properly scope and size backup components to suit…and that includes more than just the backup server and Veeam components…storage obviously plays a huge role in backup performance, as does the design of the virtualisation platform and the networking.

I haven’t set out in this post to put together a guide on how to scale Veeam…rather, I have focused on trying to debunk the differential in job completion time I saw in the video. I went into my lab and started to think about how scaling Veeam components and choosing different options for backups and proxies can hugely impact the time it takes for backup jobs to complete. For the testing I used a Veeam Backup & Replication server that I had deployed with the Update 4 release, with active jobs that had been in operation for more than a month.

The Veeam Backup & Replication server is a VMware Virtual Machine running on a modest 2 vCPUs and 8GB of RAM. Initially this ran as an all-in-one Backup Server and Proxy setup. I have a SOBR consisting of two ReFS-formatted local VMDK extents (the underlying storage is vSAN) and a Capacity Tier extent going to Amazon S3. The backup job consisted of nine VMs with a footprint of about 162GB. A small dataset, but one based on real-world workloads. The job was running Forward Incremental, keeping 14 restore points, running every 4 hours with a Synthetic Full every 24 hours (the initial purpose was to demo Cloud Tier), and on average the incrementals were taking between 23 and 25 minutes to complete.

The time to complete the incremental job was not an issue for me in the lab, but it provided a good opportunity to test out what would happen if I looked to scale out the Veeam components and tweak the default configuration settings.

Adding Proxies

As a first step I deployed three virtual proxies (2 vCPUs and 4GB RAM each) into the environment and configured the job to use them in hot-add mode. Right away the job time decreased by ~50% to 12 minutes. Basically, more proxies means more disks can be processed in parallel in hot-add mode, so it’s logical that the speed of the backup would increase.
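
For reference, adding a proxy can also be scripted. A minimal Veeam PowerShell sketch, assuming the server has already been added to Backup & Replication under a hypothetical hostname:

    # Register a VMware backup proxy in hot-add (virtual appliance) transport mode.
    $server = Get-VBRServer -Name "proxy01.lab.local"
    Add-VBRViProxy -Server $server -TransportMode HotAdd -MaxTasks 2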

Adding More Proxies

As a second step I deployed three more proxies into the environment and configured the job to use all six in hot-add mode. This didn’t result in a significantly faster time than with three proxies, but again, this will vary depending on the number of VMs in a job and the size of their disks. Again, Veeam offers the flexibility to scale and grow with the environment; this is not a one-size-fits-all approach, and you are not locked into a particular appliance size that may max out and require significant additional spend.

Change Transport Mode

Next I changed the job back to three proxies, but this time I forced the proxies to use network mode. To read more about transport modes, head here.
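
The switch can be made per proxy in the UI, or sketched in PowerShell along these lines (Nbd being the network transport mode value):

    # Force all existing VMware proxies over to network (NBD) transport mode.
    foreach ($proxy in Get-VBRViProxy) {
        Set-VBRViProxy -Proxy $proxy -TransportMode Nbd
    }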

This resulted in a sub-4-minute job completion reading a similar incremental dataset to the previous runs. A ~20 minute difference after just a few tweaks of the configuration!

Removing Surplus Proxies and Balancing Things Out

For the example above I introduced more proxies; however, the right balance of proxies and network mode was the most optimal configuration for this particular job in terms of lowering the job completion window. In fact, in my last test I was able to get the job to complete consistently around the 5-minute mark using just one proxy with network mode.

Conclusion:

So with that, you can see that by tweaking some settings and scaling out Veeam components I was able to bring the job completion time down by more than 20 minutes. Veeam offers the flexibility to scale and grow with any environment. This is not a one-size-fits-all approach, and you are not locked into a particular appliance size that requires significant additional spend to scale out while also restricting backup data portability. Again, this is just a quick example of what can be done with the flexibility of the Veeam platform, and what you see as a default out-of-the-box experience (or a poorly configured/problematic environment) isn’t what should be expected for all use cases. Mileage will vary…but don’t let first or misleading impressions sway you…there is always more than meets the eye!

Sources:

https://bp.veeam.expert/