Monthly Archives: June 2017

Connecting to Home or Office Networks with Veeam Powered Network

A few weeks ago I wrote an article on how Veeam Powered Network can make accessing your homelab easy with its straightforward approach to creating and configuring site-to-site and point-to-site VPN connections. Since then I've done a couple of webinars on Veeam PN, and I was asked a number of times whether Veeam PN can be set up with just a single appliance rather than a full hub-and-site deployment.

To refresh the use case that I went through in my first post, I wanted to access my homelab/office machines while on the road.


Using the Tunnelblick OpenVPN client on my MBP, I am able to create a point-to-site connection to the Veeam PN Hub, which is in turn connected via site-to-site VPNs to each of the subnets I want to connect into.

Single Veeam PN Appliance Deployment Model

After fielding a couple of similar questions during the webinars it became apparent that the first use case I described was probably more complicated than it needed to be for the average home office user…that is, a simple point-to-site VPN that allows remote access into the network. This setup can also be used to give remote users access to a simple (flat) company network.

In this scenario I want to have access via the OpenVPN endpoint client to my internal network of 192.168.1.0/24 via a single Veeam PN appliance that’s been deployed in my home office network. To go over the Veeam PN deployment process read my first post and also visit this VeeamKB that describes where to get the OVA and how to deploy and configure the appliance for first use.

Components

  • Veeam PN Hub Appliance x 1
  • OpenVPN Client

Networking Requirements

  • Veeam PN Hub Appliance – Incoming Ports UDP 1194, 6179 and TCP 443
  • OpenVPN Client – Outgoing access to at least UDP 6179 (a quick reachability check is sketched below)
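Before importing the OVPN file it can be handy to sanity-check that the hub is reachable from wherever you are. Below is a minimal, unofficial sketch; the hub address is a placeholder for your own public IP or DNS name, and because the OpenVPN ports are UDP (which a simple connect test can't verify) it only checks the TCP 443 management port.

```python
# Quick, unofficial reachability check for the Veeam PN hub's web UI port.
import socket

HUB_ADDRESS = "hub.example.com"  # placeholder - use your hub's public IP or DNS name

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    status = "reachable" if tcp_port_open(HUB_ADDRESS, 443) else "NOT reachable"
    print(f"Veeam PN web UI (TCP 443) on {HUB_ADDRESS}: {status}")
```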

In my setup the Veeam PN Hub appliance has been deployed into VMware Workstation and has picked up a DHCP address. Unlike the Azure Marketplace deployment, you need to go through an initial configuration wizard to get the Hub appliance ready to accept connections. Go to the Veeam PN URL, enter the default username and password and click through to the Initial Configuration wizard.

The next step is to configure the SSL certificate, which is used for a number of services but, importantly, facilitates authentication between the Hub, sites and endpoints.

After that you configure the site-to-site and point-to-site VPN settings, which are used in the OVPN configuration files generated later on.

Once that's done you are taken to the Veeam PN dashboard. To make the 192.168.1.0/24 network accessible remotely you need to register it as a site from the Clients menu, as shown below. This is a bit of a workaround to ensure the correct static routes are included in the endpoint OVPN configuration files; note that the site will never show as connected in the client status window.

To be able to connect into my home office while on the road, the final step is to register a standalone client. Because Veeam PN leverages OpenVPN, what we are producing here is an OVPN configuration file with all the details required to create the point-to-site connection. There is no requirement to enter a username and password, as Veeam PN authenticates using the SSL certificate. As a recap from my previous post, on my MBP I'm using the Tunnelblick OpenVPN client, which I've found to be excellent, though being OpenVPN there are plenty of other clients for pretty much any platform you might be running. Once I've imported the OVPN configuration file into the client I can authenticate against the Hub appliance endpoint, and the home office routing is injected into my network settings.

You can see above that the 192.168.1.0 static route has been added and set to use the tunnel interface's default gateway, which is on the Hub appliance running in my home office. This means that from my MBP I can now get to any device on that subnet no matter where I am in the world…in this case I can RDP to my Windows workstation and access other resources on 192.168.1.0/24.
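If you prefer the command line to screenshots, the sketch below is a quick way to confirm the injected route from the Mac side. It assumes macOS (where Tunnelblick runs) and simply wraps `netstat -rn`, looking for the home office subnet on a utun interface.

```python
# Check that the route pushed by the Veeam PN OVPN profile landed on a tunnel interface.
import subprocess

HOME_SUBNET = "192.168.1"  # the home office network pushed by the Veeam PN hub

def injected_route_lines(subnet_prefix):
    """Return routing table lines that reference the given subnet prefix."""
    output = subprocess.run(["netstat", "-rn"], capture_output=True, text=True).stdout
    return [line for line in output.splitlines() if line.startswith(subnet_prefix)]

if __name__ == "__main__":
    routes = injected_route_lines(HOME_SUBNET)
    if any("utun" in line for line in routes):
        print("Home office route is up and using the OpenVPN tunnel interface:")
    else:
        print("No tunnel route found for the home office subnet - check the VPN connection.")
    print("\n".join(routes))
```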

Conclusion:

Summarizing the steps taken to set up and configure remote access into my home office using Veeam PN:

  • Deploy and configure Veeam PN Hub Appliance
  • Go through initial Hub Network Wizard
  • Register local network as a Site
  • Register Endpoints
  • Set up the Endpoint and connect to the Hub Appliance

Those five steps took me less than 10 minutes, and that included the OVA deployment. The simplicity of the solution is what makes it so useful for home users wanting a quick and easy way to access their systems…but also, as mentioned, for configuring external access to simple office networks!

Again, Veeam PN is free and is deployable from the Azure Marketplace to help extend availability for Microsoft Azure…or downloadable in OVA format directly from the veeam.com site.

 

VMworld 2017 – Session Breakdown and Analysis

Everything to do with VMworld this year feels earlier than in previous years. The call for papers opened in February, with session voting happening around the end of March. A couple of weeks ago presenters were notified whether their session was accepted or rejected, and the content catalog for the US event went live last week! At the moment there are 736 sessions listed, which will grow when the #vBrownBag Tech Talks hosted by the VMTN Community get added.

As I do every year, I like to filter through the content catalog and work out which technologies are getting airplay at the event. What first struck me as interesting were the track names:

Do you see a common thread? They obviously centre around the “digital transformation” theme that we have been fed at every major conference for the last four to five years. I don’t mind it so much, but I know it’s becoming a bit of an industry joke when we hear the same messaging around transformation, digital workspace and modernization.

Shown above are all the products and topics listed in the content catalog. Previously, when the public voting took place, I did some analysis on the number of submitted sessions matching the filters shown below.

  • vCD 32
  • vCloud 305
  • vCloud Director 64
  • NSX 426
  • NSX-T 116
  • vSAN 223
  • AWS 51
  • Containers 85
  • Devops 69
  • Automation 223

Using those same filters, below are the numbers from what made the cut and are in the content catalog for 2017.

What's interesting is comparing the submitted sessions against what was picked up for the content catalog: if you want a better than even chance of having your session accepted, submit around NSX, NSX-T, vSAN, AWS and Containers. Working with these numbers, roughly 60% of the submitted vSAN and Containers sessions were approved, and in the case of AWS the number of approved sessions was actually higher than the number that was submitted!
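For what it's worth, the percentages are simple arithmetic against the voting-time filter counts listed above. Here's a quick sketch; the accepted counts are placeholders (the real numbers are in the catalog chart), so drop in the actual figures to reproduce the calculation.

```python
# Submitted counts are the voting-time filter hits listed earlier in the post.
submitted = {"NSX": 426, "NSX-T": 116, "vSAN": 223, "AWS": 51, "Containers": 85}

# Placeholder accepted counts - replace with the numbers from the content catalog.
accepted = {"NSX": 0, "NSX-T": 0, "vSAN": 0, "AWS": 0, "Containers": 0}

for topic, count in submitted.items():
    rate = accepted[topic] / count * 100
    print(f"{topic}: {accepted[topic]}/{count} accepted ({rate:.0f}%)")
```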

Even though fewer of the vCD-related sessions made it through, the numbers are still well up from the dark days of vCD around the 2013 and 2014 VMworlds. For anyone working on cloud technologies this year promises to be a bumper year for content, so if you haven't registered for VMworld 2017 yet…what are you waiting for!

Register here:

Top vBlog 2017 – Last week to Vote!

While I had resisted the temptation to put out a blog post on this year's Top vBlog voting, with the voting coming to an end I thought it was worth giving it a shout, just in case some of you hadn't had the chance to vote or didn't know about the Top vBlog vLaunchPad list created and maintained by Eric Siebert of vSphere-land.

This year's voting has a slightly different format, with the total vote being determined by the following weighted components (a quick worked example follows the list):

  • 60% – public voting – general voting – anyone can vote – votes are tallied and weighted for points based on voting rankings as done in past years
  • 20% – private judges scoring – chosen judges who will grade a select group of blogs based on several factors, combined rankings will equal points
  • 10% – number of posts in a year – how much effort a blogger has put into writing posts over the course of a year, based on Andreas' hard work adding this up each year (aggregators excluded)
  • 10% – Google PageSpeed score – how well a blogger has done to build and optimize their site as scored by Google’s PageSpeed tools
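To make the weighting concrete, here's a toy sketch of how a composite score could be put together. Normalising each component to a 0-100 scale is my assumption for illustration, not Eric's published method.

```python
# Published weightings for Top vBlog 2017.
WEIGHTS = {"public_vote": 0.60, "judges": 0.20, "post_count": 0.10, "pagespeed": 0.10}

def total_score(components):
    """Combine component scores (assumed normalised to 0-100) using the weightings."""
    return sum(WEIGHTS[name] * score for name, score in components.items())

# Example: a blog that polls strongly but runs a slow site.
example = {"public_vote": 85, "judges": 70, "post_count": 60, "pagespeed": 40}
print(f"Weighted total: {total_score(example):.1f} / 100")
```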

As Eric mentions, the vBlog voting should be based on blog content: longevity, length, frequency and quality of posts. There is an amazing amount of great content created daily by this community and, all things aside, this Top vBlog vote goes some way to recognizing the hard work most bloggers put into creating content for the community. Special mention to Duncan Epping and Frank Denneman for pulling out of the voting this year to give others a shot at moving up the ranks…it's a classy move!

Good luck to all those who are listed, and for those who haven't voted yet, click on the link below to cast your vote. If you feel inclined and enjoy my content around vCloud Director, Availability, NSX, vSAN and cloud and hosting in general, it would be an honor to have you consider anthonyspiteri.net in your Top 12 and also in the Independent Blogger category.

http://topvblog2017.questionpro.com

Thanks again to Eric Siebert.

References:

http://vsphere-land.com/news/voting-now-open-for-top-vblog-2017.html

http://vsphere-land.com/news/coming-soon-top-vblog-2017-with-a-new-scoring-method.html

VMware vSphere 6.5 Host Resources Deep Dive – A Must Have!

Just after I joined Zettagrid in June of 2013 I decided to load up the vSphere 5.1 Clustering Deepdive by Duncan Epping and Frank Denneman on my iPad to read on my train journey to and from work. Reading that book allowed me to gain a deeper understanding of vSphere through the in-depth content that Duncan and Frank had produced. Any VMware administrator worth their salt would be familiar with the book (or the ones that preceded it) and it's still a brilliant read.

Fast forward a few versions of vSphere and we finally have a follow-up:

VMware vSphere 6.5 Host Resources Deep Dive

This time around Frank has been joined by Niels Hagoort, and together they have produced another must-have virtualization book…though it goes far beyond VMware virtualization. I was lucky enough to review a couple of chapters and I can say without question that this book will make your brain hurt…but in a good way. It's the deepest of deep dives: it goes beyond the previous books' best practices and into a lot of the low-level compute, storage and networking fundamentals that many of us have either forgotten, never learnt, or never bothered to learn.

This book explains the concepts and mechanisms behind the physical resource components and the VMkernel resource schedulers, which enables you to:

  • Optimize your workload for current and future Non-Uniform Memory Access (NUMA) systems.
  • Discover how vSphere Balanced Power Management takes advantage of the CPU Turbo Boost functionality, and why High Performance does not.
  • How the 3-DIMMs per Channel configuration results in a 10-20% performance drop.
  • How TLB works and why it is bad to disable large pages in virtualized environments.
  • Why 3D XPoint is perfect for the vSAN caching tier.
  • What queues are and where they live inside the end-to-end storage data paths.
  • Tune VMkernel components to optimize performance for VXLAN network traffic and NFV environments.
  • Why Intel’s Data Plane Development Kit significantly boosts packet processing performance.

If any of you have read Frank's NUMA Deep Dive blog series you will start to get an appreciation of the level of technical detail this book covers. That said, it is written in a way that makes the information digestible, though some parts may need to be read twice over. Well done to Frank and Niels on getting this book out and again, if you are working in and around anything to do with computers this is a must-read, so do yourself a favour and grab a copy.

The Amazon locales where the book can currently be purchased are listed below:

Amazon US: http://www.amazon.com/dp/1540873064
Amazon France: https://www.amazon.fr/dp/1540873064
Amazon Germany: https://www.amazon.de/dp/1540873064
Amazon India: http://www.amazon.in/dp/1540873064
Amazon Japan: https://www.amazon.co.jp/dp/1540873064
Amazon Mexico: https://www.amazon.com.mx/dp/1540873064
Amazon Spain: https://www.amazon.es/dp/1540873064
Amazon UK: https://www.amazon.co.uk/dp/1540873064

CPU Overallocation and Poor Network Performance in vCD – Beware of Resource Pools

For the longest time VMware administrators have been told that resource pools are not folders and that they should only be used when the impact of applying the resource settings is fully understood. From my point of view, I've been able to utilize resource pools for VM management without too much hassle since I first started working on VMware managed service platforms, and from a managed services point of view they are a lot easier to use as organizational "folders" than vSphere folders themselves. For me, as long as the CPU and Memory Resources Unlimited checkbox was ticked, nothing bad happened.

Working with vCloud Director, however, resource pools are heavily utilized as the control mechanism for resource allocation, sharing and management. It's still a topic that can cause confusion when trying to wrap one's head around the different allocation models vCD offers, and I still reference blog posts from Duncan Epping and Frank Denneman written nearly seven years ago to refresh my memory every now and then.

Before moving on to an example of how overallocation or client undersizing in vCloud Director can cause serious performance issues, it's worth having a read of this post by Frank, which goes through, in typical Frank detail, what resource management looks like in vCloud Director.

Proper Resource management is very complicated in a Virtual Infrastructure or vCloud environment. Each allocation model uses a different combination of resource allocation settings on both Resource Pool and Virtual Machine level.

Undersized vDCs Causing Network Throughput Issues:

The Allocation Pool model was the one I worked with the most, and it used to throw up a few client-related issues when I was at Zettagrid. When using the Allocation Pool method, which is the default model, you specify the amount of resources for your Org vDC and also how much of those resources is guaranteed. The guarantee means that a reservation is set, and the amount of guaranteed resources is taken from the Provider vDC. The total amount of resources specified is the upper boundary, which is also the resource pool limit.
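To make that mapping concrete, here is a small illustrative sketch of how an Allocation Pool Org vDC's CPU settings translate onto the backing resource pool, based purely on the description above (the numbers are made up):

```python
def allocation_pool_cpu(allocation_mhz, guarantee_pct):
    """Map an Allocation Pool Org vDC's CPU allocation onto resource pool settings."""
    return {
        # The guaranteed portion becomes the reservation, carved out of the Provider vDC.
        "reservation_mhz": int(allocation_mhz * guarantee_pct / 100),
        # The full allocation is the upper boundary, i.e. the resource pool limit.
        "limit_mhz": allocation_mhz,
    }

# e.g. a 10 GHz CPU allocation with a 20% guarantee
print(allocation_pool_cpu(10000, 20))  # {'reservation_mhz': 2000, 'limit_mhz': 10000}
```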

Because tenants were able to purchase Virtual Datacenters of any size, there were a number of occasions where tenants undersized their resources. Specifically, one tenant came to us complaining about poor network performance during a copy operation between VMs in their vDC. At first the operations team thought that it was the network causing issues…we were also running NSX and these VMs were on a VXLAN segment, so fingers were being pointed there as well.

Eventually, after a bit of troubleshooting, we were able to replicate the problem…it was related to the resources the tenant had purchased, or lack thereof. In a nutshell, because the Allocation Pool model allows the overprovisioning of resources, not enough vCPU was purchased. The vDC resource pool had 1000 MHz of vCPU with a 0% reservation, but the tenant had created four dual-vCPU VMs. When the network copy job started it consumed CPU, which in turn exhausted the vCD CPU allocation.
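Some rough numbers make it obvious why this fell over. The purchased allocation and VM sizing below come straight from the scenario above; the per-core clock speed is an assumption for illustration only.

```python
HOST_CORE_MHZ = 2600          # assumed physical core speed (illustrative)
POOL_LIMIT_MHZ = 1000         # what the tenant purchased, with a 0% reservation
VMS, VCPUS_PER_VM = 4, 2      # four dual-vCPU VMs

# Worst-case CPU demand if every vCPU wants to run flat out.
potential_demand_mhz = VMS * VCPUS_PER_VM * HOST_CORE_MHZ
print(f"Potential demand: {potential_demand_mhz} MHz vs pool limit: {POOL_LIMIT_MHZ} MHz")
print(f"Demand exceeds the limit by roughly {potential_demand_mhz / POOL_LIMIT_MHZ:.0f}x,")
print("so the scheduler holds vCPUs back (ready time) and network throughput suffers.")
```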

What happened next can be seen in the video below…

With the resource pool constrained, CPU ready time is introduced to throttle the VMs, which in turn impacts network throughput. As shown in the video, when the resource pool has the Unlimited checkbox ticked the ready time goes away and network throughput returns to normal.

Conclusion:

Again, it's worth checking out the impact on network throughput in the video, as it clearly shows what happens when tenants underprovision or overallocate their Virtual Datacenters in vCloud Director. Outside of vCloud Director it's also handy to understand the impact that reservations and limits on resource pools have on VM compute and networking performance.

It’s not always the network!

References:

http://www.vmware.com/resources/techresources/10325

http://frankdenneman.nl/2010/09/24/provider-vdc-cluster-or-resource-pool/

http://www.yellow-bricks.com/2012/02/28/resource-pool-shares-dont-make-sense-with-vcloud-director/

https://kb.vmware.com/kb/2006684

Allocation Pool Organization vDC Changes in vCloud Director 5.1

VMworld 2017 – #vGolf Las Vegas

[Please head to this page for updated information]

#vGolf is back! Bigger and better than the inaugural #vGolf held last year at VMworld 2016!

Last year we had 24 participants, and everyone who attended had a blast at the majestic Bali Hai Golf complex, which is within view of the VMworld 2017 venue, Mandalay Bay. This year the event will expand with more sponsors and a more structured golfing competition, with prizes going to the top two placed two-ball teams.

Details will be updated on this site and on the Eventbrite page once the day is finalised. For the moment, if you are interested please reserve your spot by securing a ticket. At this stage there are 32 spots, but depending on popularity that could be extended.

Last year the golfing fees were heavily subsidised to $40 USD per person (green fees are usually $130-150) thanks to the sponsors, and I expect the same or lower depending on final sponsorship numbers this year. For now, please head to the Eventbrite page, reserve your ticket and wait for further updates as we get closer to the event.

Registration Page

There is a password on the registration page to protect against people registering directly via the public page. The password is vGolf2017Vegas. I’m looking forward to seeing you all there bright and early on Sunday morning!

Take a look at what awaits you…don’t miss out!

Sponsorship Call:

If you or your company can offer some sponsorship for the event, please email [email protected] to discuss arrangements. I am looking to subsidise most of the green fees if possible, and for that we would need four to five sponsors.

Important ESXi 6.0 Patch – Upgrade Now!

Last week VMware released a new patch (ESXi 6.0 Build 5572656) that addresses a number of serious bugs with snapshot operations. Usually I wouldn't blog about a patch release, but when I looked through the rest of the fixes in the VMware KB it was apparent that this was more than your average VMware patch: it addresses a number of storage issues and, again, a lot around snapshot operations, which are critical to most VM backup operations.

Here are some of the key resolutions that I’ve picked out from the patch release:

  • When you take a snapshot of a virtual machine, the virtual machine might become unresponsive
  • After you create a virtual machine snapshot of a SEsparse format, you might hit a rare race condition if there are significant but varying write IOPS to the snapshot. This race condition might make the ESXi host stop responding
  • Because of a memory leak, the hostd process might crash with the following error: Memory exceeds hard limit. Panic. The hostd logs report numerous errors such as Unable to build Durable Name. This kind of memory leak causes the host to get disconnected from vCenter Server
  • Using SESparse for both creating snapshots and cloning of virtual machines, might cause a corrupted Guest OS file system
  • During snapshot consolidation a precise calculation might be performed to determine the storage space required to perform the consolidation. This precise calculation can cause the virtual machine to stop responding, because it takes a long time to complete
  • Virtual Machines with SEsparse based snapshots might stop responding, during I/O operations with a specific type of I/O workload in multiple threads
  • When you reboot the ESXi host under the following conditions, the host might fail with a purple diagnostic screen and a PCPU xxx: no heartbeat error.
    • You use the vSphere Network Appliance (DVFilter) in an NSX environment
    • You migrate a virtual machine with vMotion under DVFilter control
  • Windows 2012 domain controller supports SMBv2, whereas Likewise stack on ESXi supports only SMBv1. With this release, the likewise stack on ESXi is enabled to support SMBv2
  • When the unmap commands fail, the ESXi host might stop responding due to a memory leak in the failure path. You might receive the following error message in the vmkernel.log file: FSDisk: 300: Issue of delete blocks failed [sync:0] and the host gets unresponsive.
  • In case you use SEsparse and enable unmapping operation to create snapshots and clones of virtual machines, after the wipe operation (the storage unmapping) is completed, the file system of the guest OS might be corrupt. The full clone of the virtual machine performs well.

There are also a number of vSAN-related fixes in the patch, so overall it's worth applying it as soon as possible.
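If you manage a larger environment, something like the pyVmomi sketch below can quickly show which ESXi 6.0 hosts are still behind this build. It's an unofficial example; the vCenter address and credentials are placeholders, and certificate verification is disabled purely for lab convenience.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

PATCHED_BUILD = 5572656  # ESXi 6.0 build delivered by this patch

ctx = ssl._create_unverified_context()  # lab only - use proper certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        if host.config is None:            # skip disconnected hosts
            continue
        product = host.config.product      # vim.AboutInfo: ESXi version and build
        if product.version.startswith("6.0") and int(product.build) < PATCHED_BUILD:
            print(f"{host.name}: build {product.build} - needs the patch")
    view.DestroyView()
finally:
    Disconnect(si)
```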

References:

https://kb.vmware.com/kb/2149955

Attack from the Inside – Protecting Against Rogue Admins

In July of 2011, Distribute.IT, a domain registration and web hosting services provider in Australia, was hit with a targeted, malicious attack that resulted in the company going under and their customers left without their hosting or VPS data. The attack was calculated, targeted and vicious in its execution… I remember the incident well as I was working for Anittel at the time and we were offering similar services…everyone in the hosting organization was concerned when thinking about the impact a similar attack would have on our systems.

“Hackers got into our network and were able to destroy a lot of data. It was all done in a logical order – knowing exactly where the critical stuff was and deleting that first,”

While it was reported at the time that a hacker got into the network, the way the attack was executed pointed to an inside job and, although it was never proven, it is almost certain that the attacker was a disgruntled ex-employee. The very real issue of an inside attack has popped up again…this time Verelox, a hosting company out of the Netherlands, has effectively been taken out of business by a confirmed attack from within by an ex-employee.

My heart sinks when I read of situations like this; for me, it was the one thing that truly kept me up at night as someone who was ultimately responsible for similar hosting platforms. I could deal with, and probably reconcile myself to, a situation where a piece of hardware failed causing data loss…but if an attacker had caused the data loss then all bets would have been off, and I might have found myself scrambling to save face and, along with others in the organization, searching for a new company…or worse, a new career!

What Can Be Done at a Technical Level?

Knowing a lot about how hosting and cloud service providers operate, my feeling is that 90% of organizations out there are not prepared for such attacks and are at the mercy of an attack from the inside, either by a current or an ex-employee. Taking that a step further, plenty are also at risk of an inside attack perpetrated by external malicious individuals. This is where the principle of least-privilege access needs to be taken to the nth degree. Clear separation of operational and physical layers needs to be considered as well, to ensure that if systems are attacked, not everything can be taken down at once.

Implementing some form of certification or compliance such as ISO 27001, SOC or IRAP forces companies to become more vigilant through the stringent processes and controls imposed once they achieve compliance. This in turn naturally leads to better and more complete disaster recovery and business continuity scenarios that are written down and require testing and validation in order to pass certification.

From a backup point of view, with most systems these days being virtual it's important to consider a backup strategy that not only makes use of the 3-2-1 rule of backups but also implements some form of air-gapped backups that are, in theory, completely separate from and inaccessible to production networks, meaning that only a few very trusted employees have access to the backup and restore media. In practice, implementing a completely air-gapped solution is complex and potentially costly, and this is where service providers are chancing their futures on scenarios that have a small percentage chance of happening, even though the likelihood of such a scenario playing out is greater than it's ever been.

In a situation like Verelox, I wonder if, like most IaaS providers, they didn't back up all client workloads by default, meaning that backup was an additional service charge that some customers didn't know about…that said, if the backup systems are wiped clean, is there any use in having those services anyway? That is to say…is there a backup of the backup? This being the case, I also believe that businesses need to start looking at cross-cloud backups and not rely solely on their provider's backup systems. Something like the Veeam Agents or Cloud Connect can help here.

So What Can Be Done at an Employee Level?

The more I think about the possible answer to this question, the more I believe that service providers can't fully protect themselves from such internal attacks. At some point trust supersedes all else, and no amount of vetting or process can stop someone with the right sort of access doing damage. To that end, making sure you are looking after your employees is probably the best defence against someone feeling aggrieved enough to carry out a malicious attack such as the one Verelox has just gone through. In addition to looking after employees' wellbeing it's also a good idea to, within reason, keep tabs on an employee's state in life in general. Are they going through any personal issues that might make them unstable, or have they been wronged by someone else within the company? Generally social issues should be picked up during the hiring process, but completely vetting employee stability is always going to be a lottery.

Conclusion

As mentioned above, this type of attack is a worst-case scenario for every service provider operating today. There are steps that can be taken to minimize the impact and to protect against an employee getting to the point where they choose to do damage, but my feeling is we haven't seen the last of these attacks and unfortunately more will suffer…so where you can, try to implement the policies and procedures to protect, and then recover, when or if they do happen.


Resources:

https://www.crn.com.au/news/devastating-cyber-attack-turns-melbourne-victim-into-evangelist-397067/page1

https://www.itnews.com.au/news/distributeit-hit-by-malicious-attack-260306

https://news.ycombinator.com/item?id=14522181

Verelox (Netherlands hosting company) servers wiped by ex-admin – via r/sysadmin