
VMworld 2017: Don’t Take it for Granted!

This time next week VMworld 2017 will be kicking off with the Sunday evening Welcome Reception, among other sponsor and community events. For me, it will mark my fifth VMworld since 2012, having only missed the 2013 event. It’s become an annual pilgrimage to the west coast of the US, so much so that my wife locks in the dates at the beginning of every year. It just so happens that Father’s Day in Australia is the Sunday after VMworld and it’s also around the time of my wedding anniversary…so if anything, VMworld reminds me to take time out from the event and pick up that year’s anniversary gift.

Having been lucky enough to attend five out of the last six VMworlds, it has almost become automatic that I am at the event, and it could be easy for me to take VMworld for granted. I am very mindful of the fact that while the event is starting to lose a little bit of its perceived shine in certain circles, it’s still the #1 Information Technology industry ecosystem event of the year, and with that it’s still the must-attend event for IT professionals, customers, partners and vendors alike.

I am also mindful, even after attending so many VMworlds, not to waste the opportunity that presents itself as an attendee. If I think back to my first VMworld in 2012, I still remember being somewhat timid and reluctant to participate in much more than the sessions and official parties; however, the one thing I did do was observe how others were using the event to their advantage. While there is brilliant technology to be uncovered and lots of learning to be done, those that have been to VMworld before come to understand that networking is a primary benefit of attending, and the networking should be milked for all it’s worth!

Someone told me while at VMworld 2014 that “you never know who is interviewing you”. This is very true and should be something that first timers and regulars understand and use to their advantage as a mechanism for potential career advancement…there is no better event to rub shoulders with industry peers, community leaders and tech rockstars. With that, you should always be aware of your surroundings and not waste any opportunity that may present itself. I’m not saying that you will get a new role just by attending and seeking out conversation…but what I am saying is to constantly be on your game!

Even for those like me who have been lucky enough to attend multiple VMworlds, it’s easy to fly in and just go with the flow. It’s easy to not appreciate what it means to be there and easy to turn it into a week-long drinking event. So my closing message for everyone attending VMworld this year, be it your 10th or your 1st, is to make sure you maximize everything that VMworld has to offer. Take advantage of the opportunity to not only get exposure to new technologies and products but also to network and realize the value that being at such an event offers. You never know when this VMworld could be your last…

Don’t take it for granted!

Runecast: Overview and Service Provider Use Case

A few months ago I was lucky enough to spend time with a couple of the founders of Runecast, Stanimir Markov and Ched Smokovic, and got to know a little more about their real-time analytics platform for VMware-based infrastructure. Soon after that I downloaded and deployed it in my lab and have been running it for a few months. In that time I’ve come to understand and appreciate the value that it adds to the operations and management of any vSphere platform.

Having been part of, and led, teams that operated and managed large vSphere-based cloud platforms, I know that one of the challenges of managing any platform of size is how to stay on top of issues operationally…not only when and as they happen, but also before they happen. Proactive monitoring and alerting that pinpoints issues before they happen is invaluable, and up to this point I haven’t found a product that focuses in as specifically as Runecast does to help solve that challenge.

In the past I have researched and used more than a few tools on the market, and probably the closest comparison that I can make with Runecast is what CloudPhysics tried to do with their Knowledge Base Adviser feature. For those that have used CloudPhysics in the past, Runecast will feel somewhat similar in theory; however, Runecast have taken what CloudPhysics had done and taken it to the next level.

By using a number of resources within VMware’s knowledge base, Runecast has been able to deliver a platform that looks at best practices, log information and security hardening guides to monitor your vSphere infrastructure, and in turn brings issues that may exist to your attention through a simple yet intuitive interface.

Runecast for Service Providers:

Proactive analysis is the name of the game and it’s one of the holy grails for any operations team. Prevention of an issue before it occurs is what Runecast sets out to achieve, and for service providers that are running critical line of business applications for their clients (which is all service providers) the ability to prevent service disruption is huge.

Apart from the obvious benefits around proactive analytics, one of the best features for service providers is the security hardening feature. Lots of service providers these days are governed by specific regulations and compliance requirements, and security has become front and center for any platform owner. The security hardening feature points out specifically what passes and what fails as per the official VMware hardening guide.

I can also see how the specific inventory feature for vCenter objects can be developed in the future to allow service providers to expose certain information via the Runecast APIs to their tenants. I’d love to see some integration with vCloud Director, NSX and vSAN among other VMware platforms…there is serious potential here.

The API endpoints that are being exposed version to version mean that service providers can take the information presented and manipulate it to their heart’s content. It provides a powerful way for service providers to take full advantage of the data that’s being collected and analysed.
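As a rough illustration of what that kind of manipulation could look like on the provider side, here is a minimal Python sketch that rolls findings up per tenant and severity. It makes no calls to Runecast and does not reflect its actual API schema: the record layout, the field names (`tenant`, `host`, `severity`) and the `summarize_by_tenant` helper are all hypothetical, standing in for whatever a provider builds after pulling the data and mapping hosts to tenants.

```python
from collections import defaultdict


def summarize_by_tenant(findings):
    """Count findings per tenant and severity.

    `findings` is a list of dicts with hypothetical keys "tenant" and
    "severity"; this is NOT the Runecast response format.
    """
    summary = defaultdict(lambda: defaultdict(int))
    for finding in findings:
        summary[finding["tenant"]][finding["severity"]] += 1
    # Convert nested defaultdicts to plain dicts for easy serialisation.
    return {tenant: dict(counts) for tenant, counts in summary.items()}


# Hypothetical sample data, already enriched with a tenant mapping.
sample_findings = [
    {"tenant": "tenant-a", "host": "esx01", "severity": "critical"},
    {"tenant": "tenant-a", "host": "esx02", "severity": "major"},
    {"tenant": "tenant-a", "host": "esx02", "severity": "critical"},
    {"tenant": "tenant-b", "host": "esx03", "severity": "major"},
]

print(summarize_by_tenant(sample_findings))
```

A summary like this is the sort of thing that could feed a tenant-facing portal or report, with the real field names swapped in once the API responses have been inspected.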

Final Thoughts:

This is, for the most part, a targeted analytics system that focuses on getting you the relevant information quickly and without fuss, and allows you to ascertain issues and work towards their resolution. I’m looking forward to seeing what the guys come up with over the next twelve to eighteen months as they further enhance the capabilities.

For your free 14-day trial register here, and if you are heading to VMworld this year make sure to visit them at Booth #832.

Disclaimer: Runecast are sponsors of Virtualization is Life!

Top vBlog 2017: Notable Representation and Thanks

It feels like this year is moving along at ludicrous speed, so it’s no surprise that the Top vBlog for 2017 has been run and won. This year Eric Siebert changed things up by introducing new voting mechanisms to try and deliver a more palatable outcome for all who were involved…I think it worked well and delivered interesting results for all those active bloggers listed on the vLaunchpad.

Eric introduced a point system based on Google Page Speed and the number of posts in 2016 to help level the playing field and make it less of a perceived popularity contest. Introducing tangible metrics to make up a portion of the total ranking points was an interesting move and seemed to work well. If nothing else it made people (myself included) more aware of the dark art of web page speed optimization…and this has meant a better browsing experience for those visiting Top vBlog sites.
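To make the arithmetic concrete, here is a small Python sketch of how the totals appear to be put together. The weightings are an assumption inferred from the result tables further down this post rather than Eric's published formula: every row there is consistent with 2 points per 2016 post, 2 points per Page Speed percentage point, and a total that is the sum of voting, post and Page Speed points. The function names are my own.

```python
def post_points(posts_2016):
    # Assumption inferred from the tables: 2 points per post published in 2016.
    return 2 * posts_2016


def page_speed_points(page_speed_pct):
    # Assumption inferred from the tables: 2 points per Page Speed percent.
    return 2 * page_speed_pct


def total_points(voting_points, posts_2016, page_speed_pct):
    # Total = voting points + post points + Page Speed points.
    return voting_points + post_points(posts_2016) + page_speed_points(page_speed_pct)


def rank_change(previous_rank, current_rank):
    # Positive means the blog moved up the list.
    return previous_rank - current_rank


# Virtualization is Life! row: 1042 voting points, 123 posts, 66% Page Speed.
print(total_points(1042, 123, 66))  # 1420, matching the Total Points column
print(rank_change(44, 19))          # 25, matching the +/- column
```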

The Results:

As expected, with Duncan Epping bowing out of the race William Lam deservedly took out the #1 spot with Vladan Seget, Cormac Hogan, Chris Wahl and Scott Lowe rounding out the top 5. There was lots of movement in the top 25 and I managed to sneak into the top 20 at #19 which is extremely humbling.

Creating content for this community is a pleasure and has become somewhat of a personal obsession, so it’s nice to get some recognition and I’m happy that what I’m able to produce is (for the most part) found useful by people in the community. I’m a passionate guy in most things that I am involved in, so it’s no surprise that I feel so strongly about being able to contribute to this great vCommunity…especially when it comes to my strong passion around Hosting, Cloud, Backup and DR.

Aussie Representation:

As with previous years I like to highlight the Aussie and Kiwi (ANZ) representation in the Top vBlog and this year is no different. We have a great blogging scene here in the VMware community and that is reflected with the quality of the bloggers listed below. Special mention to Matt Allford who debuted at #190 …watch out for him to climb up the list over the next few years!

| Blog | Rank | Prev | +/- | Total Points | Total Votes | Voting Points | #1 Votes | # 2016 posts | Post Pts | PS % | PS Pts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Virtualization is Life! (Anthony Spiteri) | 19 | 44 | 25 | 1420 | 165 | 1042 | 15 | 123 | 246 | 66% | 132 |
| Long White Virtual Clouds (Michael Webster) | 20 | 13 | -7 | 1374 | 201 | 1228 | 7 | 17 | 34 | 56% | 112 |
| CloudXC (Josh Odgers) | 29 | 17 | -12 | 1166 | 144 | 930 | 7 | 53 | 106 | 65% | 130 |
| Penguinpunk.net (Dan Frith) | 106 | 78 | -28 | 545 | 48 | 317 | 4 | 46 | 92 | 68% | 136 |
| Virtual 10 (Manny Sidhu) | 115 | 82 | -33 | 520 | 44 | 268 | 0 | 32 | 64 | 94% | 188 |
| Proudest Monkey (Grant Orchard) | 127 | 93 | -34 | 501 | 53 | 341 | 0 | 11 | 22 | 69% | 138 |
| Demitasse (Alastair Cooke) | 140 | 168 | 28 | 467 | 42 | 261 | 0 | 33 | 66 | 70% | 140 |
| Pragmatic IO (Brett Sinclair) | 174 | 153 | -21 | 418 | 29 | 222 | 6 | 12 | 24 | 86% | 172 |
| Virtual Tassie (Matt Allford) | 190 | NEW | NEW | 392 | 29 | 170 | 1 | 23 | 46 | 88% | 176 |
| ukotic.net (Mark Ukotic) | 201 | 179 | -22 | 369 | 24 | 159 | 6 | 14 | 28 | 91% | 182 |
| Virtual Notions (Derek Hennessy) | 266 | 298 | 32 | 253 | 21 | 121 | 0 | 25 | 50 | 41% | 82 |

Veeam Representation:

My fellow colleagues at Veeam made it into the list and all of those below made the top 50!

| Blog | Rank | Prev | +/- | Total Points | Total Votes | Voting Points | #1 Votes | # 2016 posts | Post Pts | PS % | PS Pts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Virtualization is Life! (Anthony Spiteri) | 19 | 44 | 25 | 1420 | 165 | 1042 | 15 | 123 | 246 | 66% | 132 |
| Notes from MWhite (Michael White) | 31 | 38 | 7 | 1080 | 108 | 642 | 2 | 132 | 264 | 87% | 174 |
| Virtual To The Core (Luca Dell’Oca) | 38 | 41 | 3 | 927 | 111 | 695 | 2 | 36 | 72 | 80% | 160 |
| vZilla (Michael Cade) | 44 | 120 | 76 | 871 | 109 | 645 | 9 | 27 | 54 | 86% | 172 |
| Tim’s Tech Thoughts (Tim Smith) | 48 | 100 | 52 | 837 | 92 | 583 | 5 | 32 | 64 | 95% | 190 |

The Results Show:

Again, a massive thank you to Eric for putting together the voting and organising the whole thing. It’s a huge undertaking and we should all be grateful to Eric for making it all happen.

The whole list and category winners can be viewed here.

vSphere 6.5 Update 1 – What’s in it for Service Providers

Late last week VMware released vSphere 6.5 Update 1, which includes updated builds of both vCenter and ESXi. As per usual I will go through some of the key features and fixes that are included in the latest versions of vCenter and ESXi. When looking through the release notes I generally keep an eye out for improvements that relate back to Service Providers who use vSphere as the foundation of their Managed or Infrastructure as a Service offerings. This release also contains an update to vSAN, which is now at 6.6.1, so I’ll spend some time looking at what’s been added there.

New Features and Enhancements:

Without question this is a significant patch release for vCenter and ESXi, and the length of the release notes is testament to that point. In terms of new features there isn’t anything groundbreaking, but there are a few nice additions: the VCSA GUI and CLI installers can now run on Windows 2012, 2012 R2 and 2016 as well as macOS Sierra, and Ubuntu 17.04 is supported for Guest OS Customization. vCenter now supports Microsoft SQL Server 2014 SP2, 2016 and 2016 SP1, and there are some increased configuration maximums, with Linked Mode supporting 15 vCenter instances, 5000 ESXi hosts and 50,000 powered-on virtual machines.
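For anyone tracking platform growth against those Linked Mode maximums, a quick sanity check can be sketched in Python. The limits below are the three figures quoted above; the dictionary keys and the `exceeded_maximums` helper are hypothetical names, and the inventory counts would come from whatever reporting you already have.

```python
# Linked Mode maximums quoted above for vCenter 6.5 Update 1.
LINKED_MODE_MAXIMUMS = {
    "vcenter_instances": 15,
    "esxi_hosts": 5000,
    "powered_on_vms": 50000,
}


def exceeded_maximums(inventory, maximums=LINKED_MODE_MAXIMUMS):
    """Return {name: (current, limit)} for every count above its limit."""
    return {
        name: (inventory.get(name, 0), limit)
        for name, limit in maximums.items()
        if inventory.get(name, 0) > limit
    }


# Hypothetical inventory counts for illustration only.
inventory = {"vcenter_instances": 16, "esxi_hosts": 4200, "powered_on_vms": 50000}
print(exceeded_maximums(inventory))
```

Counts that sit exactly on a limit are treated as within bounds, since a maximum is the largest supported value.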

Ability to Upgrade or Migrate from vCenter 6.0 Update 3:

This release addresses the previous limitation in the upgrade and migration path for those running vSphere 6.0 U3 in going to vSphere 6.5. I know this will make a lot of providers happy, as I know a lot that had to go to 6.0 Update 3 to address existing bugs in the platform but were not yet ready or able to go to 6.5 at the time.

HTML5 Client Update:

The HTML5 Web Client has gotten its own update that brings it up to speed with the 3.15 Fling version; however, it’s still only partially functional, which remains somewhat frustrating…The online documentation for supported functionality has been updated to vSphere 6.5U1 and is available here.

The list below is of the main updates in this release.

  • DRS/HA VM overrides
  • SDRS rules
  • Content Library – further actions
  • Roles and Global Permissions
  • Download multiple files as zip
  • Distributed Switch – further actions
  • Fault Tolerance
  • SPBM
  • VM Hardware – further items
  • Apply Customize Guest OS during Clone
  • VM Migration – further actions (compute+storage, Cross VC, batch)

vSAN Features:

For service providers, vSAN 6.6 was another major release that shored up vSAN’s status as a serious storage platform for service provider platforms.

vSAN 6.6.1 introduces three key new features:

  • VMware vSphere Update Manager (VUM) integration
  • Performance Diagnostics in vSAN Cloud Analytics
  • Storage Device Serviceability enhancement

The ability to upgrade with VUM is a nice touch and continues to improve on the usability and manageability of vSAN. For a full look at what’s new in this release for vSAN 6.6.1 head to this blog post.

Resolved Issues:

There are a bunch of resolved issues in this release and I’ve gone through the rather extensive list to pull out the biggest fixes that relate to my experience in service provider operations, and have also extended this to include fixes that relate to backup operations. The majority of what I picked out relates to storage, networking, hosts and VM operations…the core of any platform, but even more important in the service provider world. The ones in red are specific fixes that relate to issues that I’ve come across…good to see them addressed!

vCenter:

  • First-boot failure occurs when upgrading from vSphere 5.5 or 6.0 to vSphere 6.5 on Windows
    If an older version of the OpenSSL DLLs is installed, upgrading to vSphere 6.5 fails to run because the older DLL versions are loaded.
  • Affinity rules configured on vCenter Server 5.5 can cause crashes after upgrading to vCenter Server 6.5
    Migrating a VM with affinity rules configured while on vCenter Server 5.5 to a cluster that has affinity rules configured on vCenter Server 6.0 or 6.5 can cause vCenter Server to crash.
  • VM Snapshot Size (GB) alarm is not triggered after the VM is powered on
    VM Snapshot Size (GB) alarm is reset if the virtual machine is shut down. Alarm fails to trigger after the VM is powered on. This issue occurs in alarms based on VM Snapshot (GB) and VM Total Size on Disk because their status is altered when the power state of the VM is changed. This issue occurs because disk usage of a VM is the same regardless of the VM power state.
  • When you add ports to a vSphere Distributed Switch you get an error
    Because of a race condition, when you add ports to a vSphere Distributed Switch you get the error message: Cannot create a new port because number of ports exceeds 2147483647, maximum number of ports allowed on vDS.
  • A runtime exception “Unable to retrieve data about the distributed switch” might occur while upgrading vSphere Distributed Switch (vDS) from 5.0 to 6.5 version
    When you try to upgrade an existing distributed switch after the vCenter upgrade is completed, the runtime exception Unable to retrieve data about the distributed switch might occur in the wizard and the distributed switch cannot be upgraded. The exception is a result of unexpected value NULL for a LACP property of the distributed switch, instead of TRUE or FALSE, as LACP is not supported for the current version of vSphere Distributed Switch.
  • Host configuration might not be available after vCenter Server restarts
    After a vCenter Server restart, the host configuration might not be available if vCenter Server cannot communicate with the host. After connectivity is restored, the configuration becomes available.
  • OVF tool fails to upload OVF or OVA files larger than 10 GB
    If you use OVF tool to upload OVF or OVA files larger than 10 GB, the upload might fail.

ESXi:

  • Virtual machine crashes on ESXi 6.5 when multiple users log on to Windows Terminal Server VM
    Windows 2012 terminal server running VMware tools 10.1.0 on ESXi 6.5 stops responding when many users are logged in. vmware.log will show messages similar to:
    2017-03-02T02:03:24.921Z| vmx| I125: GuestRpc: Too many RPCI vsocket channels opened.
    2017-03-02T02:03:24.921Z| vmx| E105: PANIC: ASSERT bora/lib/asyncsocket/asyncsocket.c:5217
    2017-03-02T02:03:28.920Z| vmx| W115: A core file is available in "/vmfs/volumes/515c94fa-d9ff4c34-ecd3-001b210c52a3/h8-ubuntu12.04x64/vmx-debug-zdump.001"
    2017-03-02T02:03:28.921Z| mks| W115: Panic in progress... ungrabbing
  • An ESXi host might fail with purple diagnostic screen when collecting performance snapshots
    An ESXi host might fail with purple diagnostic screen when collecting performance snapshots with vm-support due to calls for memory access after the data structure has already been freed. An error message similar to the following is displayed:
  • Full duplex configured on physical switch may cause duplex mismatch issue with igb native Linux driver supporting only auto-negotiate mode for nic speed/duplex setting
    If you are using the igb native driver on an ESXi host, it always works in auto-negotiate speed and duplex mode. No matter what configuration you set up on this end of the connection, it is not applied on the ESXi side. The auto-negotiate support causes a duplex mismatch issue if a physical switch is set manually to a full-duplex mode.
  • An ESXi host might fail with a purple screen and a Spin count exceeded (refCount) – possible deadlock with PCPU error
    An ESXi host might fail with a purple screen and a Spin count exceeded (refCount) - possible deadlock with PCPU error, when you reboot the ESXi host under the following conditions:
    • You use the vSphere Network Appliance (DVFilter) in an NSX environment
    • You migrate a virtual machine with vMotion under DVFilter control
  • A Virtual Machine (VM) with e1000/e1000e vNIC might have network connectivity issues
    For a VM with e1000/e1000e vNIC, when the e1000/e1000e driver tells the e1000/e1000e vmkernel emulation to skip a descriptor (the transmit descriptor address and length are 0), a loss of network connectivity might occur.
  • An ESXi host might stop responding when you migrate a virtual machine with Storage vMotion between ESXi 6.0 and ESXi 6.5 hosts
    The vmxnet3 device tries to access the memory of the guest OS while the guest memory preallocation is in progress during the migration of virtual machine with Storage vMotion. This results in an invalid memory access and the ESXi 6.5 host failure.
  • Modification of IOPS limit of virtual disks with enabled Changed Block Tracking (CBT) fails with errors in the log files
    To define the storage I/O scheduling policy for a virtual machine, you can configure the I/O throughput for each virtual machine disk by modifying the IOPS limit. When you edit the IOPS limit and CBT is enabled for the virtual machine, the operation fails with an error The scheduling parameter change failed. Due to this problem, the scheduling policies of the virtual machine cannot be altered. The error message appears in the vSphere Recent Tasks pane. You can see the following errors in the /var/log/vmkernel.log file:
    2016-11-30T21:01:56.788Z cpu0:136101)VSCSI: 273: handle 8194(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000
    2016-11-30T21:01:56.788Z cpu0:136101)ScsiSched: 2760: Invalid Bandwidth Cap Configuration
    2016-11-30T21:01:56.788Z cpu0:136101)WARNING: VSCSI: 337: handle 8194(vscsi0:0):Failed to invert policy
  • When you hot-add an existing or new virtual disk to a CBT (Changed Block Tracking) enabled virtual machine (VM) residing on VVOL datastore, the guest operating system might stop responding
    When you hot-add an existing or new virtual disk to a CBT enabled VM residing on VVOL datastore, the guest operating system might stop responding until the hot-add process completes. The VM unresponsiveness depends on the size of the virtual disk being added. The VM automatically recovers once hot-add completes.
  • When you use vSphere Storage vMotion, the UUID of a virtual disk might change
    When you use vSphere Storage vMotion on vSphere Virtual Volumes storage, the UUID of a virtual disk might change. The UUID identifies the virtual disk and a changed UUID makes the virtual disk appear as a new and different disk. The UUID is also visible to the guest OS and might cause drives to be misidentified.
  • An ESXi host might become unresponsive if the VMFS-6 volume has no space for the journal
    When opening a VMFS-6 volume, it allocates a journal block. Upon successful allocation, a background thread is started. If there is no space on the volume for the journal, it is opened in read-only mode and no background thread is initiated. Any attempt to close the volume results in attempts to wake up a nonexistent thread. This results in the ESXi host failure.
  • SSD congestion might cause multiple virtual machines to become unresponsive
    Depending on the workload and the number of virtual machines, diskgroups on the host might go into permanent device loss (PDL) state. This causes the diskgroups to not admit further IOs, rendering them unusable until manual intervention is performed.
  • Unable to collect vm-support bundle from an ESXi 6.5 host
    Unable to collect vm-support bundle from an ESXi 6.5 host because when generating logs in ESXi 6.5 by using the vSphere Web Client, the select specific logs to export text box is blank. The options: network, storage, fault tolerance, hardware etc. are blank as well. This issue occurs because the rhttpproxy port for /cgi-bin has a value different from 8303. This issue is resolved in this release.
  • vSphere Storage vMotion might fail with an error message if it takes more than 5 minutes
    The destination virtual machine of the vSphere Storage vMotion is incorrectly stopped by a periodic configuration validation for the virtual machine. vSphere Storage vMotion that takes more than 5 minutes fails with the The source detected that the destination failed to resume message.
    The VMkernel log from the ESXi host contains the message D: Migration cleanup initiated, the VMX has exited unexpectedly. Check the VMX log for more details.
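If you want to check whether your own hosts or VMs were hitting some of the ESXi issues listed above, the quoted log messages make handy signatures to search for. Here is a minimal Python sketch along those lines; the two patterns are taken from the vmware.log and vmkernel.log excerpts in the list, while the signature names and the `scan_log_lines` helper are my own invention, and the sketch only looks at lines you feed it rather than locating log files for you.

```python
import re

# Signatures built from the log excerpts quoted in the release notes above.
SIGNATURES = {
    "rpci_vsocket_exhaustion": re.compile(
        r"GuestRpc: Too many RPCI vsocket channels opened"
    ),
    "cbt_iops_limit_failure": re.compile(
        r"ScsiSched: \d+: Invalid Bandwidth Cap Configuration"
    ),
}


def scan_log_lines(lines):
    """Return (line_number, signature_name) for every matching line."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = [
    "2017-03-02T02:03:24.921Z| vmx| I125: GuestRpc: Too many RPCI vsocket channels opened.",
    "2016-11-30T21:01:56.788Z cpu0:136101)ScsiSched: 2760: Invalid Bandwidth Cap Configuration",
    "2016-11-30T21:01:56.788Z cpu0:136101)VSCSI: 273: handle 8194(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000",
]
print(scan_log_lines(sample))
```

In practice the `lines` argument could simply be an open file handle, for example a copy of a VM's vmware.log pulled off the datastore.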

vSAN:

  • Hosts in a vSAN cluster have high congestion which leads to host disconnects
    When vSAN components with invalid metadata are encountered while an ESXi host is booting, a leak of reference counts to SSD blocks can occur. If these components are removed by policy change, disk decommission, or other method, the leaked reference counts cause the next I/O to the SSD block to get stuck. The log files can build up, which causes high congestion and host disconnects.
  • vSAN cluster becomes partitioned after the member hosts and vCenter Server reboot
    If the hosts in a unicast vSAN cluster and the vCenter Server are rebooted at the same time, the cluster might become partitioned. The vCenter Server does not properly handle unstable vpxd property updates during a simultaneous reboot of hosts and vCenter Server.
  • Large File System overhead reported by the vSAN capacity monitor
    When deduplication and compression are enabled on a vSAN cluster, the Used Capacity Breakdown (Monitor > vSAN > Capacity) incorrectly displays the percentage of storage capacity used for file system overhead. This number does not reflect the actual capacity being used for file system activities. The display needs to correctly reflect the File System overhead for a vSAN cluster with deduplication and compression enabled.

It’s also worth reading through the Known Issues section, as there is a fair bit to be aware of in Update 1, including issues that remain from the GA release.

Happy upgrading!

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-651-release-notes.html

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-vcenter-server-651-release-notes.html

Second vSphere Client (HTML5) update in vSphere 6.5U1

Introducing vSAN 6.6.1 and New Operational Savings

ESXi 6.5 Storage Performance Issues Resolved in Update 1

I originally came across the issue of slow storage performance with the native vmw_ahci driver that comes bundled with ESXi 6.5 just as I was first playing with my SuperMicro SYS-5028D-TN4T in my homelab. After I published a couple of posts about the workaround, the issue became quite prevalent in the community, and the posts continue to get decent traffic, meaning that the issue impacted quite a few people out there.

The good news is that with the release of vSphere 6.5 Update 1 there is a fix for the problem in the form of updated drivers for the AHCI module. William Lam has been quick to blog about the fix and if you had previously disabled the driver you will need to re-enable it.
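For those who need to flip the driver back on, here is a small Python sketch that builds (and optionally runs) the relevant esxcli call. It assumes the workaround was applied by disabling the module with `esxcli system module set --enabled=false --module=vmw_ahci`, which is my understanding of the commonly shared fix; the function names are hypothetical, and it defaults to a dry run so nothing changes unless you ask it to. A host reboot is typically needed before a module change takes effect.

```python
import subprocess


def ahci_module_command(enabled):
    """Build the esxcli argument list that enables or disables vmw_ahci."""
    state = "true" if enabled else "false"
    return [
        "esxcli", "system", "module", "set",
        "--enabled={}".format(state),
        "--module=vmw_ahci",
    ]


def set_ahci_module(enabled, dry_run=True):
    """Return the command; only execute it when dry_run is False.

    Executing requires running on the ESXi host itself, where esxcli exists.
    """
    command = ahci_module_command(enabled)
    if not dry_run:
        subprocess.run(command, check=True)
    return command


# Dry run: show the command that would re-enable the native driver.
print(" ".join(set_ahci_module(enabled=True)))
```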

This VMware KB covers the specific patch as listed in the release notes:

There is no confirmation as yet that it actually does the trick, but the release notes look promising, and the assumption is that it will resolve the issues so that homelabbers and people using the driver in production systems can rest easy.

References:

https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-651-release-notes.html

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2149910

http://www.virtuallyghetto.com/2017/07/ahci-vmw_ahci-performance-issue-resolved-in-esxi-6-5-update-1.html

NestedESXi – Network Performance Improvements with Learnswitch

I’ve been running my NestedESXi homelab for about eight months now but in all that time I had not installed or enabled the ESXi MAC Learning dvFilter. As a quick refresher the VMware Fling addresses the issues with nested ESXi hosts and the impact that promiscuous mode has when enabled on virtual switches. In a nutshell, network traffic will hit all the network interfaces attached to the portgroup which reduces network throughput and also increases latency and impacts CPU.

The ESXi MAC Learn dvFilter Fling was released about two years ago and it’s a must-have for those running homelabs or work labs running nested ESXi. However, earlier this year a new Fling was released that improves on the dvFilter and addresses some of its limitations. The new native MAC Learning VMkernel module is called Learnswitch.

ESXi Learnswitch is a complete implementation of MAC Learning and Filtering and is designed as a wrapper around the host virtual switch. It supports learning multiple source MAC addresses on virtual network interface cards (vNIC) and filters packets from egressing the wrong port based on destination MAC lookup. This substantially improves overall network throughput and system performance for nested ESX and container use cases.
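To picture what MAC learning and filtering buys you compared with promiscuous mode, here is a toy Python model. It is only a conceptual sketch and not how the Fling is implemented: the `LearningSwitch` class, the port names and the MAC strings are all made up for illustration. The idea is that the switch remembers which port each source MAC was seen on (several MACs can map to the one vNIC, as happens with nested VMs) and then delivers a frame only to the port that owns the destination MAC, flooding only when the destination is unknown.

```python
class LearningSwitch:
    """Toy model of MAC learning and destination-based filtering."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the ports the frame goes to."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the sender.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination lives behind the sending port: nothing to forward.
            return set()
        return {out_port}


switch = LearningSwitch(["vnic1", "vnic2", "vnic3"])

# First frame: destination unknown, so it floods (promiscuous-like behaviour).
print(sorted(switch.receive("vnic1", "mac-a", "mac-b")))
# Reply: mac-a was learned on vnic1, so only vnic1 receives it.
print(sorted(switch.receive("vnic2", "mac-b", "mac-a")))
# Now mac-b is known too, so vnic3 no longer sees this traffic.
print(sorted(switch.receive("vnic1", "mac-a", "mac-b")))
```

With promiscuous mode alone, every frame behaves like the first case; once learning kicks in, the uninvolved ports stop receiving traffic that was never meant for them.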

For a more in-depth look at its functionality head over to William Lam’s blog post here.

dvFilter vs Learnswitch:

I was interested to see if the new Learnswitch offered any significant performance improvements over the dvFilter in addition to its main benefits. I went about installing and enabling the dvFilter in my lab and ran some basic performance tests using Crystal Disk Mark. Before that, I ran the performance test without either installed as a base.

Firstly to see what the network traffic looks like hitting the nested hosts you can see from the ESXTOP output below that each host is dealing with about the same amount of received packets. Overall throughput is reduced when this happens.

In terms of performance the Crystal Disk Mark test run on a nested VM (right) showed reduced performance across all tests when compared to one run on the parent host (left) directly.

There was also elevated datastore latency and significant CPU usage due to the overheads with the increased traffic hitting all interfaces.

The CPU usage alone shows the value in having the dvFilter or Learnswitch installed when running nested ESXi hosts.

With the baseline testing done I installed and enabled the dvFilter and then ran the same tests. For a detailed look at how to install the dvFilter (just in case you don’t fit the requirements for using the Learnswitch module) check out my initial post on the dvFilter here. Having gone through that I went about uninstalling the dvFilter and installing and configuring the Learnswitch.

Like the dvFilter you need to download and install an ESXi software bundle, but unlike the dvFilter, you need to reboot the host to enable the Learnswitch module.

As per the instructions on William Lam’s post or the Fling page you then need to configure and run a Python script to enable the Learnswitch against the NestedESXi portgroups that have promiscuous mode enabled.

From there the impact of the module is immediate and you can see a normalization of network traffic hitting the interfaces of each NestedESXi host. When running the performance test the ESXTOP output is significantly different to what you see if the module is not loaded as shown below.

You also have access to a new command that lists out stats of the Learnswitch, showing packet and port statistics as well as the current MAC address table.

In terms of what it looks like from a performance point of view, below are the results of all Crystal Disk Mark tests. The bottom two represent the dvFilter (left) and the Learnswitch (right).

And finally, to have a look at the improvement in CPU performance with the modules installed, you can see below a timeline showing the performance tests run at different times across the last 24 hours. Again there is a significant improvement: the graphs on the left hand side were captured during the testing without any module, then moving across there is the dvFilter test, with the Learnswitch test on the right hand side. It does seem like the Learnswitch is a little better on CPU, but I can’t be 100% sure with my limited testing.

Conclusion:

As expected there isn’t a huge difference in performance between the two modules, but certainly the features of the Learnswitch make it the new preferred choice out of the two if the requirements are met. Again, the main advantages of the Learnswitch over the dvFilter make it a must-have addition to any NestedESXi environment. If you haven’t installed either yet…get onto it!

Veeam Vault #7: Nutanix Support?!, Backup for Office365 1.5 BETA, VeeamON Forums plus Vanguard Roundup

It’s been just over two months since my last Veeam Vault went out, and can you believe that was just before VeeamON 2017 in New Orleans. Again, for a recap of what was announced at VeeamON check out my wrap up post here…two months on and we haven’t stopped here at Veeam. As soon as VeeamON was done and dusted, focus turned to EMEA SE training in Warsaw, which my whole team attended and where the group got an extended look at the new features coming in v10. Since then, I’ve had a good stretch at home where I’ve been preparing for a series of webinars but mainly focused on the upcoming VeeamON Forums happening around the APAC region.

I’ll be presenting sessions at all events and be on stage with Clint Wyckoff for the Sydney and Auckland keynotes, where our co-CEO, Peter McKay, and VP of Global Cloud Group, Paul Mattes, will be headlining. There are other events happening in Asia, so please register here, and if you are able to attend in any of those cities it would be great to get you down to learn about all that’s happening with Veeam as we move into the second half of the year and into next year.

Nutanix AHV Announcement:

At Nutanix’s .NEXT conference we announced the intent to support Acropolis Hypervisor (AHV) by year’s end and also became the Premier Availability solution for supported Nutanix virtualized environments. I’ll be honest and say that this took a lot of us by surprise…and probably most Nutanix employees as well. However it shows our commitment to providing availability for the modern enterprise…which Nutanix is also pushing hard into.

Backup for Office365 1.5 BETA:

Last week we released the first beta for Backup for Office365 1.5 which is a significant release for our VCSP community as it now introduces multi-tenancy and also an advanced API feature for automation. If you are a VCSP, take some time to download the beta and put the new features to work…there is a significant opportunity to offer backup services for Office365 which now scale.

Version 1.5 Enhancements:

  • A multi-repository, multi-tenant architecture enabling protection of larger Office 365 deployments with a single installation. Also empowering service providers to deliver Office 365 backup services.
  • Automation possibilities via RESTful API and PowerShell SDK to minimize management overhead, improve recovery times and reduce costs

https://go.veeam.com/beta-backup-office-365

Update 1 for Veeam Agent for Linux 1.0:

Last month we released Update 1 for Veeam Agent for Linux, so the next time you update the software from your Linux update repositories you will get the update. While this is for the most part a bug fix release, we still included file indexing for 1-Click file recovery through Veeam Enterprise Manager, the ability to add storage and network drivers to the recovery media from the Linux OS and the addition of an SSH server to the recovery media. There is also added support for ExaGrid and general wizard improvements.

https://www.veeam.com/kb2290

Veeam Vanguard Blog Post Roundup:

Top vBlog 2017 – Last week to Vote!

While I had resisted the temptation to put out a blog on this year’s Top vBlog voting, I thought with the voting coming to an end it was worth giving it a shout just in case there are some of you who haven’t had the chance to vote or didn’t know about the Top vBlog vLaunchPad list created and maintained by Eric Siebert of vSphere-land.

This year’s voting has a slightly different format with the total vote being determined by the following:

  • 60% – public voting – general voting – anyone can vote – votes are tallied and weighted for points based on voting rankings as done in past years
  • 20% – private judges scoring – chosen judges who will grade a select group of blogs based on several factors, combined rankings will equal points
  • 10% – number of posts in a year – how much effort a blogger has put into writing posts over the course of a year based on Andreas’ hard work adding this up each year (aggregators excluded)
  • 10% – Google PageSpeed score – how well a blogger has done to build and optimize their site as scored by Google’s PageSpeed tools
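To make the weighting concrete, here is a minimal sketch of how the four components could be combined. The weights come from the list above; everything else is an assumption for illustration only: the component names are hypothetical, and each component is assumed to be already normalized to a 0–100 scale. How Eric actually converts rankings into points is not described here.

```python
# Weights taken from the list above; they sum to 1.0.
WEIGHTS = {
    "public_vote": 0.60,   # public voting
    "judges": 0.20,        # private judges scoring
    "post_count": 0.10,    # number of posts in a year
    "pagespeed": 0.10,     # Google PageSpeed score
}


def weighted_score(components):
    """Combine per-component scores into a single weighted total.

    `components` maps each key in WEIGHTS to a score that is ASSUMED
    to be normalized to 0-100 (a hypothetical scale for this sketch).
    """
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical blog: strong public vote, average elsewhere.
    example = {"public_vote": 80, "judges": 50, "post_count": 50, "pagespeed": 50}
    print(round(weighted_score(example), 2))
```

With these weights the public vote still dominates, but a blog can no longer rank on popularity alone, since 40% of the total comes from the judges, posting effort and site performance.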

As Eric mentions, the vBlog voting should be based on blog content and take into account the longevity, length, frequency and quality of the posts. There is an amazing amount of great content that gets created daily by this community and, all things aside, this Top vBlog vote goes some way to recognizing the hard work most bloggers put into the creation of content for the community. Special mention to Duncan Epping and Frank Denneman for pulling out of the voting this year to give others a shot at moving up the ranks…it’s a classy move!

Good luck to all those who are listed, and for those who haven’t voted yet, click on the link below to cast your vote. If you feel inclined and enjoy my content around vCloud Director, Availability, NSX, vSAN and Cloud and Hosting in general…it would be an honor to have you consider anthonyspiteri.net in your Top 12 and also in the Independent Blogger category.

http://topvblog2017.questionpro.com

Thanks again to Eric Siebert.

References:

http://vsphere-land.com/news/voting-now-open-for-top-vblog-2017.html

http://vsphere-land.com/news/coming-soon-top-vblog-2017-with-a-new-scoring-method.html

VMware vSphere 6.5 Host Resources Deep Dive – A Must Have!

Just after I joined Zettagrid in June of 2013 I decided to load up vSphere 5.1 Clustering Deepdive by Duncan Epping and Frank Denneman on my iPad to read on my train journey to and from work. Reading that book allowed me to gain a deeper understanding of vSphere through the in-depth content that Duncan and Frank had produced. Any VMware administrator worth their salt would be familiar with the book (or the ones that preceded it) and it’s still a brilliant read.

Fast forward a few versions of vSphere and we finally have a follow-up:

VMware vSphere 6.5 Host Resources Deep Dive

This time around Frank has been joined by Niels Hagoort and together they have produced another must have virtualization book…though it goes far beyond VMware virtualization. I was lucky enough to review a couple of chapters of the book and I can say without question that this book will make your brain hurt…but in a good way. It’s the deepest of deep dives, going beyond the best practices of the previous books and into a lot of the low level compute, storage and networking fundamentals that a lot of us have either forgotten about, never learnt or never bothered to learn about.

This book explains the concepts and mechanisms behind the physical resource components and the VMkernel resource schedulers, which enables you to:

  • Optimize your workload for current and future Non-Uniform Memory Access (NUMA) systems.
  • Discover how vSphere Balanced Power Management takes advantage of the CPU Turbo Boost functionality, and why High Performance does not.
  • How the 3-DIMMs per Channel configuration results in a 10-20% performance drop.
  • How TLB works and why it is bad to disable large pages in virtualized environments.
  • Why 3D XPoint is perfect for the vSAN caching tier.
  • What queues are and where they live inside the end-to-end storage data paths.
  • Tune VMkernel components to optimize performance for VXLAN network traffic and NFV environments.
  • Why Intel’s Data Plane Development Kit significantly boosts packet processing performance.

If any of you have read Frank’s NUMA Deep Dive blog series you will start to get an appreciation of the level of technical detail this book covers. However, it is written in a way that makes the information digestible, though some parts may need to be read twice over. Well done to Frank and Niels on getting this book out and again, if you are working in and around anything to do with computers this is a must read, so do yourself a favour and grab a copy.

The current Amazon locales where the book can be purchased are listed below:

Amazon US: http://www.amazon.com/dp/1540873064
Amazon France: https://www.amazon.fr/dp/1540873064
Amazon Germany: https://www.amazon.de/dp/1540873064
Amazon India: http://www.amazon.in/dp/1540873064
Amazon Japan: https://www.amazon.co.jp/dp/1540873064
Amazon Mexico: https://www.amazon.com.mx/dp/1540873064
Amazon Spain: https://www.amazon.es/dp/1540873064
Amazon UK: https://www.amazon.co.uk/dp/1540873064

Attack from the Inside – Protecting Against Rogue Admins

In July of 2011, Distribute.IT, a domain registration and web hosting services provider in Australia, was hit with a targeted, malicious attack that resulted in the company going under and their customers being left without their hosting or VPS data. The attack was calculated, targeted and vicious in its execution… I remember the incident well as I was working for Anittel at the time and we were offering similar services…everyone in the hosting organization was concerned when starting to think about the impact a similar attack would have within our systems.

“Hackers got into our network and were able to destroy a lot of data. It was all done in a logical order – knowing exactly where the critical stuff was and deleting that first,”

While it was reported at the time that a hacker got into the network, the way in which the attack was executed pointed to an inside job, and although it was never proven, it is almost certain that the attacker was a disgruntled ex-employee. The very real issue of an inside attack has popped up again…this time Verelox, a hosting company out of the Netherlands, has effectively been taken out of business with a confirmed attack from within by an ex-employee.

My heart sinks when I read of situations like this and for me, it was the only thing that truly kept me up at night as someone who was ultimately responsible for similar hosting platforms. I could probably deal with, and reconcile with myself, a situation where a piece of hardware failed causing data loss…but if an attacker had caused the data loss then all bets would have been off and I might have found myself scrambling to save face and, along with others in the organization, may well have been searching for a new company…or worse, a new career!

What Can Be Done at a Technical Level?

Knowing a lot about how hosting and cloud service providers operate, my feeling is that 90% of organizations out there are not prepared for such attacks and are at the mercy of an attack from the inside…either by a current or ex-employee. Taking that a step further, there are plenty that are at risk of an attack from the inside perpetrated by external malicious individuals. This is where the principle of least privilege needs to be taken to the nth degree. Clear separation of operational and physical layers needs to be considered as well to ensure that if systems are attacked, not everything can be taken down at once.

Implementing some form of certification or compliance such as ISO 27001, SOC or IRAP will force companies to become more vigilant through the stringent processes and controls that are imposed on them once they achieve compliance. This in turn naturally leads to better and more complete disaster recovery and business continuity scenarios that are written down and require testing and validation in order to pass certification.

From a backup point of view, these days with most systems being virtual it’s important to consider a backup strategy that not only makes use of the 3-2-1 rule of backups, but also implements some form of air-gapped backups that in theory are completely separate and inaccessible from production networks, meaning that only a few very trusted employees have access to the backup and restore media. In practice, implementing a completely air-gapped solution is complex and potentially costly, and this is where service providers are gambling their futures on a scenario that has only a small chance of happening…however the likelihood of that scenario playing out is greater than it’s ever been.
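As a rough illustration of the 3-2-1 rule (at least three copies of the data, on at least two different types of media, with at least one copy offsite) plus the air-gap check discussed above, here is a minimal sketch. The `BackupCopy` type, its field names and the example inventory are all hypothetical and not taken from any Veeam product or API; the copy count is assumed to include the production copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical model for this sketch)."""
    name: str
    media: str              # e.g. "disk", "tape", "object-storage"
    offsite: bool           # stored away from the production site?
    air_gapped: bool = False  # unreachable from production networks?


def check_3_2_1(copies):
    """Return a pass/fail result for each part of the rule."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
        # Not part of the classic 3-2-1 rule; added for the
        # rogue-admin scenario described in this post.
        "air_gapped": any(c.air_gapped for c in copies),
    }


if __name__ == "__main__":
    inventory = [
        BackupCopy("production", media="disk", offsite=False),
        BackupCopy("local-repo", media="disk", offsite=False),
        BackupCopy("cloud-copy", media="object-storage", offsite=True),
    ]
    for check, passed in check_3_2_1(inventory).items():
        print(f"{check}: {'OK' if passed else 'MISSING'}")
```

The example inventory satisfies 3-2-1 but still fails the air-gap check, which is exactly the gap a rogue admin with access to both production and backup systems could exploit.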

In a situation like Verelox, I wonder if, like most IaaS providers, they didn’t back up all client workloads by default, meaning that backup was an additional paid service that some customers didn’t know about…that said, if backup systems are wiped clean, is there any use in having those services anyway? That is to say…is there a backup of the backup? This being the case, I also believe that businesses need to start looking at cross cloud backups and not rely solely on their provider’s backup systems. Something like the Veeam Agents or Cloud Connect can help here.

So What Can Be Done at an Employee Level?

The more I think about the possible answer to this question, the more I believe that service providers can’t fully protect themselves from such internal attacks. At some point trust supersedes all else and no amount of vetting or process can stop someone with the right sort of access doing damage. To that end, making sure that you are looking after your employees is probably the best defence against someone feeling aggrieved enough to carry out a malicious attack such as the one Verelox has just gone through. In addition to looking after employees’ well being, it’s also a good idea to…within reason, keep tabs on an employee’s general state in life. Are they going through any personal issues that might make them unstable, or have they been wronged by someone else within the company? Generally social issues should be picked up during the hiring process, but complete vetting of employee stability is always going to be a lottery.

Conclusion

As mentioned above, this type of attack is a worst case scenario for every service provider that operates today…there are steps that can be taken to minimize the impact and to guard against an employee getting to the point where they choose to do damage, but my feeling is we haven’t seen the last of these attacks and unfortunately more providers will suffer…so where you can, try to implement policies and procedures to protect against these attacks and to recover when, or if, they do happen.


Resources:

https://www.crn.com.au/news/devastating-cyber-attack-turns-melbourne-victim-into-evangelist-397067/page1

https://www.itnews.com.au/news/distributeit-hit-by-malicious-attack-260306

https://news.ycombinator.com/item?id=14522181

Verelox (Netherlands hosting company) servers wiped by ex-admin from sysadmin
