Monthly Archives: July 2013

First Look: CloudPhysics – Knowledge Base Advisor

I’ve been lucky enough to have been given early access to a new Card from the guys at CloudPhysics which, at its core, lets VMware admins quickly and precisely get an overview of issues in their environment, linked back to VMware and other vendor KB articles. CloudPhysics’ description is below:

So what is it? The new service is called Knowledge Base Advisor. This service matches the virtual datacenter profile of CloudPhysics users to knowledge base articles published by top virtual datacenter vendors including VMware, DELL, IBM and more. We join detailed knowledge of our customers with support content to create filtered, highly relevant and personalized support content for issues which may be present in their environments.
 
As of today the CloudPhysics Web App has had a major makeover and now looks even slicker than before. The card itself is shown below, and gives you a brief overview of what’s currently happening in your environment as picked up by the CloudPhysics Probe VAs.
Clicking into the card you are presented with a list of Critical Issues relevant to your environment. It’s worth mentioning that a detected alert is more of a proactive warning at this point than proof of an active problem. Obviously everybody’s environment is different, and these potential issues manifest themselves under varied conditions with differing trigger points.
Clicking in the alert box will expand it to show you the affected hosts where applicable, and by clicking on the KB Description you are taken to a framed CloudPhysics page that loads up the specified KB Article. In my environment I had 200+ alerts (16 of which were Critical, 86 High and 104 Medium), so being able to sort through and organise the alerts is crucial. The left hand menu lets you construct your own search queries based on a number of heading options.

The CloudPhysics team are great at reaching out for feedback and I’m proud to report that, based on personal feedback, the card has enhanced features and options that let you better deal with the initial large number of alerts that are shown. Once you have taken note of an issue you now have three options to deal with the alert: Problem Fixed, Uninteresting and Not Relevant.

It’s great to see CloudPhysics evolve past numbers and metrics…that isn’t to say that that’s not where its greatest strength lies…but the more CloudPhysics can take your collated data and spit it back to you in the form of useful information, the more powerful and invaluable this platform becomes.

Once again, CloudPhysics continue to blow me away with what they have been able to achieve in a little over 12 months since announcing themselves at VMworld 2012…I look forward to more card releases and how they will continue to assist me and other operations teams in better understanding and managing our vSphere environments.

VMware Series 2013 – EUC and vDC Ready and Waiting

Over the last couple of weeks I’ve been fortunate to represent ZettaGrid, as a Platinum Sponsor of the VMware Series 2013 road shows in Melbourne and Sydney. The event has also been held in Brisbane and Canberra, and finishes up in Perth this week.

The road shows’ main theme is showing off VMware’s End User Computing pillar and how it’s finally ready for serious adoption. After almost two years of hype and missed release schedules, Horizon Workspace and View 5.2 have arrived and deliver on their promise of streamlining the day to day tasks of today’s mobile worker…View has been around for a while, and there are plenty of other solutions that can deliver SaaS/Remote/Thin Apps…but with the addition of Data (Project Octopus) and Blast (AppBlast) into the stack, the suite delivers significant enhancements over other options in the market.

The video above may seem a little unrealistic (certainly a Mirage laptop re-image can’t happen that fast with current internet speeds) and over the top, but the reality is that it’s a scenario that is true to life and possible with Horizon. The keynotes of the road shows have focused on EUC, and it’s with a great sense of pride that what’s been demoed on stage and in the presentation videos is something that ZettaGrid can deliver to its clients today. The reality is that what I blogged about last year just after VMworld 2012 on the EUC Revolution is finally happening and is available.

I’ve even become a convert to VDI! For me personally that is a huge realization: the technology that View 5.2 uses to deliver remote desktop instances over PCoIP or HTML Blast is mature enough for adoption. Seamless device hopping while maintaining desktop state is possible on iPads/iPhones, Mac and Windows end points…or any HTML5 capable browser.

ZettaGrid is also showing off the power of its automation technologies to provision VMware Backed vCloud Powered Virtual Datacenters…this is being shown in real time during a 15 minute presentation at the road shows…and (minus any presenter related ID10T issues) shows that a scalable, flexible vDC based upon an initial set of defined compute, network and storage options can be delivered just minutes after clicking confirm.

Again, something that used to take a day or so to provision now takes minutes. The value proposition for any business thinking of moving their on-premises servers to a VMware vCloud vDC platform over other Public Clouds like AWS, Azure or Rackspace is the fact that it’s VMware end to end…and for most people that equates to a smoother migration/setup and a sense of familiarity in the hypervisor technology. It’s an exciting time to be in a position to help deliver these technologies to companies…they are ready and waiting for consumption!


PSOD Warning: IBM HS23 Blades, Emulex 10GbE Network Adaptors and ESXi 5.1

This is a quick informational post to warn anyone running an Emulex based 10GbE Converged Ethernet adapter in an IBM BladeCenter with HS23 Series Blade servers and ESXi 5.x … The fix is below.

You are at risk of simultaneous host failure via the VMware PSOD under unspecified conditions if you do not update to the latest combination FW from IBM (Emulex) and be2net driver from VMware.

The trigger is still unclear (there is some suggestion that it is linked to guest based operations in combination with DVS environments), but searching online for PSOD + ESX + Emulex returns many results, so it seems to be an issue that’s been around since release. It’s also not limited to IBM Blades…there are reports of this happening on HP/Emulex platforms…but this post is specific to the IBM HS23 Blades.

The below KB links suggest issues in FC or FCoE SAN management stacks, but in our situation we were running iSCSI software initiators with the following revision of Emulex FW and be2net driver. It must be noted that this combination had been stable for 7 months.

This Emulex KB suggests that the condition is triggered when frames sourced from the SAN management are aborted. Once the abort occurs the conditions for a PSOD exist. One particular example of an aborted frame is if the target does not respond to a request. For example the management application sends a read to a controller LUN but the controller LUN does not respond. The driver will then send out an abort for this particular read command.

VMware KB2031192: As of the 6th of May, VMware is suggesting that you update the Emulex FW to at least version 4.6.146.62. We struggled with IBM support to get access to an updated FW 4.6.166.9 (direct download to FW Deployment ISO here) and VMware support also suggested upgrading the be2net driver to at least 4.6.142.10 which was released on the 6th of June (direct download here). We later confirmed with VMware that running the driver version slightly behind the FW version was ok.
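As a rough illustration of the version check described above, here is a minimal Python sketch that compares a host’s firmware and be2net driver versions against the minimums quoted in this post. The function names are hypothetical, and how you collect the installed versions from each host is left out and assumed to happen elsewhere. The one thing worth noting is that dotted versions should be compared as numbers rather than as plain strings.

```python
# Minimum versions quoted in this post (firmware from VMware KB2031192,
# driver as suggested by VMware support).
MIN_FIRMWARE = "4.6.146.62"
MIN_DRIVER = "4.6.142.10"


def parse_version(text):
    """Turn a dotted version such as '4.6.146.62' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True when the installed version is at or above the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_host(firmware, driver):
    """Return the names of components that fall below the quoted minimums.

    'firmware' and 'driver' are version strings gathered from the host
    by whatever means you already use (hypothetical inputs).
    """
    behind = []
    if not meets_minimum(firmware, MIN_FIRMWARE):
        behind.append("firmware")
    if not meets_minimum(driver, MIN_DRIVER):
        behind.append("be2net driver")
    return behind
```

Calling check_host("4.6.166.9", "4.6.142.10"), the combination we ended up on, returns an empty list, meaning both components are at or above the suggested minimums.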

At this point in time our systems have been stable since applying the updates, but this is the second occurrence of the PSOD since the IBM Blade System went into production (within the last 7-8 months) and confidence in the platform has taken a hit. We, along with other users of this Emulex based platform, will be hoping this is the last round of issues with this card…I struggle to understand how such a serious issue can be allowed to manifest. But this combination seems problematic!

More Links:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2039912

http://emulex.force.com/knowledgebase/articles/Knowledgebase_Article/PSOD-Discovery-in-LightPulse-and-OneConnect-Adapters-on-ESX-ESXi-Using-SAN-or-Adapter-Management-Applications/

http://www.redbooks.ibm.com/abstracts/tips0828.html

https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014903697

http://communities.vmware.com/thread/422381?start=0&tstart=0

http://vstorage.wordpress.com/2010/04/25/ibm-bladecenter-virtual-fabric-solution-and-vsphere/