The boys at CloudPhysics are working hard behind the scenes adding new features to their current stable of Analytic Cards, based on data collected from their Probe VAs hooked into vCenter environments.
Check out this post on their DataStore Contention Card:
For a general overview, go here:
I am a massive fan of analytics and trend metrics, and I use a number of systems to gain a wide view of the performance and health of our Hosting and Cloud Platform.
A few weeks ago, the CloudPhysics team released a Custom Card Designer to a limited number of users. This pretty much lets you construct custom cards from a huge number of metrics, presented via a builder wizard.
Cards you design and save are listed on the page above. From here you can view your custom cards and edit them if they require tweaking. Once you click the Create Card + button you are presented with a list of property data metrics from which to construct your card.
Properties fall under four main categories and there are a large number of available metrics under each category. The wizard lets you drag and drop items into the builder window. From there you can preview and then save your custom card for future use.
As a quick example, I needed a way to see which datastores were connected to their respective hosts in each cluster, so that consistency in datastore availability was maintained. It was as simple as dragging across Host:Name and Host:Datastore and adding a filter to only view hosts of a certain name, and it was ready to go.
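As a side note, you can pull a similar host-to-datastore view from the shell of an individual host…a rough sketch only (ESXi 5.x syntax, and unlike the card you would have to repeat it per host rather than query the whole inventory at once):

# List the VMFS volumes this ESXi 5.x host currently sees
esxcli storage filesystem list
# On older ESX 4.x hosts, the equivalent device-to-VMFS-volume mapping is:
esxcfg-scsidevs -m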
You have the option to preview and continue editing, or save back to the Card Designer main page. From that page you can execute the query. The results of my quick test card are shown below.
One thing I would like to see is an option to export the results to a CSV or Excel document…but other than that it’s a great example of what CloudPhysics is all about…data, and how to get the most out of it as efficiently as possible.
I’ve been waiting to deploy Project Octopus for the best part of 18 months… I’m still actively running the Octopus Beta for my personal use/internal testing, and it’s lived up to expectations for the most part. There have been a number of bugs identified and general limitations with the Beta release builds, but all in all it does the job. I was a little frustrated with the time to market for the initial GA of the product, and even more so when it was incorporated into the Horizon Suite of products. I feel VMware has missed a key part of the market, with Dropbox-like clones popping up everywhere of late.
Having just gone through my first deployment of the Horizon Workspace vApp (…and failed), coupled with the fact there isn’t much on the internet in terms of walkthroughs, I thought a blog post would be handy. This won’t be an HA scaled-out deployment as I only need to support 100-500 internal users for the moment, but the online docs do touch on Advanced Configuration tasks.
There is quite a bit to the deployment, so this post will only touch on the key points and any additional items the docs don’t cover clearly. While starting to write this post it became clear it would need to be a multi-parter…in this part I’ll go through the initial DNS configuration requirements, deploying the Horizon Workspace vApp, and the initial configuration wizard.
Initial Design Action Items:
Reading through the online docs, the key takeaway is that you need to get your DNS right…that is, allocate the vApp VM IP addresses and ensure the reverse records match up. You also need to think about the FQDN for internal and external access.
FQDN: xx.horizon.domain.com -> (split DNS employed relative to the vCenter/ESX environment to ensure internal and external access is achieved without the VMs having to route publicly)
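Before deploying, it’s worth sanity-checking the records from a machine on the network the vApp will live on…a quick sketch with hypothetical names and addresses (every forward lookup must resolve, and every reverse lookup must return the matching FQDN):

# Forward lookup for each appliance name (hypothetical name)
nslookup configurator-va.domain.com
# Reverse lookup for the IP allocated to it (hypothetical address)
nslookup 192.168.1.21
# If split DNS is in place, the Workspace FQDN should also resolve internally
nslookup xx.horizon.domain.com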
Caution: After you deploy, you cannot change the Horizon Workspace FQDN.
This was the mistake I made, which meant I had to redeploy the vApp and get the FQDN right. When it came time for me to publish the gateway-va externally, the external host name redirected to the FQDN specified during setup, which I had configured as an internal address.
Deploy The vApp:
Once you download and acquire the OVF from the VMware Download page, deploying the vApp is straightforward; however, one thing to point out is that you need to ensure you have a vCenter Datacenter IP Pool configured so that the vApp can correctly allocate IP/DNS settings to the VMs. The OVF deployment screen below warns you about that.
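As an aside, if you’d rather script the deployment than click through the vSphere Client, ovftool can push the vApp up as well…a rough sketch only, with hypothetical object names and paths (the Datacenter IP Pool still needs to exist either way):

# Deploy the Horizon Workspace vApp via ovftool (all names hypothetical)
ovftool --acceptAllEulas \
  --name=HorizonWorkspace \
  --datastore=DS01 \
  --network="VM Network" \
  --ipAllocationPolicy=transientPolicy \
  ./horizon-workspace.ovf \
  "vi://administrator@vcenter.domain.local/DC01/host/Cluster01"

The transientPolicy option tells vCenter to hand out addresses from the Datacenter IP Pool when the VAs power on.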
I had a previous IP Pool set up for my vCOps install, but there wasn’t a requirement to populate the DNS settings. That part is critical for this setup to be successful, as the vApp will use these settings to configure DNS on the VMs…without it, the initial configuration will fail due to a DNS lookup error when the configurator VA tries its first lookup against the VA IPs. You will need to restart the VA if any errors are detected.
Once the vApp has been deployed you should only have the configurator-va powered on (do not power on the other VAs). Log into the vCenter console for the configurator-va and go through the initial Configuration Wizard.
Once Enter is pressed the wizard kicks off and the DNS checks mentioned above are executed. You are then prompted to enter the root password for all VAs in the vApp (this also becomes your default login password). From there you enter your SMTP relay, Workspace FQDN and vCenter credentials.
From this point the wizard goes through and configures the remaining VAs, allocates the root password throughout the different systems and creates the SSL certificate services. This process can take 30-40 minutes depending on your underlying storage. Viewing the process through vCenter you can see a summary of what’s taking place…interestingly (similar to vCloud Director managed VMs), management of the VAs is taken over by the configurator-va, and through it all the wizard actions take place.
Once complete you are presented with the message below and you are ready to continue configuring Horizon Workspace from the configurator-va web console.
Part 2 will follow and run through setting up initial Horizon Workspaces users, groups, services and policies.
I was lucky to attend PEX at Australian Technology Park this week and thought I would share some of my takeaways. The venue was a little different to what you would come to expect from a tech event in Sydney… usually we are in and around Darling Harbour at the Convention Centre… and even if there were whispers of VMware being late to book the event in the city, the surroundings of the old rail works in Redfern, refurbished and transformed into a spectacular centre for technology and innovation, were a fitting backdrop.
There is a fundamental shift happening in how we consume IT, and pretty much all leading technology vendors are in the process of embracing that change. After a few years of letting the dust settle, VMware have chosen three main pillars of focus:
Software Defined Datacenter
Hybrid Cloud
End User Computing
I’ve written about EUC and their Hybrid Cloud offerings in the past so I’m not going to focus on those in this post…but the one thing I will say is that VMware still have a solid understanding of where their partners sit in the ecosystem and still see them as central to their offerings… As a Service Provider guy working for a vCloud Powered provider there is some concern around the vHPC platform that will be deployed globally over the next few years… but we need to understand that there has to be something significant in the Public Cloud space in order to compete with AWS and Google…and maybe Microsoft’s Azure. AWS is a massive beast and will only be slowed by its own success…will it get too big and product heavy, and therefore lose focus on the basics? There has been evidence in recent weeks of increasing issues with instance performance due to capacity constraints.
With regards to the SDDC push…last year was the year of network virtualisation, but what excites me more at this point are the upcoming features around software-defined storage. There has been an explosion of software-based storage solutions coming onto the market over the past 18 months, and VMware have seen this as a key piece of the SDDC.
vVols and vSANs represent a massive shift in how vSphere/vCloud environments are architected and engineered. Storage is the biggest pain point for most providers, and traditional SANs might well have run their race. There is no doubt that storage arrays are still relevant, but with the new virtual SAN technology on the horizon, direct-attached storage will start to feature… where we previously had limitations around availability and redundancy, the introduction of technology that can take DAS and create a distributed virtual SAN across multiple hosts excites me.
Why tier and put performance on a device that’s removed from the compute resource? It’s logical to start bringing it back closer to the compute.
Not only do you solve the HA/DRS issue but, given the right choices in DAS/flash/embedded storage, there is potential to offer service levels based on low-latency/high-IOPS datastore design that takes away the common issues with shared LUNs presented as VMFS or NFS mounts for datastores. Traditional SANs can certainly still exist, and in fact will still be critical as lower-tier, high-volume storage options.
For a technical overview of VMware Distributed Storage check out Duncan Epping’s (@DuncanYB) post here: There is also a slightly dated VMware KB overview by Cormac Hogan (@VMwareStorage) that I have embedded below…note that it only covers the tech preview, but if it’s any indication of what’s coming later in the year, it can’t come soon enough.
Being able to control the max/min number of IOPS guaranteed to a VM/VMDK, similar to the way you can select IOPS performance on AWS instances, is worth the price of admission and solves a current limitation of vSphere, where you can only set max values to block out noisy neighbours.
To the vendors already pushing out solutions around storage virtualization: continue the great work…anything that sits on top of this technology and complements/improves/enhances it can only be a good thing.
It’s the year of storage virtualization…
During last week’s #APACVirtual Podcast (Episode 70 – Engineers Anonymous pt1 – Engineer2PreSales) the panelists (of which I was one) were discussing what it took to become a successful candidate in transitioning from a technical engineering role to a pre-sales/architecture role. It was universally agreed that passion is a much sought-after trait in those roles. Someone who is passionate about what they are doing can overcome almost any professional deficiency and succeed where others might fail. It was discussed that someone who is seen to be passionate is a more sought-after asset than someone who is simply technically brilliant.
I’m a passionate guy…those that know me would generally describe me as such. When I find something I love I tend to embrace it with all that I have and it becomes a driving force in life…I wear my heart on my sleeve in most aspects of life…be it family, playing cricket or work, and in each of those, passion manifests itself in different ways.
I’ve mulled over this post for about a week now…it’s been written and re-written a number of times as I try to best represent and explain passion and how it can contribute to a successful and rewarding career in IT. At the end of the day I can’t explain passion with any great level of verbal prowess…it’s too much of a basic raw emotion!
Passion is something you either have or don’t have…it’s a driving force that makes you strive to better yourself, and it fuels the fire within that drives you to succeed and excel in anything you attempt in life.
Passion has the ability to lay down the foundation of a lasting legacy…
I possess a driving force when it comes to my work…I truly believe in the technology I work with…when talking with colleagues and clients alike, I am always passionate in my evangelization of those products and technologies.
My current passion lies within Hosting and Cloud technologies, and I’m a big believer in what VMware is doing in the market at the moment. Previously I was (and still am, to a lesser extent) passionate about Hosted Exchange services and other Microsoft technologies…in that sense, the driver of passion can change depending on current circumstance. In my case, the agent of change was directly related to the way Microsoft started treating their partners…that, and I was consumed by the vSphere/ESX/vCloud virtualization stack and the power of transformational change it can offer clients…look no further than the EUC push for evidence of this change.
Not everyone possesses passion, and I see examples of people without passion every day…I can’t comprehend this…I can’t understand people who work without anything truly driving them…
One person with passion is better than forty people merely interested.
— E. M. Forster
Again, it’s almost impossible to represent what drives me…but I know I’d rather be passionate in life than not.
REMOVING DEAD PATHS IN ESX 4.1 (version 5 guidance here)
Very quick post in relation to a slightly sticky situation I found myself in this afternoon. I was decommissioning a service linked to a VM that had a number of VMDKs, one of which was located on a dedicated VMFS datastore…the guest OS also had a directly connected iSCSI LUN.
I chose to delete the LUNs first and then move up the stack, removing the VMFS and eventually the VM. In doing this I simply went to the SAN and deleted the disk and disk group resource straight up! (hence the pulled reference in the title) Little did I know that ESX would have a small fit when I attempted any sort of reconfiguration or management on the VM. The first sign of trouble was when I attempted to restart the VM and noticed that the task in vCenter wasn’t progressing. At that point my Nagios/OpsView service checks against the ESX host began to time out and I lost connectivity to the host in the vCenter console.
Restarting the ESX management agents wasn’t helping, and as this was very much a production host with production VMs on it, my first thought (an older way of thinking) of rebooting it wasn’t acceptable during core business/SLA hours. As knowledge and confidence builds with experience in and around ESX, I’ve come to use ESX(i) shell access more and more…so I jumped into SSH and had a look at what the vmkernel logs were saying.
Mar 11 17:55:55 esx03 vmkernel: 393:13:48:38.873 cpu8:4222)NMP: nmp_DeviceUpdatePathStates: Activated path "NULL" for NMP device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:55:55 esx03 vmkernel: 393:13:48:38.874 cpu12:4265)WARNING: vmw_psp_rr: psp_rrSelectPath: Could not select path for device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:55:56 esx03 vmkernel: 393:13:48:39.873 cpu11:4223)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device "naa.6782bcb00014ebe60000035e4de4314c".
So from the logs it was obvious the system was having major issues (re)connecting to the device I had just pulled out from under it. On the other hosts in the cluster the datastore was greyed out and I was unable to delete it from the Storage Config. A rescan of the HBAs removed the dead datastore from the storage list, so if I had still had vCenter access to this host a simple rescan should have sorted things out. Moving to the command line of the host in question, I ran the esxcfg-rescan command:
[root@esx03 log]# esxcfg-rescan vmhba39
Dead path vmhba39:C1:T0:L3 for device naa.6782bcb00014ebe60000035e4de4314c not removed. Device is in use by worlds:
World    # of Handles    Name
And at the same time, while tailing the vmkernel logs, I saw the following entries:
==> vmkernel <==
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.768 cpu13:4118)Vol3: 644: Could not open device 'naa.6782bcb00014ebe60000035e4de4314c:1' for volume open: I/O error
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.768 cpu13:4118)FSS: 735: Failed to get object f530 28 1 4de4a1f8 3002130c 21000ff6 5abda09b 0 0 0 0 0 0 0 :I/O error
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.768 cpu13:4118)WARNING: Fil3: 1987: Failed to reserve volume f530 28 1 4de4a1f8 3002130c 21000ff6 5abda09b 0 0 0 0 0 0 0
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.768 cpu13:4118)FSS: 735: Failed to get object f530 28 2 4de4a1f8 3002130c 21000ff6 5abda09b 4 1 0 0 0 0 0 :I/O error
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.769 cpu0:4096)VMNIX: VMKFS: 2561: status = -5
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.873 cpu9:45315)NMP: nmp_DeviceUpdatePathStates: Activated path "NULL" for NMP device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:56:16 esx03 vmkernel: 393:13:48:59.874 cpu15:4265)WARNING: NMP: nmpDeviceAttemptFailover: Retry world restore device "naa.6782bcb00014ebe60000035e4de4314c" - no more commands to retry
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)WARNING: vmw_psp_rr: psp_rrSelectPath: Could not select path for device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)WARNING: ScsiCore: 1399: Invalid sense buffer: error=0x0, valid=0x0, segment=0x0, key=0x2
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)WARNING: vmw_psp_rr: psp_rrSelectPath: Could not select path for device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "naa.6782bcb00014ebe60000035e4de4314c" due to Not found
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)ScsiDeviceIO: 1672: Command 0x1a to device "naa.6782bcb00014ebe60000035e4de4314c" failed H:0x1 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)WARNING: ScsiDeviceIO: 5172: READ CAPACITY on device "naa.6782bcb00014ebe60000035e4de4314c" from Plugin "NMP" failed. I/O error
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)Vol3: 644: Could not open device 'naa.6782bcb00014ebe60000035e4de4314c:1' for volume open: I/O error
Mar 11 17:56:16 esx03 vmkernel: 393:13:49:00.232 cpu15:4120)FSS: 3924: No FS driver claimed device 'naa.6782bcb00014ebe60000035e4de4314c:1': Not supported
Mar 11 17:57:18 esx03 vmkernel: 393:13:50:02.431 cpu15:40621)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate: Could not select path for device "naa.6782bcb00014ebe60000035e4de4314c".
Mar 11 17:57:18 esx03 vmkernel: 393:13:50:02.431 cpu15:40621)NMP: nmp_DeviceUpdatePathStates: Activated path "NULL" for NMP device "naa.6782bcb00014ebe60000035e4de4314c".
From tailing through those logs, the rescan basically detected that the path in question was in use (bound to a datastore where a VMDK was attached to a VM), reporting the “Device is in use by worlds” error. The errors also highlight the dead paths caused by me removing the LUN while it was in use.
The point at which the host went into a spin (as seen by the “Could not select path for device” entries in the vmkernel log) was when I attempted to power on the VM and the host (still thinking it had access to the VMDK) tried to access all of its disks.
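As an aside, on ESXi 5.x there is a quicker way to see exactly which worlds are holding a device open than reading the rescan output…a rough example reusing the device ID from the logs above:

# List the worlds with open handles on the device (ESXi 5.x syntax)
esxcli storage core device world list -d naa.6782bcb00014ebe60000035e4de4314c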
So lesson learnt. When decommissioning VMFS datastores, don’t pull the LUN out from under ESX…remove it gracefully from vSphere first, and then you are free to delete it on the SAN. I’ve sketched a rough order of operations below.
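For the record, a sketch of the graceful order of operations from the service console (ESX 4.1, hypothetical HBA name):

# 1. Migrate or remove any VMs/VMDKs on the datastore, then delete the
#    datastore from the vSphere Client (Configuration > Storage).
# 2. Confirm the host no longer maps a VMFS volume to the device:
esxcfg-scsidevs -m
# 3. Only now unpresent/delete the LUN on the SAN, then rescan each HBA:
esxcfg-rescan vmhba39
# 4. Check that no dead paths remain:
esxcfg-mpath -l | grep -i dead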