Virtualization Is Life!
HomeLab - SuperMicro 5028D-TN4T Storage Driver Performance Issues and Fix

OK, I'll admit it: I've had serious lab withdrawals since having to give up the awesome Zettagrid Labs. Having a lab to tinker with goes hand in hand with being able to generate tech-related content. Case in point: my new homelab got delivered on Monday and I have been working to get things set up so that I can deploy my new NestedESXi lab environment. The issue that I came across was to do with storage performance and the native driver that comes bundled with ESXi 6.5.

With the release of vSphere 6.5 yesterday, the timing was perfect to install ESXi 6.5 and start to build my management VMs. I first noticed some issues when uploading the Windows 2016 ISO to the datastore, with the ISO taking about 30 minutes to upload. From there I created a new VM and installed Windows. This took about two hours to complete, which I knew was not what I had expected, especially with the datastore being a decent-class SSD.

By way of a quick intro (a longer first-impressions post will follow), I purchased a SuperMicro SYS-5028D-TN4T that I based off this TinkerTry Bundle, which has become a very popular system for vExpert homelabbers. It's got an Intel Xeon D-1541 CPU and I loaded it up with 128GB of RAM. The system comes with an embedded Lynx Point AHCI Controller that allows up to six SATA devices and is listed on the VMware Compatibility Guide for ESXi 6.5.

I created a new VM and kicked off a new install, but this time I opened ESXTOP to see what was going on. As you can see from the screenshots below, the kernel and disk write latencies were off the charts, topping 2000ms and 700-1000ms respectively. In throughput terms I was getting about 10-20MB/s when I should have been getting 400-500MB/s.

!(/images/2016/11/TNT4_perf_2.png)

ESXTOP was showing the VM with even worse write latency.

!(/images/2016/11/TNT4_perf_3.png)

I wondered whether I had bought a lemon of a storage controller, so I checked the queue depth of the card. It's listed with a QD of 31, which isn't horrible for a homelab, so my attention turned to the driver. Again referencing the VMware Compatibility Guide, the driver listed for the controller is ahci version 3.0-22vmw.

!(/images/2016/11/TNT4_perf_4.png)

I searched for the installed device driver modules and found that the one listed above was present. However, there was also a native VMware device driver.

 esxcli software vib list | grep ahci
sata-ahci 3.0-22vmw.650.0.0.4564106 VMW VMwareCertified 2016-11-16
vmw-ahci 1.0.0-32vmw.650.0.0.4564106 VMW VMwareCertified 2016-11-16
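Having both VIBs installed does not tell you which driver actually claimed the controller. One way to check is `esxcli storage core adapter list`, which prints a Driver column per HBA. The sketch below is only an illustration: the `adapter_drivers` helper is a made-up name, and the sample listing is hypothetical (it is not output captured from my host), so treat the exact column layout as an assumption to verify on your own system.

```shell
#!/bin/sh
# adapter_drivers: hypothetical helper that prints "<hba> <driver>" for each
# adapter found in `esxcli storage core adapter list` output read from stdin.
adapter_drivers() {
    # Skip the header row and the dashed separator row, then print the
    # first two whitespace-separated columns (HBA name and driver).
    awk 'NR > 2 && NF >= 2 { print $1, $2 }'
}

# Hypothetical sample listing, for illustration only (not taken from the post).
sample='HBA Name  Driver    Link State  UID          Capabilities  Description
--------  --------  ----------  -----------  ------------  -----------
vmhba0    vmw_ahci  link-n/a    sata.vmhba0                (0000:00:1f.2) Intel Corporation Lynx Point AHCI Controller'

printf '%s\n' "$sample" | adapter_drivers

# On a real host (assumption: run from the ESXi shell):
#   esxcli storage core adapter list | adapter_drivers
```

If the driver column shows `vmw_ahci`, the native driver is in use; if it shows `ahci`, the legacy driver from the sata-ahci VIB has claimed the controller.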

I confirmed that the storage controller was using the native VMware driver and went about disabling it as per this VMware KB (thanks to @fbuechsel, who pointed me in the right direction in the vExpert Slack Homelab channel), as shown below.

 esxcli system module set --enabled=false --module="vmw_ahci"
 esxcli system module list | more
Name Is Loaded Is Enabled
----------------------------- --------- ----------
vmkernel true true
chardevs true true
user true true
....

....
vmkapi_v2_1_0_0_vmkernel_shim true true
vmkusb true true
igbn true true
vmw_ahci true false
iscsi_trans true true
iscsi_trans_compat_shim true true
vmkapi_v2_2_0_0_iscsiInc_shim true true
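Note the `true false` pair for vmw_ahci in the listing above: the module is still loaded in the running kernel but is disabled for the next boot, so the change only takes effect after a reboot. As a rough sketch, the check can be scripted as follows. The `module_state` helper is a hypothetical name, and the sample lines are copied from the listing above.

```shell
#!/bin/sh
# module_state: hypothetical helper that reports the "Is Loaded" and
# "Is Enabled" columns for one module, given `esxcli system module list`
# output on stdin. Exits non-zero if the module is not in the listing.
module_state() {
    awk -v mod="$1" '
        $1 == mod { print "loaded=" $2, "enabled=" $3; found = 1 }
        END { if (!found) exit 1 }
    '
}

# Sample lines copied from the listing above.
sample='Name Is Loaded Is Enabled
----------------------------- --------- ----------
vmkusb true true
igbn true true
vmw_ahci true false
iscsi_trans true true'

printf '%s\n' "$sample" | module_state vmw_ahci

# On a real host (run from the ESXi shell):
#   esxcli system module list | module_state vmw_ahci
#
# To revert the change, re-enable the native driver and reboot:
#   esxcli system module set --enabled=true --module="vmw_ahci"
```

Keeping the revert command handy is worthwhile, since disabling the native driver may not suit every system.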

After the host rebooted I checked to see if the storage controller was using the device driver listed in the compatibility guide. As you can see below, not only was it using that driver, but it was now showing the six HBA ports as opposed to just the one seen in the first snippet above.

!(/images/2016/11/TNT4_perf_9.png)

I once again created a new VM and installed Windows, and this time the install completed in a little under five minutes! Quite a difference! Upon running CrystalDiskMark I was now getting the expected speeds from the SSDs, and things are moving along quite nicely.

Hopefully this post saves some time for anyone else who might buy this or other SuperMicro SuperServers, so that they don't get caught out by poor storage performance caused by the native VMware driver packaged with ESXi 6.5.

References:
http://www.supermicro.com/products/system/midtower/5028/SYS-5028D-TN4T.cfm
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2044993

18 Comments (archived)

  1. Xavi
    Hi, thanks for the info!! I had the same problem with an old NUC. After disabling vmw_ahci I see the six HBA ports again and the speed is back to normal. The controller model is a "Panther Point AHCI Controller".
  2. GregS
    Thank you! I had the same issue on a Supermicro 5018A-FTN4 (Atom C2758 CPU). I had been tearing my hair out all day wondering why the disk performance had deteriorated. I was also seeing lost datastore errors on my two local disks. I checked the compatibility guide and it is the same driver for this system as you have above. I disabled the VMware driver and I seem to be back in business.
  3. Paul Braren
    Anthony, thank you so much for your careful documentation of this issue, which I'm also tracking carefully. As you know, it turns out it doesn't seem to affect everybody with Xeon D.
  4. Owen
    Reporting that the fix here also resolved extremely high latency (using an SSD) on my Skull Canyon NUC with the Sunrise Point-H AHCI Controller (http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=io&productid=42126&deviceCategory=io&details=1&partner=46&releases=338&keyword=sunrise&page=2&display_interval=10&sortColumn=Partner&sortOrder=Asc)
  5. frankbrix
    Thanks for sharing. I had the same problem on my Intel NUC 5th gen.
  6. Wes
    Thanks for sharing indeed. The problem CAN go all the way back to Sandy Bridge and the Cougar Point SATA-AHCI controller, as I can attest. Couldn't get more than 1MB/s out of an Intel 730 SSD 480GB, which had been speedy under 6.0U2.
  7. NeededHelp
    Very good detective work!! Worked like a charm for me. can't thank you enough.
  8. briandm81
    And that did the trick! You rock! Thank you so much!
  9. briandm81
    And I spoke too soon. This works fine on one of my systems. But on my main box, when I disable that driver, all hell breaks loose. I can't even reboot without a hard reset. What am I missing?
  10. Nick Moody
    Anthony, I can't thank you enough for this post! It fixed the issue on my Lenovo TS140 after an install of ESXi 6.5 and enabled me to go to bed at a sensible hour :-)
  11. John
    It looks like the content between the lessthan and greaterthan characters was stripped.
    1. Anthony Spiteri
      Great to hear...and pleased the post helped you get that performance back!
  12. MrLight
    Drivers released on 3/14/2017 do not seem to help either:
  13. Gabe G. (@1GabeG)
    For those of you who were helped by this fix, what are you seeing in terms of SSD performance?
  14. Andy
    Thank you, thank you. I was getting only 2Mbps upload speed and disabling the AHCI module helped a lot. Now I'm back to averaging around 900Mbps when I'm uploading files to the datastore.
    1. Anthony Spiteri
      No worries...glad to have helped. It can be a frustrating thing to work through.
  15. APR911
    Although not the same issue I was experiencing, this article was hugely helpful! Thanks for the write-up!
  16. Antony Sysadmin
    This also fixed a problem for me where my new SSD was unreadable as a datastore in ESXi. The hardware is a Supermicro 5018A-FTN4 and the SSD is an off-the-shelf model. Before, I was unable to format the drive as a datastore and could not write to it from the command line (even dd failed). After following your suggestion to disable the driver it is now working fine, so thanks very much.
  17. mike
    I have an Asus H110I-Plus with an Intel SSD S3520 150G and had the same problem! Got rid of it with your article! Thanks very much!!! Helped!
  18. forinsc
    Thanks for your article! I tried replacing the driver and it works, but it didn't fix my problem. I have two SSDs in my system, and every normal power-off increases the "Unexpected Power Loss Count" in the SSDs' S.M.A.R.T. data. My hardware (a Supermicro X11SAE-M with the Intel C236 chipset and Samsung SM863 SSDs) is officially supported, so I'm really confused. I wonder if every ESXi 6.5 U1 system has this problem. Any suggestions? By the way, the HBA controller VID/DID is 8086/a102. Thank you sincerely!