Sunday, January 20, 2019

Supermicro IPMI - Redux

The X10 IPMI support on my new servers is great - no more Java!
HTML5 virtual console for the win; that, plus the HTML5 vSphere Client and (increasingly) NSX, means the days of needing Java or Flash are numbered.
  I did have one hiccup though; right when I thought my cluster was ready to go I tested remote access, as I wanted to secure it with an ACL in addition to a non-standard username & strong password.
  I'd used the IPMI virtual CD-ROM to install ESXi onto these, so I was surprised to find I couldn't access two of the four anymore.  After many reboots, trying both static and DHCP, I concluded something had become wedged in the firmware: the ports showed as up on the switches, and frames were being sent to them, but the inbound counters were all zeros.
  My theory is that there's a bug in the bit of code that decides between the dedicated IPMI LAN interface and sharing LAN1.  This is the default 'failover' mode, where the BMC uses the dedicated port if it's detected as connected when power is applied, but once it has failed over to LAN1 it never recovers without a hard reset - which is a huge pain in my new 2U boxes with shared power for the two nodes, as the only way to power cycle just one is to physically pull it from the chassis, making my remote switched PDUs pointless.  The upshot: never apply power to these until your switches are fully booted - which in my case takes several minutes, so after an unattended power loss the BMCs would come up wedged again.  I did try putting the LAN1 ports on the lights-out VLAN, without any change, amongst many other experiments I was happy to do for curiosity's sake on my workbench at home, whereas in a datacenter I'd just want the boxes back up ASAP.
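  For what it's worth, if the BMC is still reachable in-band from the host OS, ipmitool's Supermicro OEM raw command can query and force the LAN interface mode; the bytes below are the ones commonly documented for X9/X10 boards, so treat them as a sketch and check them against Supermicro's documentation before relying on them.
'ipmitool raw 0x30 0x70 0x0c 0'
(query the current mode: 00 = dedicated, 01 = shared/LAN1, 02 = failover)
'ipmitool raw 0x30 0x70 0x0c 1 0'
(force dedicated mode)
'ipmitool mc reset cold'
(restart the BMC so the change takes effect)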
  Anyhow, I built a DOS boot USB key (this is useful) and put Supermicro's IPMICFG tool as well as the latest IPMI firmware on it (already a release newer than when I started setting these servers up in December).  After upgrading from 3.77 to 3.78 and setting a static IP again I was back in business, and once back in the web interface I changed their LAN mode from 'failover' to 'dedicated', which will hopefully prevent the issue from recurring.
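  For reference, the DOS-side sequence looks roughly like this; the IP addresses are placeholders and the IPMICFG switches can vary by version, so check the tool's built-in help rather than copying these verbatim.
'IPMICFG -m'
(show the current BMC IP and MAC)
'IPMICFG -dhcp off'
(disable DHCP)
'IPMICFG -m 192.168.10.21'
(set a static IP)
'IPMICFG -k 255.255.255.0'
(set the netmask)
'IPMICFG -g 192.168.10.1'
(set the gateway)
'IPMICFG -r'
(cold reset the BMC so the new settings take effect)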


Postscript -
Managed to screw up another one by upgrading it to the current release, after which it never came back from the reboot.  Querying it with the Linux command-line tools just gave errors.
The AlUpdate tool was able to re-flash it, after which it worked - but be warned, this needs a hard power cycle, which would have been a real problem if the box had been off in a colo somewhere.
'AlUpdate -f REDFISH_X10_380.bin -kcs -r n'
(Update via KCS channel without preserving config)
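A quick sanity check from the Linux side after a flash like this (assuming the ipmi_si and ipmi_devintf kernel modules are loaded) is something along these lines:
'ipmitool mc info'
(the firmware revision should now report cleanly instead of erroring)
'ipmitool lan print 1'
(shows what the LAN settings ended up as, since flashing without preserving config puts them back to defaults)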

Monday, January 14, 2019

Homelab refresh

Finally replacing my homelab, for two reasons: consisting of three hosts from 2010 it was ancient, and on top of that I lost a drive and my vSAN blew up.

vSphere has finally pulled out the x86 instruction emulation code that allowed really old CPUs to work, so while 6.7U1 ran on my L5630 CPUs I couldn't do a clean install (I'd have had to install 6.5 and upgrade), and nested virtualization was becoming limited by the same thing.  Nesting is kind of my killer app for a homelab; upgrading the hosts themselves is OK, but not being able to instantly spin up a >6.5 nested host was a pain.

I didn't understand vSAN :)  I'd been running it a long time on unsupported everything (controllers, drives, NICs, you name it), and my early mistakes couldn't easily be fixed: if I tried to reconfigure anything on the fly I got error messages rather than actions.  With money I could have fixed it - by replacing the controllers and buying enough disks for an additional disk group and migrating, or doing something ugly like moving data onto a USB drive or a 2-bay NAS... it didn't come to that anyhow, as I lost so much data there was little point in saving any.

The critical thing I hadn't understood was that erasure coding needs a minimum of four hosts (RAID-5 places three data components plus one parity component on separate hosts), and more if you want to do maintenance, so turning it on in my three-host cluster was not smart.  One of my SSDs failed and about half my VMs went with it, as they must have had blocks on that disk group that couldn't be rebuilt from elsewhere.  I daresay I could have recovered many of them, but in the lab nothing was critical enough to bother; my greatest pang is for my trusty Windows 7 admin VM... I have been way too cowboy in my lab, which was fine a decade ago, when it was a fraction of the size, local to me, and using NFS storage.  These days, when I blow it up with a pre-release build that I then find can't be upgraded, or by turning on features for fun before I understand the consequences, it's a huge effort to recover.  Nested labs make a lot of sense now; I used to almost enjoy the installation pains of the VMware suite (OK, enjoy is overstating it, but I did derive some masochistic pleasure from it and revel in being an expert), and many of those pains have (finally) been reduced to the point there's no learning in that stage of things.
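If you end up in a similar spot, the command-line health checks make the damage visible pretty quickly; these are from the 6.7-era esxcli vsan namespace, so verify the exact syntax against your build.
'esxcli vsan debug object health summary get'
(counts objects by health state, e.g. reduced availability versus inaccessible)
'esxcli vsan debug object list'
(per-object component placement, which shows which disk group held the pieces that are gone)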

As with the old cluster I got somewhat carried away building a new one; I similarly received some cast-off gear for free and supplemented it from eBay and by stripping my old systems (only reusing the SSDs). I wanted to grow to four nodes without taking up more space, so when I was gifted a 2015-vintage Supermicro Twin I was happy to purchase a second in order to end up with four identical hosts in 4U of space (replacing 3 X 2U boxes).  This particular model has a SAS controller onboard so I can live with the 2 PCIe slots, reusing my Intel Optane 900p NVMe drives* in one for the vSAN cache layer, and installing new Intel X710 10 gig NICs in the other.  (If I'd had a third slot I would've reused my X520s in order to have NICs to pass through to VMs when playing with NSX-T etc.)
The build took a long time, as I wanted the firmware on the BIOS, IPMI, SAS controller (now in HBA mode) and NICs to all be current - all of which takes a lot of power cycles and messing about; I do see why people purchase vSAN ReadyNodes.  These boxes, 2028TP-DC0R, support current Xeons; I'm using E5-2650L v3s, which are not very recent, but cost and power effective and, importantly, Haswell series, so good for some time to come.
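At least the version checking can be done from the ESXi shell rather than via yet another reboot; these are standard esxcli calls, with vmnic0 as a stand-in device name.
'esxcli network nic get -n vmnic0'
(reports the NIC driver and firmware version)
'esxcli storage core adapter list'
(lists the SAS HBA and the driver it's using)
'vmware -vl'
(the ESXi version and build itself)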

The X710s were the biggest time suck: I had two fail on me, and I'm not sure whether I was unlucky with static, or upgrading their firmware bricked them after a power loss, or something else.  I would've put the X520s in and been done with it, but I only had three and I really wanted the four nodes identical.
I also had second thoughts on RAM: having built out with 128GB per node, I decided longevity would be better served with 192GB per node.  VMware's stack loves RAM, and once I have a pretty complete SDDC running plus a few third-party integrations I'd be swapping.

I also turned Transparent Page Sharing back on, enabled nested virtualization, and - though I don't think any of my guest operating systems support it right now - enabled TRIM in vSAN.
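For reference, those three changes look roughly like this; the TPS salting setting and the per-VM vhv flag are standard, while the vSAN TRIM/UNMAP toggle was new in 6.7 U1 and is reportedly enabled from RVC, so double-check the current docs before copying it.
'esxcli system settings advanced set -o /Mem/ShareForceSalting -i 0'
(per host, re-enables inter-VM Transparent Page Sharing)
'vhv.enable = "TRUE"'
(per-VM .vmx entry, the same as ticking 'Expose hardware assisted virtualization to the guest OS')
'vsan.unmap_support <cluster> -e'
(run from RVC to enable guest TRIM/UNMAP on the vSAN cluster)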
I'm now a happy camper, building out a nested lab; the below shows the resources consumed by my management layer:



* The Optanes are awesomely fast, and have ridiculous endurance too for consumer drives; the 280GB units have 336GB inside, which supposedly isn't used for traditional over-provisioning, but they must use some of it to help deliver that longevity.  I figure that having the cache tier off the main controller also keeps that controller's queue free for destaging to my relatively slow consumer-grade SSDs.  (I had some enterprise SSDs at one point, but they also gave me my only SSD failures, out of warranty of course, whereas the Samsung Pro consumer drives have been issue free.)



Bill of materials:

2 X Supermicro 2028TP-DC0R Twin systems (four nodes) (3008 SAS controller onboard)
8 X Intel Xeon E5-2650L v3 1.8GHz 12 core
4 X Intel Optane 900P 280GB PCIe (cache drives, not on vSAN HCL)
4 X Supermicro 64GB SATA-DOM
4 X Intel X710-DA2 dual-port 10GbE SFP+
4 X Intel 1.6TB S3610 SAS SSD
4 X Samsung 850Pro SATA drives (not on vSAN HCL)
64 X 16GB ECC DDR4 DIMM
2 X HPE 5900 switches (48 X gigabit, 4 X 10GbE SFP+, 2 X QSFP+)

Total: about 20K, but spread over time with much eBay involved, so very approximate.

P.S. Were I doing this again I'd get the 2028TP-DC0TR, which is exactly the same but with Intel X540 ten gig NICs on board; the cost difference now is negligible.