I listened to the excellent Infosmack podcast, a deep dive into blade servers vs rack servers. I guess it had the desired effect, as it really got me thinking. Not so much about the main objective of the podcast, comparing blades to rack mount servers, but about rack servers vs traditional blades vs Cisco UCS.
Over the last 6 months I have been neck deep in the Cisco UCS platform, working with both blades and rack mount servers. It struck me that many of the challenges raised by the panel are addressed by UCS.
Each topic I touch on below is probably a blog post in its own right, so I have only skimmed over them. My goal is to highlight that vendors are aware of these issues and are actively working to resolve them.
I'd like to point out that this post does not go into ‘what is UCS’; for that I recommend:
Life Cycle for Chassis:
Nigel raised a very real concern for server architects and engineers around the longevity of the blade environment. With traditional rack and even tower servers, replacing them with the latest and greatest was an easy task. However, when you introduce a blade environment, an element of longevity is built into the infrastructure. The blade chassis is a fixture that can last 2 or 3 times as long as the servers and I/O components it houses. So how do vendors get round this? Well, the answer is to make the chassis as basic as possible. With the Cisco UCS 5100 series chassis you get some power, front to back airflow and a midplane. This midplane can handle up to 1.2Tb of aggregate throughput (Ethernet). The midplane is both the single point of failure and the part that limits the life cycle of the chassis. All other parts are easy to upgrade or replace, but the midplane is built into the chassis and is not a quick fix should it fail or need to be upgraded.
The chassis midplane supports two 10-Gbps unified fabric connections per half slot for today’s server blades, with the ability to scale up to two 40-Gbps connections using future blades and fabric extenders. For this reason I’m fairly confident that the Cisco UCS 5100 series chassis will stay current for a significant amount of time.
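As a rough sanity check on those per-slot figures, the per-chassis totals can be worked out in a few lines of Python. This is only a sketch: the count of eight half-width slots per chassis is my assumption (the figures above are quoted per half slot), and the function name is made up for illustration.

```python
def chassis_fabric_gbps(half_slots: int, links_per_slot: int, gbps_per_link: int) -> int:
    """Aggregate blade-facing fabric bandwidth for one chassis, in Gbps."""
    return half_slots * links_per_slot * gbps_per_link


HALF_SLOTS = 8  # assumption: eight half-width slots per chassis

# Two 10-Gbps connections per half slot today, two 40-Gbps in the future.
today = chassis_fabric_gbps(HALF_SLOTS, links_per_slot=2, gbps_per_link=10)
future = chassis_fabric_gbps(HALF_SLOTS, links_per_slot=2, gbps_per_link=40)

print(f"today: {today} Gbps, future: {future} Gbps")
```

Under that slot-count assumption, even the future per-slot figure stays below the quoted 1.2Tb midplane capacity, which is the headroom the longevity argument relies on.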
For me all of this shows that, while blades add some complexity, the concern over longevity and upgrades should be minimal. This is especially true as UCS is easily managed and has only 1 non hot-swappable unit (the chassis midplane).
NOTE: traditional blade chassis have built-in management modules and similar components. These add a further point of ageing that UCS blades avoid.
As I touched on above, the midplane is a single point of failure in the 5100 series chassis. Within the chassis it is the only component whose failure would result in the loss of the whole chassis. You could ask yourself ‘why would you put all your eggs in one basket?’ For me the risk is very small. However, no self-respecting architect or engineer would recommend putting this kind of risk into a production datacenter. This is when you would recommend multiple blade chassis. Now we get some testing questions:
Am I actually saving on rack space?
Have I increased my failure risk %?
Have I added additional cost and complexity?
There will be plenty of times when the answers to those questions leave little choice but to go for rack mount servers. But for me this will only be the case for small businesses or small projects. When consolidating racks of servers or considering cloud based architectures, blades make a lot more sense.
Big, big issue this one. Just like when virtualisation first came round, the question of ownership causes issues. With virtualisation it became obvious (or evolved) that a new role was necessary. This was possible because virtualisation only reaches into each discipline to a certain level, leaving the SAN, Network and Compute teams segregated. UCS now muddies the water because the platform encroaches further into the network engineer's realm than ever before (especially when you add the Nexus 1000v into the mix). The same goes for the SAN and Compute.
This is not solved with UCS; if anything it can be exacerbated. However, careful planning and understanding can go a long way towards improving the existing relationships between the disciplines.
Rack Space & Co-Location
It's hard to beat blades when it comes to space in a rack. When you look at a Cisco B230 M2 and how powerful it is for a half width blade, it's fairly obvious that you can fill a rack with a very large amount of compute. Co-location of rack servers and blades is of course possible, but with UCS you can manage both from the UCSM console.
Where things get complicated with blades and rack space is power and air con. A common rack setup from a power point of view would be twin 16 amp feeds. This should be enough to fully populate a rack with chassis. However, you run into issues with air con and with staying within what are often relatively low BTU (British thermal unit) limits. I once worked on a datacenter that had plentiful electricity but could not take more than 1 fully populated chassis and 1 half height chassis in a single rack. Unless you have a brand new datacenter built with blades in mind, you are unlikely to be able to fit out a chassis environment (like the pic above).
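To make the power vs cooling trade-off concrete, here is a small Python sketch. Every number in it is a hypothetical placeholder (230 V supply, both feeds usable, 2,500 W per populated chassis, a 15,000 BTU/hr cooling allowance per rack) and not a measurement from any real datacenter or a Cisco specification. The only fixed fact is the conversion of roughly 3.412 BTU/hr per watt.

```python
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W of electrical load is roughly 3.412 BTU/hr of heat


def feed_watts(amps: float, volts: float) -> float:
    """Power available from a single feed, ignoring power factor and derating."""
    return amps * volts


def max_chassis(cooling_btu_hr: float, power_w: float, chassis_w: float) -> int:
    """Whole chassis that fit within BOTH the power and the cooling budget."""
    by_cooling = cooling_btu_hr / (chassis_w * WATTS_TO_BTU_PER_HOUR)
    by_power = power_w / chassis_w
    return int(min(by_cooling, by_power))


# Hypothetical inputs, for illustration only.
rack_power_w = 2 * feed_watts(amps=16, volts=230)  # twin 16 amp feeds
chassis_draw_w = 2500                              # assumed draw of one populated chassis
rack_cooling_btu_hr = 15000                        # assumed cooling allowance per rack

print(max_chassis(rack_cooling_btu_hr, rack_power_w, chassis_draw_w))
```

With these made-up inputs the cooling budget runs out before the power budget does, which is the same shape of problem as the datacenter described above.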
One big advantage with UCS is that both blades and rack servers can be managed in the same way through UCSM via the Fabric Interconnects. In a traditional rack server topology each server is a point of management, in traditional blades each chassis is the management point, and with UCS this is aggregated up another layer to the Cisco 6100 series fabric interconnects.
UCSM is a Linux based OS run from the ROM and delivered through a webserver hosted on the FI. This software is the coalface of UCS and allows for the centralised management of every chassis connected to the FI. Where this becomes appealing in a cloud environment is that it allows for a similar topology to VMware vCenter, in that it can be contacted through an API and all components connected to the FI are treated like objects. Compare this with traditional blade chassis, where each chassis is the management point, meaning an individual connection to each chassis. When you start looking at a lot of chassis this becomes a bottleneck and a logistical problem.
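The difference in management points is easy to put into numbers. The sketch below is purely illustrative: it assumes eight blades per chassis and that every chassis hangs off the same set of fabric interconnects, so there is a single UCSM instance. The function name and the server count are made up.

```python
import math


def management_points(servers: int, blades_per_chassis: int = 8) -> dict:
    """Count the places an admin has to connect to under each model."""
    chassis = math.ceil(servers / blades_per_chassis)
    return {
        "rack servers": servers,        # every server is its own management point
        "traditional blades": chassis,  # every chassis is a management point
        "UCS": 1,                       # assumption: one UCSM instance on the FIs
    }


for model, points in management_points(64).items():
    print(f"{model}: {points}")
```

The gap between the three models widens as the estate grows, which is why the aggregation matters more in larger environments.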
Are blades as dense as their rack equivalents? Well, the guys on the podcast discussed what is quite a common perception: that you can pack more compute into a rack server than into a blade server. The reality is more complex. There are still rack servers that can outgun the blades, but that number is falling. Then you have to look at things like how many U the rack server takes up, e.g. two 4U servers packed full of compute will probably lose out to 2 UCS chassis packed with RAM and the new Westmere Intel chips.
Cisco can also utilise their patented memory extension technology to increase the amount of memory without increasing the number of DIMM slots.
Obviously this will change from deployment to deployment, but in general, if you plan carefully, you can top trump the rack mount equivalent.
One point raised in the podcast was around local disk. Local disk has been minimised in virtual environments, generally just hosting the hypervisor. However, with VDI vendors utilising local disk I can see this being a potential issue with blades going forward. Having said that, companies like Atlantis Computing are working on running VDI desktops directly out of memory, and with memory density only set to get higher this is potentially a SAN-less environment (blog post pending on that).
The Virtual Interface Card (VIC) is a converged network adaptor (Ethernet & FC) that is designed with virtualisation in mind. VN-link technology enables policy-based virtual machine connectivity and mobility of network and security policy that is persistent throughout the virtual machine lifecycle, including VMware VMotion. It delivers dual 10-GE ports and dual fibre channel ports to the midplane.
So let's think of a fairly common example: 10 x Dell R710 2U servers, each with an additional quad-port PCI network card and an additional dual-port PCI HBA. Let's assume every PCI card port is in use.
(4 NIC ports + 2 HBA ports) x 10 servers = 40 Ethernet cables and 20 fibre cables. (This doesn’t include any management ports or the dual power supplies for each server.)
With traditional blade chassis this is reduced, as you can add switches internally to the chassis. However, this model will not always work and you may need to use pass-through modules, which will keep the number of Ethernet and fibre cables high.
With a UCS blade system this is reduced significantly with the introduction of FCoE. FCoE only operates between the chassis and the FIs, allowing for up to 40Gbps up to the FIs. There is a maximum of 8 twinax cables per chassis (4 per 2100 series fabric extender).
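The cable counts above can be reproduced with a short Python sketch. The rack server figures come straight from the example; the UCS side uses the maximum of 4 twinax links per fabric extender and 2 fabric extenders per chassis. That 10 blades would need two chassis is my assumption (eight half-width blades per chassis), and the function names are made up.

```python
def rack_server_cables(servers: int, nic_ports: int, hba_ports: int) -> dict:
    """Data cables for rack servers with every NIC and HBA port in use."""
    return {"ethernet": servers * nic_ports, "fibre": servers * hba_ports}


def ucs_twinax_cables(chassis: int, fex_per_chassis: int = 2, links_per_fex: int = 4) -> int:
    """Upper bound on twinax cables between the chassis and the fabric interconnects."""
    return chassis * fex_per_chassis * links_per_fex


rack = rack_server_cables(servers=10, nic_ports=4, hba_ports=2)
ucs = ucs_twinax_cables(chassis=2)  # assumption: 10 blades spread over two chassis

print(f"rack servers: {rack['ethernet']} Ethernet + {rack['fibre']} fibre")
print(f"UCS: at most {ucs} twinax")
```

As in the example, management ports and power cables are left out of both sides of the comparison.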
Nigel mentioned wireless as a possible alternative or future option. Personally I think this is not round the corner technology wise, but it is within the realms of possibility.
This is where Cisco UCS can fall down, because of the Fabric Interconnects. However, concerns over how much an empty chassis will impact rack server vs blade server capex are a little unfounded. Where things get expensive with Cisco UCS blades is with the fabric interconnects and the fabric extenders. Each can be purchased singly (i.e. a single FE per chassis and a single FI managing), however this introduces the single points of failure that we want to avoid.