A wise man once told me that if there were a major car crash further up the highway, having a faster car would only get me to the accident quicker. Obvious, right? Not so, it seems, when that same wisdom is applied to the growing number of SAN infrastructures currently upgrading from 4Gbps to 8Gbps. ‘Faster means quicker, means better’ is the commonly heard sales pitch used to seduce vulnerable IT Directors who dream of ‘a guaranteed performance improvement that would solve the headache of their ever-slowing applications’. Sadly, for many of those who bit the 8Gbps apple, the significant improvement never came, and like a culprit with no shame the same voices returned, claiming that the fault lay with the outdated servers, HBAs and storage systems, which also now needed to be upgraded. So down the 8Gbps road they went, the upgrade now extending from the fabric all the way to the server platform, yet still no significant improvement came, and certainly not one that could justify such a heavy investment. Like any infrastructure, having no visibility into the SAN means that any unseen problems flagged by error statistics such as CRC errors, physical link errors, protocol errors, code violations and class 3 discards (i.e. the car crash) will remain, regardless of whether you get there at 4Gbps or 8Gbps. So how could such a simple concept be lost amongst the numerous 4Gbps to 8Gbps upgrades now taking place across the SAN stratosphere?
The main reason is that the 8Gbps standard offers several seemingly instant advantages. With roughly 800 MB per second of usable bandwidth in each direction, the immediate impression is that you can move twice as much data through the same single cable. Logic would then dictate that, with both SAN switches and storage systems sporting 8Gbps ports, you now have the freedom to double the number of hosts attached to a single storage port without fear of any performance impact. Logic would also conclude that the extra bandwidth would be a blessing in a virtual environment, where dozens of VMs scramble for a limited number of ports while blade servers struggle to find the physical space for their growing HBA demands. Couple this with prices nearing parity with their 4Gbps counterparts and such advantages become unavoidable choices for end users.
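To put that impression into numbers, here is a minimal back-of-the-envelope sketch in Python. The encoding-efficiency figure reflects 8b/10b line coding, while the fan-in ratios are purely hypothetical examples:

```python
# Rough arithmetic only, not a measurement tool.
# 4GFC and 8GFC signal at 4.25 and 8.5 Gbaud with 8b/10b encoding,
# so usable payload bandwidth is roughly line_rate * 0.8 / 8 bytes per second.

def usable_mb_per_s(line_rate_gbaud, encoding_efficiency=0.8):
    """Approximate one-way payload bandwidth of an FC link in MB/s."""
    return line_rate_gbaud * 1e9 * encoding_efficiency / 8 / 1e6

def per_host_share(link_mb_per_s, hosts):
    """Bandwidth each host gets if every host drives the shared port flat out."""
    return link_mb_per_s / hosts

for rate, hosts in [(4.25, 6), (8.5, 12)]:   # hypothetical fan-in ratios
    link = usable_mb_per_s(rate)
    print(f"{rate} Gbaud link, {hosts} hosts -> "
          f"{link:.0f} MB/s total, {per_host_share(link, hosts):.0f} MB/s per host")
```

Run the numbers and the per-host share comes out identical in both cases, roughly 70 MB/s, which is precisely why doubling the fan-in along with the link speed buys nothing unless the link really was the bottleneck in the first place.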
Indeed it’s the drive for ‘more throughput’ in this virtualisation era that has really kicked the 8Gbps juggernaut into top gear. In the pre-virtualisation world (which surprisingly wasn’t even that long ago, yet already seems like an aeon), the relationship between server, application, SAN and storage was straightforward and one-dimensional: a single host with one application would connect to a dual redundant SAN fabric that in turn would be mapped to a single LUN. Today everything has multiplied, with a single physical server hosting numerous virtual servers and applications, connected to several storage interfaces and numerous LUNs.
Solutions such as N_Port ID Virtualization (NPIV) and N_Port Virtualization (NPV) have gone even further by enabling the virtualization of host and switch ports. With NPIV, a single HBA port acting as an N_Port can register multiple WWPNs and be assigned multiple N_Port IDs. What was once just a single physical server can now house numerous virtual machines, each with its own port identity, which in turn allows them to be independently zoned or mapped to LUNs. On the switch side, NPV presents an edge switch to the rest of the fabric as if it were an NPIV host, so a SAN can be expanded rapidly without the burden of managing multiple domain IDs.
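As a purely illustrative model of the relationship NPIV creates, the sketch below shows one physical port carrying several independently zoned virtual identities. The class names, WWPNs and FCIDs are invented for the example and do not represent any vendor’s API:

```python
from dataclasses import dataclass, field

@dataclass
class VirtualPort:
    vm_name: str
    wwpn: str                      # per-VM WWPN registered via FDISC
    fcid: str                      # N_Port ID assigned by the fabric
    zones: list = field(default_factory=list)

@dataclass
class PhysicalNPort:
    hba: str
    base_wwpn: str
    virtual_ports: list = field(default_factory=list)

    def register_vm(self, vm_name, wwpn, fcid):
        """Each VM gets its own WWPN/FCID, so it can be zoned on its own."""
        vp = VirtualPort(vm_name, wwpn, fcid)
        self.virtual_ports.append(vp)
        return vp

port = PhysicalNPort(hba="hba0", base_wwpn="10:00:00:00:c9:aa:bb:01")
oracle = port.register_vm("vm-oracle", "20:01:00:00:c9:aa:bb:01", "0x010101")
mail = port.register_vm("vm-exchange", "20:02:00:00:c9:aa:bb:01", "0x010102")
oracle.zones.append("zone_oracle_lun7")    # zoned per VM, not per HBA
mail.zones.append("zone_exchange_lun3")
```

The point of the model is simply that zoning and LUN mapping now hang off the virtual identity rather than the physical HBA.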
So while the case for upgrading to 8Gbps looks compelling at the outset, further analysis shows that this isn’t necessarily so. Reality, not logic, shows that many of the aforementioned advantages rest on guesswork and assumptions. Moreover, and ironically, the rush to 8Gbps is actually creating more problems than previously existed in the data center, unbeknownst to the majority of end users because they have no sound way of monitoring what is happening in the SAN. To begin with, if we revisit FC bit rates and their steady climb from 2Gbps to 4Gbps and now 8Gbps, the consequence of each step is a proportionally shorter bit period. This shrunken window for every bit demands an even more robust physical infrastructure than before and is even more susceptible to potential errors - think Michael Schumacher driving his Ferrari at top speed on the same public road Morgan Freeman took Miss Daisy down.
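A rough calculation makes the shrinking bit period concrete. The line rates below are the nominal FC signalling rates; treat the output as back-of-the-envelope figures rather than transceiver specifications:

```python
# Approximate bit period at each nominal FC signalling rate.
line_rates_gbaud = {"2GFC": 2.125, "4GFC": 4.25, "8GFC": 8.5}

for name, rate in line_rates_gbaud.items():
    bit_period_ps = 1e12 / (rate * 1e9)   # picoseconds per transmitted bit
    print(f"{name}: {bit_period_ps:.0f} ps per bit")

# ~471 ps at 2GFC, ~235 ps at 4GFC, ~118 ps at 8GFC: each generation halves
# the window in which a bit must be cleanly detected, so the same jitter or
# dispersion consumes a proportionally larger slice of it.
```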
While you may not have had performance issues at 4Gbps, by upgrading to 8Gbps, with its tighter optical light budget, you instantly expose yourself to more bit-stream errors, higher bit-error rates and multiple retries, in other words delays, disruption and performance degradation for your mission-critical applications. Of course this isn’t always the case, but when FC cables are bent at 70 degrees or more, the quality of optical transceivers and in-line connectors is not upgraded, or small specks of dirt sit on the face of optical cable junctions, your environment suddenly becomes doubly susceptible to jitter and major errors on the SAN fabric. Factors which passed unnoticed at 4Gbps become significant performance degraders in the far less forgiving mould of 8Gbps.
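To put bit-error rates into perspective, here is a hedged back-of-the-envelope sketch. Fibre Channel links are expected to achieve a bit-error ratio of 10^-12 or better; the ‘degraded’ figure is an assumed value for a dirty or over-bent link, used purely to illustrate the scale of the difference:

```python
# Average time between bit errors on an 8GFC link at a given BER.
# The 1e-12 target is the commonly cited FC design goal; 1e-9 is an
# assumed figure for a degraded link, not a measured one.

def seconds_between_errors(line_rate_gbaud, ber):
    bits_per_second = line_rate_gbaud * 1e9
    return 1.0 / (bits_per_second * ber)

for label, ber in [("design target", 1e-12), ("degraded link (assumed)", 1e-9)]:
    print(f"8GFC at BER {ber:g} ({label}): one bit error roughly every "
          f"{seconds_between_errors(8.5, ber):.2f} s")
```

Every one of those errors typically costs a whole frame, a CRC discard and a SCSI retry, which is exactly where the delays and disruption described above come from.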
So as organizations upgrade to 8Gbps without taking these factors into consideration, we see endless troubleshooting and even HBA replacements, because current SRM tools offer no real insight into these transmission errors. Orange OM2 fiber-optic cables get swapped for aqua OM3 cables, and SFP transceivers for SFP+ transceivers, leaving administrators thinking they have solved the problem. Worst of all, such fire-fighting tactics often make the performance problems disappear only temporarily, before they return without any apparent explanation, rearing their ugly head like a persistent zombie from a horror flick that refuses to die.
Given recent revelations in the industry that SAN fabrics are over-provisioned on average by at least a factor of five, there is clearly little reason for most companies to upgrade to 8Gbps. When every application is allocated the bandwidth that only 5% of your applications actually need, going straight to 8Gbps leads to even poorer configuration and further waste. The scenario becomes more complicated still given that server virtualization has led administrators to over-provision their SAN infrastructure for fear of being unable to accommodate their bandwidth requirements. And with more SSDs being deployed in the majority of enterprise infrastructures, going up to 8Gbps seems a natural way of making the most of an expensive disk investment. The problem is that SSDs running over upgraded yet over-provisioned links that already suffer from jitter may show some performance improvement over their mechanical counterparts, but they are hardly running at optimum levels.
To solve such a dilemma and gain the true benefits of an 8Gbps upgrade, it’s important to have an instrument that captures both directions of every SCSI I/O transaction, from start to finish, on every link carrying your business-critical data. In a recent discussion with IBM’s DS8000 specialist Jens Wissenbach, it was agreed that deploying TAPs on all the key links within the data center is the only way to truly measure light levels, signal quality, throughput, latency and response times, as well as protocol violations. With such real-time visibility into your FC infrastructure, the administrator can quickly determine whether any application actually needs anything in excess of 4Gbps, or where the performance problems are really coming from, be it a bent cable, a speck of dirt or an outdated SFP.
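As a purely illustrative sketch of what capturing both directions of every SCSI exchange makes possible, the snippet below pairs each command frame with its status frame to derive a response time. The frame records and field names are invented for the example and do not reflect any product’s capture format:

```python
# Hypothetical, simplified frame records as a TAP-fed analyser might expose
# them: (timestamp_seconds, exchange_id, frame_type). Values are made up.
frames = [
    (0.000000, 0x1A2B, "SCSI_CMD_READ"),
    (0.000410, 0x1A2B, "FCP_DATA"),
    (0.000420, 0x1A2B, "SCSI_STATUS_GOOD"),
    (0.000015, 0x3C4D, "SCSI_CMD_WRITE"),
    (0.012300, 0x3C4D, "SCSI_STATUS_GOOD"),   # a slow exchange worth flagging
]

starts, ends = {}, {}
for ts, exchange, ftype in frames:
    if ftype.startswith("SCSI_CMD"):
        starts[exchange] = ts
    elif ftype.startswith("SCSI_STATUS"):
        ends[exchange] = ts

for exchange, start in starts.items():
    if exchange in ends:
        response_ms = (ends[exchange] - start) * 1000
        print(f"exchange {exchange:#06x}: response time {response_ms:.2f} ms")
```

Seeing per-exchange response times alongside link-level error counters is what lets an administrator tell a genuine bandwidth shortfall apart from a bent cable or a dirty connector.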
TAPs, such as those provided by Virtual Instruments, will soon be the natural replacement for patch panels across enterprise data centers. They could also be the tool that allows end users to provision their SAN links to properly accommodate their SSD and VMware requirements, without over-provisioning and without being blinded by performance degradation that lies beyond the scope of their SRM tool. So, with Fibre Channel vendors planning to roll out 16Gbps products next year and the standard for 32Gbps Fibre Channel already being worked on, it’s imperative that such upgrades take place with the correct preparation, so as to maximize the benefits of the investment.