It’s now common for me to visit customer environments running Fibre Channel SANs where various factions, be they the server, VM, network or Storage teams, complain they are suffering performance issues due to a lack of bandwidth or throughput. In every single instance FC utilization has actually been incredibly low, with peaks of 10% at most, and that's in 4 Gb/s environments, not 8 Gb/s! At worst there may be an extremely busy backup server that single-handedly causes bottlenecks and creates the impression that the whole infrastructure is saturated, but even these occasions are rare. What seems to be the cause of this misconception is the lack of clarity between what is deemed throughput and what actually causes bottlenecks and performance slowdowns, i.e. I/O latency.
Sadly (and I am the first to admit that I was also once duped), Storage folk have been hoodwinked into accepting metrics that just aren’t sufficient for their requirements. Much like the folklore and fables of Santa Claus told to children at Christmas, storage administrators, architects and engineers have been spun a yarn that MB/s and IOPS are somehow an accurate measure of performance and a sound basis for design decisions. In a world where application owners, server and VM admins are busily speaking the language of response times, Storage folk are engrossed in a foreign vocabulary that revolves around RAID levels, IOPS and MB/s, and then in numerous calculations to try and correlate the two languages. But what if an application owner could request Storage with a 10ms response time and the Storage Administrator could allocate it with a guarantee of that performance? That would entail the Storage engineer looking not just at a one-dimensional view from the back end of the Storage Array but at the comprehensive transaction time, i.e. from the Server through the Switch port to the LUN. That would mean considering the Exchange Completion Time.
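To make that idea concrete, here is a minimal Python sketch of how such a request could be validated: instead of quoting RAID levels or IOPS, the measured end-to-end response times are checked against the agreed 10ms target. The sample latencies, the function name and the choice of a 95th-percentile check are my own illustrative assumptions, not part of any vendor tool.

```python
import math

def meets_response_time_target(latencies_ms, target_ms=10.0, percentile=95):
    """Return True if the given percentile of response times is within the target."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the value below which ~percentile% of samples fall.
    rank = max(0, math.ceil(percentile / 100 * len(ordered)) - 1)
    return ordered[rank] <= target_ms

# Hypothetical per-transaction response times (server -> switch -> LUN and back), in ms.
measured = [2.1, 3.4, 2.8, 9.7, 4.2, 3.1, 12.5, 2.9, 3.3, 4.0]

print(meets_response_time_target(measured))  # False: the 95th percentile exceeds 10 ms
```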
To elaborate, using MB/s as a measurement of performance is almost akin to counting cars as a measurement of road traffic. Harking back to my days as a student, before all of the high-tech cameras and satellites that now monitor road traffic, I was ‘lucky’ enough to have a job counting the number of cars that went through Trafalgar Square at lunchtime. It was an easy job: I'd see five cars and I'd click five times. But this was hardly accurate, because when there was a traffic jam and all of the lanes were occupied I was still clicking the same five cars, even though they were barely moving. Herein lies the problem with relying on MB/s as a measurement of performance. As with the car counting, a more accurate approach would have been to watch each individual car and measure its time from origin to destination. In the same vein, to truly measure performance in a SAN Storage infrastructure you need to measure how long a transaction takes from being initiated by the host, to being received by the storage, to being acknowledged back to the host, in real time as opposed to averages. This is what is termed the Exchange Completion Time.
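As a rough illustration of the difference, here is a small Python sketch that derives a per-exchange completion time from the moment the host issues the command to the moment it sees the final status, and shows how an average understates the slow exchanges that actually hurt the application. The exchange IDs and timestamps are hypothetical; in practice they would come from a fabric-level capture or analyser, not from the array's back-end counters.

```python
# Hedged sketch: per-exchange completion time versus a misleading average.
# Timestamps are hypothetical, in seconds.

exchanges = [
    # (exchange id, host command initiated, final status seen by host)
    ("0x1a01", 100.0000, 100.0021),
    ("0x1a02", 100.0005, 100.0019),
    ("0x1a03", 100.0010, 100.0930),  # a slow exchange, ~92 ms
    ("0x1a04", 100.0012, 100.0031),
]

# Exchange Completion Time per exchange, in milliseconds.
completion_times_ms = {xid: (end - start) * 1000.0 for xid, start, end in exchanges}

for xid, ect in completion_times_ms.items():
    print(f"exchange {xid}: {ect:.1f} ms")

average = sum(completion_times_ms.values()) / len(completion_times_ms)
worst = max(completion_times_ms.values())
print(f"average ECT: {average:.1f} ms, worst ECT: {worst:.1f} ms")
# The per-exchange view exposes the ~92 ms outlier immediately; the mean
# understates it considerably, and a raw MB/s figure would show nothing at all.
```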
While many storage arrays have tools that provide information on IOPS and MB/s, to get a better picture of a SAN Storage environment and its underlying latency it's also key to consider the number of frames per second. In Fibre Channel a Frame is comparable to a word, a Sequence to a sentence and an Exchange to the conversation. A standard FC Frame has a data payload of 2112 bytes, i.e. roughly a 2K payload. So, for example, an application that issues an 8K I/O will require 4 FC Frames to carry that data portion. In this instance one I/O equates to 4 Frames, and 100 IOPS of the same size equates to 400 Frames per second. Hence, to get a true picture of utilization, looking at IOPS alone is not sufficient, because I/O sizes differ enormously between applications, ranging from 2K up to 256K, and with backup applications the I/O sizes can be larger still. It is therefore a mistake not to take the frames per second into consideration when trying to measure SAN performance or to identify whether data is being passed efficiently.

For example, even if you are witnessing high throughput in MB/s you may be missing the fact that frames are carrying a minimal data payload and the Exchange (the conversation) is failing to complete. This is often the case when there's a slow-drain device, a flapping SFP or similar in the FC SAN network, where instead of data frames making up the traffic you have a stream of management frames dealing with issues such as logins and logouts, loss of sync, or some other optic degradation or physical-layer problem. Imagine the scenario: a Storage Administrator is measuring the performance of their infrastructure, or troubleshooting a performance issue, and is seeing lots of traffic in MB/s – unaware that many of the environment's transactions are actually being cancelled across the Fabric!
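To put rough numbers on the frame arithmetic above, here is a short Python sketch, assuming the standard maximum data payload of 2112 bytes per frame and ignoring headers, CRC and any other protocol overhead, that converts an I/O size and an IOPS figure into data frames per second. The function names are my own; the 8K example matches the one in the text.

```python
import math

FC_MAX_DATA_PAYLOAD = 2112  # bytes of data payload in a standard FC frame (~2 KB)

def frames_per_io(io_size_bytes: int) -> int:
    """Number of data frames needed to carry one I/O of the given size."""
    return math.ceil(io_size_bytes / FC_MAX_DATA_PAYLOAD)

def frames_per_second(iops: int, io_size_bytes: int) -> int:
    """Data frames per second generated by a workload of the given IOPS and I/O size."""
    return iops * frames_per_io(io_size_bytes)

# The article's example: an 8K I/O needs 4 frames, so 100 IOPS of 8K is ~400 frames/sec.
print(frames_per_io(8 * 1024))            # 4
print(frames_per_second(100, 8 * 1024))   # 400

# The same IOPS figure with different I/O sizes produces very different frame rates.
for size_kb in (2, 8, 64, 256):
    print(f"{size_kb:>3} KB I/O @ 100 IOPS -> {frames_per_second(100, size_kb * 1024)} frames/sec")
```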
Monitoring your FC SAN Storage environment in a comprehensive manner that incorporates the SAN fabric and provides metrics such as the Exchange Completion Time rapidly changes FC SAN troubleshooting from a reactive to a proactive exercise. It also enables Server, Storage and Application administrators to share a common language of ‘response times’, eliminating potential silos. With knowledge of application I/O latency down to the millisecond, FC SAN Storage administrators can quickly be transformed from the initial point of blame into the initial point of resolution, while also ensuring optimum performance and availability of their mission-critical data.