IO Determinism and Streams: Seagate Nytro Q-Boost NVMe™ Features Improve SSD Behavior

To keep up with today’s hyperscale, data center, cloud and enterprise workloads, the technology industry needs to be continually blazing trails and seeking solutions outside of the box. This is not only true for the technologies that we come in contact with in our day-to-day lives, such as social media, store purchases and the surge in media content, but also the infrastructure and hardware that act as the foundation to enable the success of these technologies.

Such is the case with Seagate and its industry leadership in developing advanced hard drives, SSDs and systems. Not only is Seagate deploying these storage technologies for solid performance, reliability, efficiency and lower TCO, it is also at the cutting edge of developing features that further advance their capabilities. One example is Seagate’s leading role in the NVMe™ (PCIe-attached) SSD capabilities developed with the NVM Express™ Technical Workgroup and its open specifications.

Seagate has grouped several of these up-and-coming features in the NVMe specification under the brand name Seagate Nytro® Q-Boost™. The first two features under this brand that have garnered a great deal of industry attention are Streams and IO Determinism. Seagate has played a key role within the NVM Express Technical Workgroup in defining the specifications and capabilities of these features, and has demonstrated their performance benefits in various proofs of concept (POCs). It is also important to highlight that while both features improve performance, they do so through different methods of placing data, logically or physically, on an SSD. Each improves a different aspect of performance, so the two can be used together to get the best of both performance worlds, so to speak.

Nytro Q-Boost feature: Streams to keep your data inline … literally

Writing data to SSDs is not a cut-and-dried process; several steps are involved. An SSD cannot overwrite data that is no longer required: the existing data must be erased before new data can be written. Because incoming workloads are scattered across the SSD, unusable and older data ends up spread over multiple blocks. To reclaim such a block, the SSD must first collect the usable data, which is not intended for erasure, merge it, write it to another block and then (finally!) erase the previous block. The end result is that the erased block is free of data and available for new writes. This process is called “garbage collection”: the SSD reclaims the physical locations that no longer hold usable data.
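To make these steps concrete, here is a minimal Python sketch of a single garbage collection pass. It is a toy model for illustration only and does not describe any Seagate firmware: the Block class, the garbage_collect function and the four-page block are all assumptions of mine.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Each page is a (data, is_valid) pair; stale pages have is_valid == False.
    pages: list = field(default_factory=list)


def garbage_collect(victim, spare):
    """Copy the still-valid pages of `victim` into `spare`, then erase `victim`.

    Returns the number of pages the SSD had to copy internally.
    """
    survivors = [(data, True) for data, valid in victim.pages if valid]
    spare.pages.extend(survivors)   # merge usable data into another block
    victim.pages.clear()            # erase the whole block in one operation
    return len(survivors)


# Hypothetical block: pages "b" and "d" are stale, "a" and "c" are still needed.
victim = Block([("a", True), ("b", False), ("c", True), ("d", False)])
spare = Block()
copied = garbage_collect(victim, spare)
print(copied, [data for data, _ in spare.pages], victim.pages)
```

In this example two of the four pages were still valid, so the SSD had to perform two extra internal writes before it could erase the block. Those extra writes are the overhead the rest of this section is concerned with.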

As you can probably conclude, this can take a toll on performance. This is where Streams come into play. This feature is a logical placement of data, meaning the data is not assigned to a specific physical location, but is stored and organized in a methodical way across the SSD. With Streams, the host identifies workloads that contain “like” data that is very similar in nature, bundles this data together in a Stream and writes it to the SSD. This is done via Stream identifiers on IO write commands, which in turn serve as hints to identify the related group of data. Because the grouped data is written together, it can be erased together. This minimizes the performance impact of garbage collection, since the operation affects only a specific Stream rather than all of them, resulting in reduced write amplification as well as efficient flash utilization, both of which improve performance and lower TCO. In fact, the Seagate POC mentioned previously showed at least a twofold improvement in write amplification!
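The effect of grouping like data can be sketched with a small Python model. Everything in it is a hypothetical illustration: the names fill_blocks, gc_copies and write_amplification, the four-page blocks and the hot/cold workload are my assumptions, and the resulting numbers are not the Seagate POC results. Write amplification is computed here as (host writes + garbage collection copies) / host writes.

```python
PAGES_PER_BLOCK = 4  # assumed block size for the toy model


def fill_blocks(writes, use_streams):
    """Place (stream_id, page) writes into blocks of PAGES_PER_BLOCK pages.

    With use_streams, each stream id fills its own blocks; without it,
    every write lands in the same shared sequence of blocks.
    """
    open_blocks = {}  # stream key -> block currently being filled
    blocks = []
    for stream_id, page in writes:
        key = stream_id if use_streams else 0
        block = open_blocks.get(key)
        if block is None or len(block) == PAGES_PER_BLOCK:
            block = []
            blocks.append(block)
            open_blocks[key] = block
        block.append(page)
    return blocks


def gc_copies(blocks, stale):
    """Count valid pages that must be relocated to erase each block holding stale data."""
    copies = 0
    for block in blocks:
        if any(page in stale for page in block):
            copies += sum(1 for page in block if page not in stale)
    return copies


def write_amplification(writes, stale, use_streams):
    blocks = fill_blocks(writes, use_streams)
    return (len(writes) + gc_copies(blocks, stale)) / len(writes)


# Hypothetical workload: short-lived "hot" pages interleaved with long-lived "cold" pages.
writes = []
for i in range(8):
    writes.append(("hot", f"h{i}"))
    writes.append(("cold", f"c{i}"))
stale = {f"h{i}" for i in range(8)}  # all hot pages have since become stale

mixed = write_amplification(writes, stale, use_streams=False)
streamed = write_amplification(writes, stale, use_streams=True)
print(mixed, streamed)  # 1.5 1.0
```

Without stream hints, every block ends up half stale and half valid, so reclaiming it means copying the valid half. With hints, the hot blocks are entirely stale and can be erased with no copying at all, while the cold blocks never need to be touched.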

Nytro Q-Boost feature: IO Determinism removes the “noisy neighbor” challenge

The second feature under the Nytro Q-Boost brand is IO Determinism. Without this feature, very different workloads, such as the random and sequential workloads that are mixed in application environments, can be written anywhere on the SSD, and any background task associated with the SSD can negatively affect the latency of the SSD and, in turn, the overall system. Some call it the “noisy neighbor problem,” where writes to the SSD interfere with the reads that are occurring, resulting in some reads returning to the server at a slower pace. I like to think of it in terms of a worker bee analogy: the bees are trying to complete their tasks, but their workspace is haphazard, with no order. With all of their designated tasks located at random locations, they run into each other and slow the overall process down.

With the IO Determinism feature in place, the SSD is carved into multiple smaller physical sub-SSDs, which allows each workload to run its IO processing only in a designated portion of the SSD. This parallel execution of IOs removes the previous conflict and overhead, as workloads no longer block one another. The Seagate POC, in this case, demonstrated an improvement of more than 5x in average read latency for higher performance, as well as improved maximum read latency for better Quality of Service (QoS). In addition, as the host becomes more intelligent, the SSD can send information back to the host that enables it to determine which segments of the drive are not being utilized and where data can be read with minimal latency. Ultimately, this will provide predictable and improved latency (99.9999%).
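A rough way to see why isolation helps is a toy queueing model in Python. It assumes, purely for illustration, that each region of the SSD serves one IO at a time in arrival order and that all IOs arrive at time zero. The function read_latencies, the tenant names and the service times are hypothetical and are not taken from the Seagate POC.

```python
def read_latencies(ops, partitioned):
    """Return the completion time of each read, in microseconds.

    ops: list of (tenant, kind, service_time_us), all arriving at time 0.
    partitioned=False models one shared SSD queue; partitioned=True gives
    each tenant its own sub-SSD with an independent queue.
    """
    busy_until = {}  # queue key -> time at which that queue becomes idle
    latencies = []
    for tenant, kind, service_time_us in ops:
        key = tenant if partitioned else "shared"
        finish = busy_until.get(key, 0) + service_time_us
        busy_until[key] = finish
        if kind == "read":
            latencies.append(finish)
    return latencies


# Hypothetical mix: slow writes from one tenant interleaved with fast reads from another.
ops = [
    ("writer", "write", 1000),
    ("reader", "read", 100),
    ("writer", "write", 1000),
    ("reader", "read", 100),
]

shared = read_latencies(ops, partitioned=False)
isolated = read_latencies(ops, partitioned=True)
print(shared, isolated)  # [1100, 2200] [100, 200]
```

In the shared case every read waits behind the writes ahead of it, which is the noisy neighbor effect. Once the reader has its own portion of the drive, its latency depends only on its own IOs, which is what makes it predictable.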

The current state of NVMe and Seagate

I encourage those interested in the current Seagate NVMe SSD offerings to check out Seagate’s NVMe SSD series. Working with other technology partners and confirming a high level of interoperability testing is also critical to NVMe deployment success as a turn-key solution. For example, the Nytro 5000 NVMe SSD Compatibility Report Summary is available on the Seagate support site, and we are adding new vendors constantly so you can be certain of your data center’s interoperability. And for more information on the features mentioned, I had the opportunity to present at one of Silicon Valley’s most influential conferences in March, the OCP Summit 2018. You can find the recording from the OCP Summit here.

The storage industry is quickly moving to NVMe, which makes the availability of an open and collaborative standard all the more critical. NVMe is not only serving high-end enterprise storage, but also client and mobile solutions, and is continuing to demonstrate that it is an extremely important technology to ensure industry success. It is also clear that these future features will allow the data center, especially the Cloud and Hyperscale, to manage SSD behavior to their benefit and I am happy to share that Seagate is at the forefront of this effort.


About the Author:

David Allen
David Allen is an NVMe board member, and is a Technologist in the Office of the CTO at Seagate Technology.