Seagate and other key members of the Open Compute Project (OCP) organization are working to define a new standard that makes it easier to manage drive health in hyperscale data centers.
Customers need storage systems that allow automated management of large populations of storage devices. Progress is impeded by a lack of industry standards in two key areas: drive health status and storage supply chain data.
Working group goal: a common API to manage drive health in data centers
Seagate and other OCP leaders are forming an initial working group to understand the industry’s needs and build consensus around a common API for managing drive health. The group’s work is complementary to that of existing industry committees such as T10 and the Storage Networking Industry Association.
“Our goal is to bring our customers together who have similar problems but might be tackling them differently,” said Jun Liu, Principal Cloud Architect at Seagate, who recently presented a framework at the OCP Storage Committee meeting (download that presentation from OpenCompute.org now).
The working group aims to bring users and vendors together to build consensus on technology that can diagnose drive issues in the field and accurately assess and act on drive health conditions. Its focus is to standardize the building blocks so that different enterprises can build their own solutions. Specifically, the group will work to standardize supply chain data, standardize APIs for drive health data, and link supply chain and data center information in a common structure.
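To make the idea of a common structure concrete, here is a minimal Python sketch of how supply chain data and drive health data might be linked by drive serial number. The working group has not published a schema, so every class, field, and function name below is a hypothetical illustration rather than part of any OCP proposal.

```python
from dataclasses import dataclass
from enum import Enum


class HealthStatus(Enum):
    # Hypothetical vendor-neutral states; not defined by any OCP document.
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    FAILED = "failed"


@dataclass(frozen=True)
class SupplyChainRecord:
    # Assumed supply chain fields, keyed by drive serial number.
    serial_number: str
    vendor: str
    model: str
    manufacture_date: str  # ISO 8601 date string


@dataclass(frozen=True)
class HealthReport:
    # Assumed health fields reported from the data center.
    serial_number: str
    status: HealthStatus
    power_on_hours: int


@dataclass(frozen=True)
class DriveRecord:
    # The "common structure": one drive's origin plus its current health.
    supply_chain: SupplyChainRecord
    health: HealthReport


def link_records(supply_chain, health_reports):
    """Join health reports to supply chain records by serial number.

    Reports with no matching supply chain record are skipped.
    """
    by_serial = {record.serial_number: record for record in supply_chain}
    linked = []
    for report in health_reports:
        origin = by_serial.get(report.serial_number)
        if origin is not None:
            linked.append(DriveRecord(origin, report))
    return linked
```

With records in this shape, an operator could ask questions that span both data sets, such as which manufacturing batches account for the most degraded drives, without a manual data import.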
“For hyper-scale data centers, the manual process for checking device health in the field is getting impractical due to sheer volume of storage devices,” said Liu. “Soon IT professionals in the field won’t be able to do full diagnostics and in-situ repair based upon current monitoring, analysis, and reporting tools that provide limited raw data about local device attributes.”
Data center device repair, device re-purposing, and safe decommissioning
“Meanwhile, vendors have very advanced vendor-specific features that users can’t always adopt due to the lack of a common drive-health platform,” Liu continued. “By standardizing supply chain data to incorporate health-chain data, OCP believes we can make it possible to update data center operations, moving away from manual data imports to automated systems, improving productivity and avoiding human error.”
A set of well-defined standard APIs will not only enable efficient monitoring of device health conditions, but also simplify other data center operations like field device repair, device re-purposing, and safe decommissioning.
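As a rough illustration of how a standard health status could drive these operations automatically, the Python sketch below maps a drive's status to a follow-up action. The status values, the `repairable` flag, and the policy itself are assumptions made for this example; a real data center would apply its own rules.

```python
from enum import Enum


class HealthStatus(Enum):
    # Hypothetical vendor-neutral states; not defined by any OCP document.
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    FAILED = "failed"


class Action(Enum):
    # The operations named in the article, plus a no-op for healthy drives.
    KEEP_IN_SERVICE = "keep_in_service"
    REPAIR_IN_FIELD = "repair_in_field"
    REPURPOSE = "repurpose"
    DECOMMISSION = "decommission"


def choose_action(status, repairable):
    """Pick a follow-up action from a drive's health status.

    Illustrative policy only: repair when possible, otherwise
    re-purpose a degraded drive and decommission a failed one.
    """
    if status is HealthStatus.HEALTHY:
        return Action.KEEP_IN_SERVICE
    if repairable:
        return Action.REPAIR_IN_FIELD
    if status is HealthStatus.DEGRADED:
        return Action.REPURPOSE
    return Action.DECOMMISSION
```

Because the decision depends only on standardized fields, the same policy code could run unchanged across drives from different vendors.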
The “Health Chain Management” proposal will complement existing industry forums and tools such as SMART (Self-Monitoring, Analysis and Reporting Technology), which detects and reports on various indicators of drive reliability. The proposal would extend and enhance today’s functionality through a common platform for holistic storage device health management.
OCP is a global community of technology leaders with a goal of building the most efficient computing infrastructures, working together to break open the black box of proprietary IT infrastructure to achieve greater choice, customization, and cost savings. OCP aims to reimagine hardware, making it more efficient, flexible, and scalable.