We are seeing a dramatic increase in worldwide data volumes. Seagate has been discussing IDC’s new Data Age 2025 report, which finds that annual data creation will swell to 163 zettabytes (ZB) by 2025, roughly ten times today’s volume. We are entering what is called the Exascale era, a point at which computing is expected to reach the same order of processing power as the human brain.
This shift is driven in considerable part by the advance of high-performance computing (HPC) and by the big data generated in computation-intensive fields such as engineering, science and healthcare. With technologies such as AI now able to derive valuable insights from this data, it has become essential to have a system in place that can handle the combined requirements of HPC and big data.
Today’s leading HPC systems deliver on the order of a million billion (10^15) calculations per second. While machines of this capability have opened up a wealth of opportunities in science and technology, the speed at which innovation is occurring means they will be insufficient for these industries over the next few years as we enter the Exascale era.
Around 18 months ago here at Seagate, we launched the SAGE (Percipient Storage for Exascale Data Centric Computing) project. SAGE, one of fifteen projects recently funded under Horizon 2020, aims to build an Exascale system prototype that can deal with the vast amounts of data produced by simulations, scientific instruments and big data applications.
Now halfway through the 36-month project, we have made considerable progress. The SAGE platform is built on Seagate’s Mero storage architecture, which was designed from the ground up to handle massive volumes of data; on top of it we have developed data-management tools, programming models and data-analytics methods.
SAGE demonstrates the first instance of intelligent data storage, uniting data processing and storage as two sides of the same rich computational model. This enables sophisticated, intention-aware data processing to be integrated within the storage infrastructure itself, with the potential for Exabyte-scale deployment in future generations of HPC systems.
The industry-first system consists of multiple storage tiers (non-volatile flash, high-performance disk drives and SATA disk drives) and can run computations close to the data within each of these tiers, inside the storage environment itself. What further distinguishes the system is its API, named Clovis, which allows third parties to add their own software and build features into the system, making it customizable to their data needs.
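To make the tiering idea concrete, here is a toy placement policy that routes objects to a tier based on how frequently they are read. This is a hedged illustration only: the tier names, the "reads per day" heat metric and the thresholds are all assumptions invented for this sketch, not the actual Mero or Clovis interfaces.

```python
from dataclasses import dataclass

# Illustrative tier names, fastest to slowest, mirroring the SAGE
# prototype's hierarchy of non-volatile flash, high-performance disk
# and SATA disk. These identifiers are hypothetical, not real API names.
TIERS = ["flash", "fast_disk", "sata_disk"]


@dataclass
class StoredObject:
    name: str
    size_mb: int
    reads_per_day: int  # a simple, made-up "heat" metric


def place(obj: StoredObject) -> str:
    """Pick a tier from access frequency: hot data lands on flash,
    warm data on high-performance disk, cold data on SATA."""
    if obj.reads_per_day >= 100:
        return "flash"
    if obj.reads_per_day >= 10:
        return "fast_disk"
    return "sata_disk"


if __name__ == "__main__":
    objects = [
        StoredObject("checkpoint", 4096, 2),   # cold simulation output
        StoredObject("index", 64, 500),        # hot metadata
        StoredObject("timeseries", 1024, 25),  # warm analytics data
    ]
    for obj in objects:
        print(f"{obj.name} -> {place(obj)}")
```

A real system would of course base placement on far richer signals and migrate data between tiers over time; the point here is only the shape of the decision a multi-tier store has to make.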
The SAGE project brings together ten partners who provide the different pieces of technology and define the data-intensive use cases used to co-design the system. The SAGE hardware prototype was built at Seagate HQ and has now been moved to the Juelich Supercomputing Centre, where it will be evaluated in a real-world HPC environment across various use cases.
These use cases span eight areas: extreme data analysis, application needs, programming techniques, optimization tools, next-generation storage media, extreme data management, advanced object storage and percipient storage methods.
The aim of this evaluation is to validate the necessity of such a storage system for the Exascale era and to prove its speed and efficiency benefits over incumbent systems. Testing begins imminently and will run over the next 18 months. Seagate and the SAGE partners will publish reports in September 2018 to give both the HPC and big data communities an insight into the potential impact of SAGE and the technology at the center of the project.
We’re proud at Seagate to be leading the SAGE project and to be part of the wider Horizon 2020 initiative. As a leader in developing and deploying HPC storage, we look forward to continuing our work on the technology necessary for Exascale computing. This project will redefine data storage for years to come and unlock the capability of big data applications across a number of industries.