Facebook, Open Compute Project rethink cloud storage systems for better scalability

Fast, scalable and high-performing cloud storage is becoming more essential to enterprises each day. As a hyperscale operator with exceptional needs, Facebook illustrates how many companies are overhauling their storage systems to keep pace with rapidly rising data volumes and evolving business requirements. The company, a key player in cloud infrastructure initiatives such as the Open Compute Project, now ingests 600 TB of new data each day and continually explores new technologies in search of ways to boost the efficiency of its data centers.

"At Facebook, we have unique storage scalability challenges when it comes to our data warehouse," wrote Facebook's Pamela Vagata and Kevin Wilfong on the company's Engineering Blog. "Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the amount of data stored. Given this growth trajectory, storage efficiency is and will continue to be a focus for our warehouse infrastructure."

Most enterprises will never have to deal with such enormous data quantities. However, Facebook's approach to storage, particularly how it has homed in on improving read and write performance to disk, may hold some important lessons.

Organizations of all sizes are being reshaped by cloud computing, social platforms and mobile devices, a combination Gartner has termed a "Nexus of Forces" that could accelerate growth in the amount of stored digital information; IDC estimates that the worldwide total could top 40 zettabytes by 2020. Accordingly, companies will need cloud storage solutions that enable economical expansion of infrastructure and big data projects, while maintaining strong performance.

Rethinking file systems and compression efficiency to better use storage capacity
Facebook engineers have rethought file systems and storage formats to move raw data onto disk as efficiently as possible. More specifically, Vagata and Wilfong explained that Facebook developed the Record-Columnar File format (RCFile) to combine compression efficiency with row-based query processing. RCFile groups rows together and, within each group, writes each column's values as one contiguous chunk; because similar values sit next to each other, it achieves roughly 5x compression of the source data on disk.
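To make that intuition concrete, the toy Java program below (not Facebook's code, and not the actual RCFile implementation; the class and its data are invented for illustration) compresses the same two-column dataset twice with DEFLATE: once interleaved row by row, and once with each column laid out as a contiguous chunk, the way RCFile arranges values within a row group. The columnar layout should compress noticeably better because repetitive values end up adjacent.

```java
import java.util.zip.Deflater;

// Illustrative only: shows why storing a column's values contiguously
// (as RCFile does within each row group) tends to compress better than
// an interleaved, row-oriented layout. Not the Hive RCFile implementation.
public class ColumnarCompressionDemo {

    // Compress a byte array with DEFLATE and return the compressed size.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length * 2 + 64];
        int size = 0;
        while (!deflater.finished()) {
            size += deflater.deflate(buffer);
        }
        deflater.end();
        return size;
    }

    public static void main(String[] args) {
        // A toy "table": two columns with very different value distributions.
        String[] countries = new String[1000];
        String[] userIds = new String[1000];
        for (int i = 0; i < 1000; i++) {
            countries[i] = (i % 2 == 0) ? "US" : "SE"; // low cardinality
            userIds[i] = "user-" + i;                  // high cardinality
        }

        // Row-oriented layout: values of different columns interleaved.
        StringBuilder rowWise = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            rowWise.append(countries[i]).append(',').append(userIds[i]).append('\n');
        }

        // Column-oriented layout: each column stored as one contiguous chunk.
        StringBuilder columnWise = new StringBuilder();
        columnWise.append(String.join(",", countries)).append('\n');
        columnWise.append(String.join(",", userIds)).append('\n');

        System.out.println("row-wise compressed:    "
                + compressedSize(rowWise.toString().getBytes()));
        System.out.println("column-wise compressed: "
                + compressedSize(columnWise.toString().getBytes()));
    }
}
```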

But adopting RCFile alone wasn't enough to get a handle on the ever-growing volume of Facebook's data. Eventually, Facebook's teams modified components of the open source ORCFile format to improve read and write performance and to save memory and storage space. For example, they decreased the memory footprint of the dictionaries ORCFile builds for string columns and changed how the writer decides which column encoding to use.
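As a rough illustration of that writer-side decision, here is a minimal Java sketch of dictionary encoding with a fallback: it dictionary-encodes a string column only when the ratio of distinct values to total values stays under a threshold, and otherwise writes the values directly. The 0.8 cutoff and the method names are assumptions for illustration, not details of Facebook's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A minimal sketch of an encoding decision a columnar writer can make:
// build a dictionary for a string column, but fall back to direct encoding
// when the column has too many distinct values for a dictionary to pay off.
public class DictionaryEncodingSketch {

    static final double DICTIONARY_THRESHOLD = 0.8; // assumed cutoff

    static void encodeColumn(List<String> values) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        List<Integer> codes = new ArrayList<>();
        for (String v : values) {
            // Assign the next integer code to each previously unseen value.
            Integer code = dictionary.computeIfAbsent(v, k -> dictionary.size());
            codes.add(code);
        }
        double distinctRatio = (double) dictionary.size() / values.size();
        if (distinctRatio <= DICTIONARY_THRESHOLD) {
            System.out.printf("dictionary encoding: %d entries, %d codes%n",
                    dictionary.size(), codes.size());
        } else {
            // Nearly every value is unique; a dictionary would only add
            // memory overhead, so write the raw values instead.
            System.out.println("direct encoding: " + values.size() + " raw values");
        }
    }

    public static void main(String[] args) {
        encodeColumn(List.of("US", "SE", "US", "US", "SE")); // repetitive
        encodeColumn(List.of("a1", "b2", "c3", "d4", "e5")); // all unique
    }
}
```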

On the reader side, they added lazy column decompression and decoding, making Facebook's implementation of ORCFile three times as efficient as the original open source version. Overall, these tweaks allowed Facebook to reclaim almost 10 PB of warehouse capacity. The company also continues to explore areas such as cold storage data centers and the use of RAID in the Hadoop Distributed File System.
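The reader-side idea can be sketched the same way. The class below is a hypothetical stand-in rather than ORCFile's actual reader: it keeps a column chunk in its DEFLATE-compressed form and inflates it only the first time the column is read, so queries that filter rows on other columns never pay its decompression cost.

```java
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// A minimal sketch of lazy column decompression, assuming DEFLATE-compressed
// column chunks: compressed bytes are held as-is, and inflation happens only
// the first time a reader actually touches the column.
public class LazyColumnChunk {

    private final byte[] compressed;
    private final int originalLength;
    private byte[] decompressed; // populated on first access

    LazyColumnChunk(byte[] raw) {
        this.originalLength = raw.length;
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        byte[] buf = new byte[raw.length * 2 + 64];
        int n = deflater.deflate(buf);
        deflater.end();
        this.compressed = java.util.Arrays.copyOf(buf, n);
    }

    // Decompress on demand; subsequent reads reuse the inflated bytes.
    byte[] get() throws DataFormatException {
        if (decompressed == null) {
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            decompressed = new byte[originalLength];
            inflater.inflate(decompressed);
            inflater.end();
        }
        return decompressed;
    }

    public static void main(String[] args) throws DataFormatException {
        LazyColumnChunk chunk = new LazyColumnChunk("hello column data".getBytes());
        System.out.println(new String(chunk.get())); // inflates here, once
    }
}
```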

On the cold storage front, Facebook demonstrated at the most recent Open Compute Summit a prototype that packs 10,000 Blu-ray discs into a single cabinet for petabyte-scale cold storage. Cabinets of dense discs could hold rarely accessed assets, such as photo backups, with a robotic arm retrieving individual discs. Tepid consumer demand for high-capacity Blu-ray may have contributed to these plans, although Facebook has also discussed using low-grade SSDs ("the worst flash possible") for workloads that would rarely, if ever, rewrite data.

Scaling cloud storage infrastructure while focusing on energy efficiency
The Open Compute Project has cast a wide net in addressing enterprises' infrastructure and cloud computing needs, with the Blu-ray initiative being just one way its community has approached storage scalability. Since its inception, the project has focused on streamlining data centers through vanity-free cloud hardware, alternative cooling methods (open air in cooler locales such as Sweden, rather than air conditioning) and shareable, energy-efficient server designs.

TechTarget's Alex Barrett noted that after Microsoft joined the Open Compute Project, it contributed a server design with a 12U chassis and room for JBOD expansion. Microsoft corporate vice president Bill Laing stated that this type of hardware powers critical cloud computing services such as Office 365 and Xbox Live, and that it has cut Microsoft's costs by 40 percent compared to traditional appliances while trimming power consumption by 15 percent.

Such innovations from the Open Compute Project make efficient infrastructure designs more widely available to enterprises, at a time when large data volumes require fresh approaches to cloud hardware and software. Contributors have even gone so far as to work on open switch designs and to leverage Ethernet for direct connections to drives.

Seagate Kinetic Open Storage and the transition from files to objects
Much as Facebook has continually evolved its file formats to improve its storage practices, many enterprises are adopting object storage systems to deal with seas of data. These systems are flat rather than hierarchical: instead of a file system's directory tree, information sits in a single pool and is retrieved using unique keys that correspond to individual objects.
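A minimal in-memory sketch makes the contrast with a file system concrete. The class below is a toy, not any particular vendor's API: it exposes only put and get against a flat pool keyed by opaque strings, with no directories to traverse and no hierarchy to maintain.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// A toy model of the object-storage idea: a flat pool of objects addressed
// only by unique keys. Real systems add replication, erasure coding, and
// metadata; this sketch shows only the put/get contract.
public class FlatObjectStore {

    private final Map<String, byte[]> pool = new ConcurrentHashMap<>();

    // Store an object under its key, replacing any previous version.
    public void put(String key, byte[] object) {
        pool.put(key, object.clone());
    }

    // Retrieve an object by key; empty if no object has that key.
    public Optional<byte[]> get(String key) {
        return Optional.ofNullable(pool.get(key));
    }

    public static void main(String[] args) {
        FlatObjectStore store = new FlatObjectStore();
        // Keys are opaque identifiers, not paths: "photos/2014/img1" is
        // just a string, not a directory traversal.
        store.put("photos/2014/img1", new byte[]{1, 2, 3});
        System.out.println(store.get("photos/2014/img1").isPresent()); // true
        System.out.println(store.get("missing-key").isPresent());      // false
    }
}
```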

Seagate has been at the forefront of this trend, having unveiled its Kinetic Open Storage platform earlier this year. The platform lets users deploy high-capacity HDDs that carry Gigabit Ethernet ports in place of SAS connectors, along with an API based on OpenStack Object Storage.

"The file system is gone," stated Ali Fenn, senior director of advanced storage at Seagate, according to IT Knowledge Exchange. "The drive does the space management. Applications are dealing with objects, and we should let them deal with objects right down to the drive … If an app is doing object, all it's trying to do is put and get objects – it's really that simple."

Fenn also added that the Kinetic Open Storage platform could reduce the total cost of ownership of storage arrays by 50 percent. Essentially, it strips out the storage server tier and enables applications and drives to communicate directly. Such an arrangement allows increasingly economical, scalable enterprise storage at a time when the cloud is changing the face of business and data center operations.
