How Backblaze’s Fireball B2 Moves Big Datasets to the Cloud Easily, Safely and Swiftly

The increasing volume of data generated by today’s information-driven organizations includes the billions of frames of video shot by video production companies, billions of financial transactions each day, scans of important historical documents, medical data for billions of humans, decades of concert footage, the data flowing in from billions of Internet of Things (IoT) connected devices, and much more.

Making sense of all that data is the central focus of IT 4.0 — the next stage in the data storage evolution, in which more and more data needs real-time processing and action near where it is created — close to the edge of the network, away from the core. And keeping the rising mountain of valuable data manageable, available for analysis and action, and safe from natural disasters, hacking, equipment failures, and all the other things that can and will go wrong represents one of the great challenges of the era.

By 2025, the world writ large — from media and entertainment creators to content distribution networks, from healthcare providers and medical firms to small businesses, banks, and law firms — will generate 175 zettabytes of data, according to IDC in the recent Data Age study. That’s more than five times the amount of data created up to 2016. With each zettabyte equal to a trillion gigabytes, that’s enough data to fill a row of DVDs encircling the globe 222 times.

Of course, no one would ever attempt to download that much data or archive it on optical media. But businesses, governments, and nonprofit organizations around the world are all dealing with their own explosions of data, and are increasingly challenged to keep it available, manageable, and secure.

That’s why cloud storage company Backblaze has created a new device called the B2 Fireball. The Fireball lets customers ingest large datasets on site and transfer them into Backblaze B2 Cloud Storage faster than standard business internet connections allow.

Fireball is essentially a massive hard drive array that ingests up to 96 terabytes of encrypted data at 1 or 10 gigabits per second per device. It can be used to back up data from local drives and servers, transfer datasets from on-prem storage to the cloud, or take data generated “in the field” and rapidly get it to cloud storage for immediate availability to distributed teams.

To bypass the challenge of limited office bandwidth, the Fireball approach combines some of the industry’s most advanced technology with an old-school data-transfer approach. When a business customer needs to transfer a massive data load, Backblaze physically sends them a Fireball. The customer fills up the Fireball and returns it to Backblaze for secure upload inside a Backblaze data center.

“It’s like the old trope that you can’t beat the bandwidth of a FedEx truck,” said Ahin Thomas, Backblaze’s VP of Marketing. “When you need to move enormous quantities of data to the cloud, given today’s bandwidth limitations it’s faster to put that data on a hard drive array and take it to its destination.”

The challenge of IT 4.0 is data everywhere, and more of it

The IT 1.0 and 2.0 eras were marked by the rise of first mainframes and then personal computers, and IT 3.0 by the rise of mobile and cloud computing. IT 4.0 gives us all of that, with the addition of the IoT, even more data (at the edge), and artificial intelligence to make sense of it all.

While IoT is poised to grow with the emergence of 5G bandwidth, there’s a problem. Today’s internet connections — although gaining the bandwidth and speed needed to make IT 4.0 possible — are still limited in capacity. As Thomas points out, standard office broadband connections can handle day-to-day upload needs but don’t have the bandwidth to solve the problem of moving massive archives in bulk or getting the historical data off of failing hardware and into the cloud — and woe to anyone who tries to sneak mass data transfers through in the background. “If you soak the pipe, you’re going to hose everyone in the office,” he says.

That’s where a new breed of rapid ingest services and devices becomes necessary. Enter Backblaze’s Fireball.

How data ingest services solve what other networked cloud strategies don’t

Data ingest devices and services are gaining popularity as part of the new IT 4.0 paradigm, in which data gets continuously created everywhere, changes constantly, and must be delivered wherever users need to access it.

It’s a challenge that Backblaze embraces. The company has been a leader in cloud storage for over a decade. As more businesses and individuals realize that their data storage needs may exceed the capabilities of cloud-based synchronization and file sharing from companies like Dropbox and Google, Backblaze’s position as a leader in easy-to-use, affordable cloud storage has earned it a rabid following.

Synchronizing data across devices and sharing files are important functions for daily operations, to be sure. But the complete backup of entire systems and large datasets — especially when individual files still need to be accessed at a moment’s notice — requires a dedicated backup solution.

Backblaze B2 Cloud Storage is similar to Amazon S3, but at about a quarter the price, Thomas says. Customers can scale up or down as needed in real time, so they pay only for the storage space they actually use. By contrast, sync-and-transfer services typically charge a flat monthly fee for fixed storage that users may or may not fill.

Still, it’s a challenge to transfer the terabytes of data that today’s organizations generate into secure cloud storage for backup without overwhelming office broadband connections. The B2 Fireball data ingest device solves this problem.

“The vast majority of offices need networking that supports a bunch of download capability, to support the entire office’s needs,” Thomas noted. “And they usually have a pretty decent upload capability, so their incremental day-to-day data that’s generated can be uploaded. But when you start looking at several years’ worth of data, looking at, you know, hundreds of terabytes and more, you just start doing the math and you realize you will never be able to get that amount of data up to the cloud using your existing networking system.”

“Even if they soaked their upload bandwidth, it would take years,” he emphasized.

“Of course, that’s a problem that’s not unique to Backblaze customers. It’s an issue for every data user who wants to utilize all the advantages of the cloud: save money, save time, make life easier,” Thomas said.

How Backblaze can save you months or even years of waiting

Backblaze launched B2 Fireball in 2017. Any customer can rent the device and have it sent anywhere in the world. The customer then loads the Fireball with data and ships it back to one of Backblaze’s secure datacenters. The encrypted data is then loaded securely into the customer’s B2 account.

The B2 Fireball encloses 96 terabytes of storage in the form of an eight-hard-drive array. It can go anywhere data gets created and stored.

Backblaze published a chart showing just how long it would take to transfer 100 terabytes of data at various network speeds. At 10 megabits per second (Mbps), the average speed of mobile connections in the United States, 100 terabytes would take a whopping three years to transfer. A typical office internet connection, at 100 Mbps, would require about four months, still far too long to suit most Big Data users, especially on shared connections.

The B2 Fireball, with up to 10 gigabit-per-second wired transfer speed, needs only a few days to transfer 100 terabytes. And therein lies its advantage for backing up large amounts of data to the cloud.
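The arithmetic behind these comparisons can be sketched in a few lines of Python. This is a minimal illustration, not Backblaze's own calculation: the function name `transfer_days` is hypothetical, and the sketch assumes decimal units (1 terabyte = 10^12 bytes) and a link that stays fully saturated with no protocol overhead. Real transfers, especially on shared office connections, run slower than this ideal, which may be why the published chart's figures are somewhat larger than the values computed here.

```python
# Idealized transfer-time arithmetic (an illustrative sketch).
# Assumptions: decimal units, a fully saturated link, no protocol overhead.

SECONDS_PER_DAY = 86_400
BITS_PER_TERABYTE = 8 * 10**12  # 1 TB = 10^12 bytes = 8 * 10^12 bits


def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Return the days needed to move `terabytes` at a sustained link rate."""
    total_bits = terabytes * BITS_PER_TERABYTE
    seconds = total_bits / (megabits_per_second * 10**6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Link rates mentioned in the article, expressed in Mbps.
    rates = [("10 Mbps", 10), ("100 Mbps", 100), ("1 Gbps", 1_000), ("10 Gbps", 10_000)]
    for label, mbps in rates:
        print(f"{label:>8}: {transfer_days(100, mbps):8.1f} days")
```

Under these assumptions, 100 terabytes needs roughly 926 days at 10 Mbps, about 93 days at 100 Mbps, and a little under one day at 10 Gbps. These are lower bounds: they exclude network overhead and, for a Fireball, shipping time.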

Designing and building the Fireball — a data center in a box

Thomas explained the inspiration for the B2 Fireball.

“We knew the fastest, simplest way to move huge data sets was on a portable storage system,” Thomas recounted. “But obviously our customers don’t want to take their storage offline, drive it to our data center, and then let us upload it. That isn’t going to work. So then what?”

“What if we basically sent a data center in a box to the customer’s facility?” he asked. “And now we don’t have to use their bandwidth. They can directly connect through a super-fast network card on the Fireball.”

From that simple idea came a host of design choices; Thomas recalled the team’s main goals. “The customer needs this to be very easy — it needs to be effectively plug-and-play for them,” he said. “At the same time, in life there must be a balance between ease of use and security — and our customer needs this to be completely secure. Right?”

Because customers would have to ship the device back to a Backblaze data center, it would have to incorporate powerful encryption features.

“Yes, the major delivery services and their trucks are incredibly reliable,” said Thomas. “But it is not unheard of for a package to get misplaced while in transit. So everything’s got to be encrypted.”

The device also had to be physically strong. “Our Fireball devices have these top-shelf cases; our VP of design personally stepped in to make sure that the specifications were to our standard. Still, at the end of the day, these are hard drives going in boxes,” said Thomas. “It has to be robust in transit.”

“Backblaze does great software and great storage — what else do we need?” the team asked. “Well, we’re going to need a partner on the fundamental box, right? We are not in the business of making hardware. And inside that box, great hard drives have to go in — great, very reliable, hard drives that are affordable, so we don’t have to charge very much for this program.”

Key to the device’s durability, said Thomas, are the hard drives inside. Besides being tough, they also had to be affordable to keep costs down for customers. Backblaze engineers knew that drives from Seagate could fit the bill, thanks to Seagate’s long-standing partnership supplying drives to Backblaze’s data centers. About 75 percent of the hard drives in Backblaze data centers come from Seagate.

“Reliable drives that do their job should be a simple thing,” said Thomas. “But anybody that looks at the hard drive market knows — there’s a reason Seagate is who it is: it’s not that easy. And when the drives do their job, then the boss can do his or her job, and then our partners and customers can make more and better use of our tools.”

“We’re grateful for the partnerships we’ve built up over the years,” said Thomas. “When we make phone calls to a company like Seagate saying, ‘this is what we’re thinking, and here’s the customer use case,’ we’re all very focused on the customer, and the program can come together easily.”

Customers want a scalable and predictable solution

So it’s no surprise that Backblaze is earning accolades for helping a slew of customers solve their data challenges, from Austin City Limits, which is preserving its years of concert performance video, to WunderVu, a production studio delivering engaging virtual reality stories and experiences. From the customer’s point of view, the process is scalable and predictable, and it works smoothly and efficiently, said Thomas. After ingesting the data to be backed up, the customer simply ships the fully encrypted Fireball back to Backblaze in a FedEx box.

Once the Fireball is in the data center, technicians upload its contents into Backblaze’s servers at high speed. “And now in a matter of days,” said Thomas, “they’ve migrated their entire archive. In the typical world of IT professionals, those words very rarely go together.”

While Backblaze stores over 1 exabyte (1,000 petabytes) of customer data, there’s still a lot more to go. After all, there will be 175 zettabytes of global data to back up by 2025. Time to get busy.


About the Author:

John Paulsen
John Paulsen is a "Data for Good" advocate, with nearly 20 years in the data storage industry. He's helped launch many industry firsts, including HAMR technology, 10K-rpm and 15K-rpm hard drives, drives designed specifically for video and for gaming, Serial ATA drives, fluid dynamic HDD motors, 60TB SSDs, and MACH.2 multi-actuator technology.