The Surprising Things You Don’t Know About Big Data


You’re probably familiar with the terms byte, megabyte, and gigabyte — but do you know what a terabyte is? How about a petabyte, or an exabyte?

These lesser-used words describe units of Big Data: data sets too large for commonly used software tools to capture, curate, manage, and process within a reasonable amount of time. Think of it this way: one byte is roughly one letter, while one megabyte (1,024 kilobytes) is roughly one book. A gigabyte (1,024 megabytes) is then around 1,600 books, a terabyte is 1,024 gigabytes, and a petabyte is 1,024 terabytes. An exabyte, finally, is 1,024 petabytes, or about 1,600,000,000,000 books, equivalent to about 3,000 times the entire content of the Library of Congress.
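The unit ladder above can be sketched in a few lines of Python. This is a minimal illustration that assumes the 1,024-based multiples used in this post; the names `UNITS`, `bytes_in`, and `convert` are made up for the example.

```python
# Storage units in ascending order; each step is 1,024 times the previous one
# (the convention used in this post).
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte", "exabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit`, using 1,024-based multiples."""
    return 1024 ** UNITS.index(unit)


def convert(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert `amount` of `from_unit` into `to_unit`."""
    return amount * bytes_in(from_unit) / bytes_in(to_unit)


if __name__ == "__main__":
    # Print the whole ladder, then one sample conversion.
    for unit in UNITS:
        print(f"1 {unit} = {bytes_in(unit):,} bytes")
    print(f"1 petabyte = {convert(1, 'petabyte', 'terabyte'):,.0f} terabytes")
```

Storage vendors often use 1,000-based multiples instead, so figures quoted elsewhere may differ slightly from what this sketch prints.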

Between 1986 and 2007, digital storage grew by 23% annually. In the pre-digital world of the 1980s, most data was stored on videotapes, vinyl LP records, audio cassette tapes, and photographs. Paper-based storage alone represented one-third of all data storage in 1986. By the year 2000, however, 25% of all data stored in the world was stored digitally. By 2002, digital storage capacity had overtaken analog storage capacity, and by 2007, 94% of all data was stored digitally.
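To get a feel for what a 23% annual growth rate means over two decades, here is a rough sketch. It assumes the rate is constant and compounds once per year, which is a simplification of the figure quoted above; `growth_factor` is a hypothetical helper name.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Total multiplier after `years` of growth compounded once per year."""
    return (1 + annual_rate) ** years


if __name__ == "__main__":
    years = 2007 - 1986  # the 21-year span mentioned above
    factor = growth_factor(0.23, years)
    print(f"After {years} years at 23% per year: about {factor:.0f}x the starting amount")
```

The point of the sketch is that a growth rate that sounds modest on a yearly basis compounds into a very large multiple over 21 years.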

Today, more than 2.5 exabytes (or 2.5 billion gigabytes) of data are generated every day, an already high number that is expected to keep growing at a significant rate, with mobile devices responsible for much of this data. Some experts have estimated that 90% of all the world’s data today was produced within the last two years.

Of course, big companies play a big part in generating and storing data at this scale. For example, it’s currently estimated that Google stores over 10 exabytes of data. Facebook, meanwhile, collects 500 terabytes of data every day and reported having stored 100 petabytes of photos and videos as of 2012. Companies such as Amazon (including AWS), Microsoft, Target, VMware, and UPS are all major players in Big Data as well.


We also encounter Big Data daily in a variety of ways. For example, Big Data can help predict the outcomes of sports events or political elections. We also engage with Big Data any time we use our smartphones to get directions or answer a question. If you’ve noticed a Facebook ad in your newsfeed that seems particularly relevant to your life, you can thank Big Data for personalized advertising and purchasing recommendations. And the next time you hit all green lights on your way to work, you’ll know that Big Data might have had a hand in streamlining your city’s traffic.

In the future, cloud-based technologies will continue to see increased usage. A 2014 study found that 94% of organizations either use cloud computing or want to make it a part of their operations, and it’s estimated that 40 zettabytes of data will be created by the year 2020. However, with this increase in cloud technology will come an increased focus on security, since security and theft of intellectual property are cloud users’ primary concerns. We can also expect to see increased usage of private cloud computing and an increase in related education and employment.