Like a hungry lion coming over the hill, massive amounts of data are hitting businesses and government agencies at an increasing pace. Solutions to handle such data, while still pricey, are becoming more common and thus more affordable. What can organizations do to handle the onslaught of so-called Big Data?
Before we tackle that question, let’s define what we mean by "massive." Nowadays it is not uncommon for organizations to be dealing with multiple terabytes of data, but we see only a select few folks dealing with petabytes. A petabyte is a thousand terabytes (or 1,024, if you count in powers of two). A few years ago, only the geekiest of us even knew what a terabyte was (1,000 gigabytes), much less a petabyte! By the way, the next unit of measure is the exabyte, which is a thousand petabytes. When we start defining solutions for exabytes, that will mean that petabyte-capable solutions are commonplace.
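For readers who like to check the arithmetic, here is a minimal sketch of those unit conversions in Python. The `convert` function and the `UNITS` list are illustrative names made up for this example, not part of any product mentioned here, and the optional `base` argument is an assumption that covers the "give or take" case of counting in 1,024s instead of 1,000s.

```python
# Storage units in ascending order; each step up is a factor of `base`.
UNITS = ["GB", "TB", "PB", "EB"]

def convert(value, from_unit, to_unit, base=1000):
    """Convert a quantity between storage units.

    base=1000 is the decimal convention; base=1024 is the binary one.
    """
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * base ** steps

print(convert(1, "PB", "TB"))        # one petabyte in terabytes
print(convert(1, "EB", "GB"))        # one exabyte in gigabytes
print(convert(1, "PB", "TB", 1024))  # binary convention
```

Put another way, an exabyte works out to a billion gigabytes, which is why petabyte-scale storage has to become routine before anyone shops seriously for exabyte solutions.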
Big Data is the buzzword used to define this massive amount of information. Generally speaking, there are two types of data that fall into this category. First, there is unstructured data, with which most folks are familiar. All your word processing docs, spreadsheets, presentations and even pictures, music and video fall into this category. As even the smallest business knows, these files can add up fast and can be difficult to administer and control.
Unfortunately, there are few vendor-independent tools to facilitate the management of unstructured data. All of the major storage vendors have built management features into their platforms, but of course these are proprietary.
The second category of Big Data deals with business intelligence, or BI for short. Also called business analytics, BI is an amalgamation of all of an organization’s data, from which information is gleaned to help with business decision-making.
The data structure upon which BI is built is far different from the data structure that the business uses to run its normal processes. While BI is not a new concept, technical advances in storage speed and capacity have led to a marked increase in the number of organizations using it.
Back in the day, it was unheard of to have a completely separate, redundant database to support BI-type needs, but this is a much more affordable solution today. As with all technologies, we expect the trend of affordability to only improve.
A couple of relatively new products have demonstrated the commitment to this market. IBM finalized its acquisition of Netezza a few months ago and is actively marketing a BI "appliance." Basically, the Netezza appliance is a prepackaged solution with the necessary hardware and software for organizations to build their data warehouse.
Database veteran Oracle announced its "Big Data Appliance" in January. This product is clearly designed to capitalize on the Big Data craze.
Most Big Data solutions currently run into the hundreds of thousands of dollars, but as with any technology, we expect to see more affordable models in the near future.
———
John Agsalud is an IT expert with more than 20 years of information technology experience in Hawaii and around the world. Reach him at johnagsalud@yahoo.com.