The DataBoost Nexus #5
What is Big Data?
Now that we’ve covered the basics of business intelligence and data visualization, there’s another component to understand. You won’t get far in any conversation about business intelligence without running across the term Big Data.
Defining Big Data isn’t as easy as defining some of the other terms we’ve already discussed. Different people use the term in different ways, which means managers often come away with different understandings of it.
But before we get ahead of ourselves, let’s nail down a working definition.
Wikipedia defines Big Data as:
“…A term for data sets that are so large or complex that traditional data processing applications are inadequate.”
https://en.wikipedia.org/wiki/Big_data
This is an accurate definition, but it’s a bit general. Large data sets could include star charts, global climate tracking data, or a list of everyone who’s ever subscribed to the New York Times. For our purposes, discussions of Big Data should be constrained to information that affects, comes from, or relates to your company or industry.
Techtarget.com refers to Big Data as:
“…An evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.”
http://searchcloudcomputing.techtarget.com/definition/big-data-Big-Data
This is much better, but it still leaves a few questions open. First of all, what’s the difference between structured, semi-structured, and unstructured data? Also, what kind of information are we hoping to mine?
Forbes has defined Big Data as:
“…A collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.”
http://www.forbes.com/sites/lisaarthur/2013/08/15/what-is-big-data/#273362003487
Perfect! Here we have an ideal working definition for Big Data. Big Data is information gathered by your company, or by sources related to your company, that could lead to innovations or discoveries if it is properly analyzed.
Examples of Big Data might include a data dump of every order ever placed by every customer who has patronized your business, a spreadsheet of every credit card transaction your company has ever made, or national-level metrics on the purchasing habits of customers who use your product or service.
Each of these huge volumes of data would likely require more than Microsoft Excel to properly analyze; a single Excel worksheet, for instance, is capped at 1,048,576 rows. To extract meaningful information, specialized software must be found, modified, or built to catalog and analyze all this information.
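To make the idea of software "produced" for the job a little more concrete, here is a minimal sketch in Python of one common approach: reading an order history one row at a time and keeping only running totals, so the full data set never has to fit in a spreadsheet or in memory. The column names (`customer_id`, `order_id`, `amount`) and the sample rows are hypothetical, chosen only for illustration; a real export from your own systems would look different.

```python
import csv
import io
from collections import defaultdict


def total_spend_by_customer(rows):
    """Sum order amounts per customer, consuming one row at a time."""
    totals = defaultdict(float)
    for row in rows:
        # Column names are assumptions for this sketch.
        totals[row["customer_id"]] += float(row["amount"])
    return dict(totals)


# Tiny stand-in for an order-history export (hypothetical data).
# A real file would be read with open(path, newline="") instead.
SAMPLE = """customer_id,order_id,amount
C001,1001,25.00
C002,1002,40.50
C001,1003,10.00
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
totals = total_spend_by_customer(reader)
print(totals)  # {'C001': 35.0, 'C002': 40.5}
```

Because the function only looks at one row at a time, the same logic would work on a file with many millions of orders. Dedicated Big Data tools go much further than this, but the underlying idea of processing data in a stream rather than all at once is similar.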
Now that we understand what Big Data is, the next step is figuring out what to do with it.