Monthly Archives: April 2016


The DataBoost Nexus #8

Big Data Implementation

Simply having access to big data repositories is meaningless in the absence of a worthwhile data management strategy. Your enterprise can import huge volumes of information at will, but without an achievable goal, and without the tools to analyze and organize that information, no benefit is gained.

Selecting a Data Strategy

The first step is laying out a tenable data strategy, and this generally revolves around shoring up a weakness or deficit in your organization. Has your enterprise historically had trouble making accurate sales forecasts? Does your customer support staff require more comprehensive information on client order history? Do you regularly encounter shipping or distribution problems that lead to lost sales and unhappy customers?

Whatever the weakness, a properly implemented data strategy can be extremely helpful in ironing the kinks out of your business process.

Big Data: Best Practices

This IBM-sponsored article offers a wealth of information on implementing big data solutions, and provides an excellent starting point for enterprises embarking on a big data strategy:

http://www.ibmbigdatahub.com/blog/10-big-data-implementation-best-practices

Perhaps the most important thing to remember from this list is the second point – implementing your big data solution should always be seen as a series of business decisions, and should not be hamstrung by your IT department.

IT departments can always be expanded and improved, and should never drive your core business decisions.

Big Data: Management

To drive the previous point home, the following article from TechTarget.com outlines a host of real-world big data projects that led to revolutionary improvements for several notable enterprises:

http://searchdatamanagement.techtarget.com/essentialguide/Big-data-applications-Real-world-strategies-for-managing-big-data

Had the businesses in this article let their IT departments determine which, if any, big data strategies were tenable, it is unlikely that the projects would have been so successful.

Big Data: Consulting Firms

One of the easiest and most cost-effective ways to implement your big data strategy is to enlist an outside agency that has already successfully managed a project similar to yours.

Next week, we will discuss how to go about selecting an agency that can help you determine and implement a big data strategy.


The DataBoost Nexus #7

Big Data Resources

Previously, we discussed a proper definition for big data, and we considered how data sets can be used in myriad ways to accomplish and complement a wide variety of goals.

The next question to be answered is where to find large volumes of data that are applicable to particular enterprises or industries.

Internal Data

The first sources to consider, and some of the most applicable and readily available, are sources inside your enterprise. A complete listing of client data, including addresses, email information, and any available demographic metrics, could be considered a form of big data – especially for larger enterprises with client lists that number in the thousands.

Purchase and transaction histories can also be considered big data, especially order histories that go back years or decades.

External Data

External sources of big data can be broken down into public and private repositories. While private repositories are often confidential or require significant expenditures to acquire, there are a number of publicly available data sets that are both massive and highly useful to a broad range of industries.

Public Data Sets

This article from LinkedIn provides an excellent starting point for finding public data repositories, including data.gov – perhaps the largest public source of data on the planet:

https://www.linkedin.com/pulse/20141210080103-64875646-the-free-big-data-sources-everyone-should-know

A more recent article from BigData-MadeSimple.com provides an expanded list of public sources that includes many of the sets listed above as well as a number of internationally available big data repositories:

http://bigdata-madesimple.com/70-websites-to-get-large-data-repositories-for-free/

Finally, this most recent article from Forbes offers data hunters the top 30+ sources of big data that can be acquired at no cost:

http://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/#6a7da4167961

Big Data Strategies

Of course, acquiring a data repository is only the first step. To utilize the information contained in the data set, you’ll need an enterprise goal that can benefit from the use of large data volumes, and a technique for extracting, analyzing, and outputting the information contained in one or more of these repositories to fulfill that goal.

Next week, we discuss strategies for implementing big data.


The DataBoost Nexus #6

Harnessing Big Data

Big data:

“…A collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.”

http://www.forbes.com/sites/lisaarthur/2013/08/15/what-is-big-data/#273362003487

Properly defined, the term “big data” isn’t quite as imposing as it initially seems. Now that we know what big data is and the potential it provides, the most important question becomes: how do we put it to use?

Different firms are going to use big data in different ways, and even a single data repository can be put to use in completely different ways. For example, suppose I have access to a catalog of every gasoline transaction made throughout the country over the last year. What could I do with this information?

Data Has Multiple Uses

First, I could determine the average price of gasoline in America, and the average amount purchased per transaction. Easy enough. Or I could analyze repeat transactions to reveal how many Americans spend more than $100 on gasoline per month. I could even compare all these purchases against known state populations to determine the average amount spent on gasoline per person per state.
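The analyses above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Transaction record, its field names, the sample rows, and the population figures are all hypothetical, and a real transaction catalog would be far larger and stored differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical record layout for one gasoline purchase."""
    customer_id: str
    state: str
    month: str        # e.g. "2016-03"
    gallons: float
    total_usd: float


def average_price_per_gallon(txns):
    # Total dollars spent divided by total gallons purchased.
    return sum(t.total_usd for t in txns) / sum(t.gallons for t in txns)


def average_gallons_per_transaction(txns):
    return sum(t.gallons for t in txns) / len(txns)


def customers_over_monthly_spend(txns, threshold_usd=100.0):
    # Sum each customer's spending per month, then keep the customers
    # who exceeded the threshold in at least one month.
    monthly = defaultdict(float)
    for t in txns:
        monthly[(t.customer_id, t.month)] += t.total_usd
    return {cust for (cust, _), spend in monthly.items() if spend > threshold_usd}


def spend_per_person_by_state(txns, populations):
    # Total spending in each state divided by that state's population.
    totals = defaultdict(float)
    for t in txns:
        totals[t.state] += t.total_usd
    return {state: totals[state] / populations[state] for state in totals}


# Made-up sample data, for illustration only.
SAMPLE = [
    Transaction("c1", "OH", "2016-03", 10.0, 20.0),
    Transaction("c1", "OH", "2016-03", 45.0, 90.0),
    Transaction("c2", "OH", "2016-03", 5.0, 10.0),
    Transaction("c3", "TX", "2016-03", 20.0, 40.0),
]
POPULATIONS = {"OH": 4, "TX": 2}  # made-up population counts

avg_price = average_price_per_gallon(SAMPLE)
avg_gallons = average_gallons_per_transaction(SAMPLE)
heavy_spenders = customers_over_monthly_spend(SAMPLE)
per_person = spend_per_person_by_state(SAMPLE, POPULATIONS)
```

All four results come from the same list of transactions, which is the point: one data source, several quite different questions.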

The point is that too often, when considering functions for big data, project managers tend to think that the type of information a company has access to determines what kind of data output is possible. This is not always true. A single big data source can provide a long list of possible areas of study which, while related, can be quite different.

This is why choosing the appropriate big data source for your project is crucial – not because the project is defined by the data, but because the data can be implemented and analyzed in so many different ways.

Where Can I Find Big Data?

Naturally, this raises the question – what are the best sources for big data? Next week, we work on answering that question.


The DataBoost Nexus #5

What is Big Data?

Now that we’ve covered the basics of business intelligence and data visualization, there’s another component that needs to be understood. You won’t get far in any conversation about business intelligence without running across the term Big Data.

Defining Big Data isn’t as easy as defining some of the other terms we’ve already discussed. The term is used differently depending on who you’re talking to, which means managers understand it in different ways.

But before we get ahead of ourselves, let’s nail down a working definition.

Wikipedia defines Big Data as:

“…A term for data sets that are so large or complex that traditional data processing applications are inadequate.”

https://en.wikipedia.org/wiki/Big_data

This is an accurate definition, but it’s a bit general for our purposes. Large data sets could include star charts, or global climate tracking, or a list of everyone who’s ever subscribed to the New York Times. For our purposes, discussions of Big Data should be constrained to information that affects, comes from, or relates to your company or industry.

Techtarget.com refers to Big Data as:

“…An evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.”

http://searchcloudcomputing.techtarget.com/definition/big-data-Big-Data

This is much better, but still leaves a few questions open. First of all, what’s the difference between structured and semi-structured data? Also, what kind of information are we hoping to mine?

Forbes has defined Big Data as:

“…A collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.”

http://www.forbes.com/sites/lisaarthur/2013/08/15/what-is-big-data/#273362003487

Perfect! Here we have an ideal working definition for Big Data. Big Data is information gathered by your company or by sources related to your company that could provide new innovations or discoveries if it is properly analyzed.

Examples of Big Data might include a data dump of all the orders ever made by every customer who has patronized your business, a spreadsheet of every credit card transaction your company has ever made, or national-level metrics on the purchasing habits of customers who use your product or service.

Each of these huge volumes of data would require more than Microsoft Excel to properly analyze. To extract meaningful information, a specific type of software must be found, modified, or produced to properly catalog all this information.
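As a rough sketch of what such software does at its simplest, the Python below reads an order export one row at a time and keeps only a running total per customer, so the full file never has to fit in memory or in a spreadsheet. The column names (customer_id, amount) and the sample rows are assumptions made for illustration, not a real export format.

```python
import csv
import io
from collections import defaultdict


def total_spend_per_customer(rows):
    """Accumulate a running total per customer from an iterable of row dicts.

    Only the totals are held in memory, not the rows themselves.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_id"]] += float(row["amount"])
    return dict(totals)


# A tiny in-memory stand-in for a very large order file (hypothetical columns).
SAMPLE_CSV = "customer_id,amount\na,10.50\nb,4.00\na,2.25\n"

# csv.DictReader yields one row at a time, so the same function would work
# on a file handle opened over a much larger export.
reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
totals = total_spend_per_customer(reader)
```

In practice a dedicated database or analytics platform would take over this job, but the principle of processing records as a stream rather than loading them all at once stays the same.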

Now that we understand what Big Data is, the next step is figuring out what to do with it.