Big Data Friday Funny



Courtesy of Datafloq (again), here is a great cartoon reflecting the decision-making process post Big Data.

The less funny reality is that decision making before Big Data wasn’t too different from what this cartoon shows, just less informed pain that came in two forms. In the first scenario, data was presented in slide after slide of PowerPoint, with every form of suggestion and mind-numbing comment thrown in. Those meetings or teleconferences took longer and involved less information and more opinion. (I imagine that is the evolution from the plastic overhead projector slides.)

The second scenario involved just making a decision with no information – data, opinion or otherwise.

Either way, at the end of the day, this cartoon is easy to relate to in a data-driven world. Enjoy! More on Datafloq


Source: Datafloq (reprinted)


What Can Big Data Do for YOU?



After several posts on the Big Data implementation process, let’s look at the ways Big Data can be used by your organization.

The following is copied from the Datafloq website, which provides many great articles on Big Data implementation.  I highly recommend subscribing to their newsletter.

9 Generic Big Data Use Cases to Apply in Your Organization

(reposted from DataFloq)

Big Data means something different for every organization and every industry. What Big Data can do for your organization depends on the type of company, the amount of data that you have, the industry that you are in and a whole lot of other variables. Whenever I advise organizations on their Big Data strategy, this is the main problem: there are so many different possibilities, and often it is a struggle to find the right use case to develop into a Proof of Concept. That’s why I have developed the Big Data Use Case framework, to help organizations understand the different possibilities of Big Data and what it can do for their business. The framework divides 9 generic Big Data use cases into three different pillars:

  • Your Customers;
  • Your Product;
  • Your Organization.

For each pillar there are three Big Data use cases that can be defined, which are relevant for all organizations across all industries. The framework looks as follows; let’s discuss the Big Data use cases one by one:


360 Degrees Customer View

Developing a complete view of your customer is important for every organization, as it helps you understand what your customer wants, what their needs and preferences are and how the customer should be approached. When you combine multiple data sources you can get a 360 degrees overview. Your internal sources such as your CRM data, sales data or call centre data can be combined with external sources such as social media data or news data. The retailer Walmart is the best example of creating a 360 degrees customer profile. Thanks to their online marketing platform, they are capable of creating segments of 1.

Understand The Market

When organizations want to gain a better understanding of the market, they traditionally turn to market research organizations. Consumer panels, focus groups and questionnaires provide insights into what the market thinks, but unfortunately they are time-consuming, expensive and always offer insights from the past instead of the future. With Big Data this is not necessary anymore. When you mix different data sets such as sales data, market news data and social media data, you can get real-time insights into what the market thinks of your product, and when you launch, for example, a new commercial, you can see in real time how it is perceived. Big Data brings market research to the next level.

Find New Markets

Analysing various data sources such as web statistics and social media can help you find new markets or customers with latent needs that you were not aware of. Using techniques such as Natural Language Processing or machine learning, you will be able to better anticipate what (potential) customers are looking for, and pattern analytics can result in finding completely new markets.

Personalized Website / Offering

Big Data is all about relevancy and offering the right product/service to the right person for the right price via the right channel at the right moment in time. Google personalizes its search results based on your profile and Amazon offers different homepages, with different products on offer, to almost every visitor. It comes back to completely knowing your customer by combining different data sources to really know what they are looking for. There are ample examples of companies successfully targeting their customers with personalized products, including the InterContinental Hotel Group and Spotify.

Improve Service

Big Data enables you to drastically improve your service. Using deep data analytics, you can optimize your customer service, resulting in happier customers. A great example of this practice is Southwest Airlines; they use speech analytics to extract in real-time deep and meaningful information out of live-recorded interactions between staff and customers. This data, combined with other sources such as customer profiles, flight information and social media data enables their staff to offer consistent high-quality service.

Smart cities can also use Big Data to better organize themselves and improve the service they offer citizens. The smart city of Songdo is a great example, where even garbage is analyzed in order to improve the garbage removal services.

Co-create and Innovate

Big Data not only provides you insights about your customers, but can also give you information regarding the products and how these are being used. When you are capable of monitoring how the product is being used via sensors and telematics, you gain a deep understanding of how you can improve the product. In addition, simulation analysis using massive amounts of data and supercomputers will enable you to drastically speed up the innovation of your products. For example, P&G used simulation analytics to create thousands of iterations in seconds in order to find the best disposable diaper.

Reduce Risk / Fraud

Anomalies and outliers can easily be detected with Big Data and these anomalies and outliers could indicate fraudulent actions. MasterCard uses Big Data to determine during the payment process whether or not a certain payment is legitimate or fraudulent. In addition, Big Data can also reduce the risk you are facing. When you have a better understanding of your customer, you can better determine their risk profile (whether it is a customer or a business looking for a credit, mortgage or insurance). English car insurer Insurethebox is a pioneer in reducing risk by allowing customers voluntarily to have their driving habits monitored. The better they drive, the lower their insurance fee. Of course this also reduces the risk for the company.
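To make the idea of spotting anomalies concrete, here is a minimal sketch in Python. It is only an illustration and not how MasterCard or any other company actually does it: it assumes a small made-up list of payment amounts and flags values using the modified z-score (based on the median absolute deviation). The function name, the sample data and the 3.5 cutoff are all assumptions chosen for the example.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold.

    Illustrative only: real fraud detection uses far richer signals
    than the payment amount alone.
    """
    med = median(amounts)
    # Median absolute deviation: a measure of spread that a single
    # extreme value barely moves (unlike the standard deviation).
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        return []  # at least half the values equal the median; no usable spread
    return [x for x in amounts if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical payment amounts; one is clearly out of line with the rest.
payments = [42.0, 38.5, 51.0, 47.25, 40.0, 44.9, 2500.0, 39.75]
print(flag_outliers(payments))  # [2500.0]
```

A flagged value is only a candidate for review; an unusual payment may well be legitimate.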

Better Organize Your Company

Employees generate massive amounts of data at the office. Sensors installed on office furniture and throughout the office can provide insights into how employees behave at work. These insights can be used to better organize the workplace. Cubist Pharmaceuticals, for example, used data to reveal it had too many coffee machines. By reducing the number of coffee machines and creating centralized coffee spots, they increased serendipitous interactions among employees.

You can also monitor all the unstructured data such as emails, documents and meetings to know which employee is knowledgeable about what topic and which employees interact with each other. This should not be seen as spying on your employees, but will help employees find the information they need faster and more efficiently.

Understand Your Competition

Of course, what you can do for your own organization can also be done, more or less, for your competitors. When you monitor the pricing strategy of your competitor, Big Data can inform you in real time when they adjust pricing, allowing you to respond faster. Of course, this strategy is not completely without risks: when two algorithms start interacting, strange things can happen, such as a book about flies being offered for sale for over $23 million.

These nine Big Data use cases are just the tip of the iceberg of what is possible with Big Data. Of course, specific use cases differ per organization and industry, but hopefully this framework provides some guidance in how you can start with Big Data.

What is Data Munging?



One of the advantages of Big Data is the Data Lake. The Data Lake stores all the legacy databases in addition to new data storage and even streaming data. In order to combine and use this broad variety of source information, some amount of “cleaning” or rearranging of that data has to happen. Think of it like roommates moving in: each person retains his or her personality, but certain house rules will be necessary for everyone to live together.

Your Mother Doesn’t Work Here

In the Big Data world this cleaning is called data munging (or wrangling). Data wrangling does conjure rather accurately the task of roping a large or small but unwieldy live animal and forcing it into restraints. Munging seems to be the more prevalent term, perhaps because it’s a cool made-up word that sounds like the munching only the Cookie Monster could evoke, although one reference suggested it was an acronym.

The world according to Wikipedia:

Data munging or data wrangling is loosely the process of manually converting or mapping data from one “raw” form into another format that allows for more convenient consumption of the data with the help of semi-automated tools. This may include further munging, data visualization, data aggregation, training a statistical model, as well as many other potential uses. Data munging as a process typically follows a set of general steps which begin with extracting the data in a raw form from the data source, “munging” the raw data using algorithms (e.g. sorting) or parsing the data into predefined data structures, and finally depositing the resulting content into a data sink for storage and future use.
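The three steps in that definition (extract the raw data, munge it, deposit it in a sink) can be sketched in a few lines of Python. This is a toy illustration using only the standard library; the sample CSV text, the column names and the function names are all made up for the example, and a real pipeline would read from and write to actual data stores.

```python
import csv
import io

# Hypothetical raw export with stray spaces and inconsistent casing.
RAW = """name,city,orders
 alice ,Denver,3
BOB,denver,12
Carol, Austin ,7
"""


def extract(text):
    """Extract: read the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def munge(rows):
    """Munge: trim whitespace, fix casing, convert types, then sort."""
    cleaned = [
        {
            "name": row["name"].strip().title(),
            "city": row["city"].strip().title(),
            "orders": int(row["orders"]),
        }
        for row in rows
    ]
    # Sort by order count, largest first.
    return sorted(cleaned, key=lambda r: r["orders"], reverse=True)


def load(rows):
    """Sink: write the cleaned rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["name", "city", "orders"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


print(load(munge(extract(RAW))))
```

Here the "sink" is just a string; swapping `load` for a function that writes to a file or database would not change the other two steps.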

Data munging can be a simple(r) task of matching a new dataset to an existing one. It can also be matching two or more data sets together. Or it can be a Herculean effort of conforming multiple sources in multiple formats that were never created to fit together – until the Data Scientist makes it so.

Laundry List

Here’s a quick list of the tasks associated with munging, in simple terms, for those of us who aren’t so cool or data intelligent.

  • Enrich the data
  • Standardize
  • Normalize
  • Apply a Macro
  • Find Pattern (Regular Expression)
  • Sort
  • Filter
  • Merge
  • Transpose
  • Parse
  • Transform Data Types
  • Missing Data Handling
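A few of the tasks in this list (find pattern, standardize, filter, merge) can be shown together in a short Python sketch. Everything here is hypothetical: the two tiny data sets, the field names and the choice of a ten-digit phone format are assumptions made purely for illustration.

```python
import re

# Two hypothetical sources that share a customer ID.
crm = [
    {"customer_id": "C001", "phone": "(303) 555-0142"},
    {"customer_id": "C002", "phone": "303.555.0199"},
    {"customer_id": "C003", "phone": "n/a"},
]
sales_by_id = {"C001": 250.0, "C002": 75.5}


def standardize_phone(raw):
    """Find pattern + standardize: keep only digits, format 10-digit numbers."""
    digits = re.sub(r"\D", "", raw)  # strip everything that is not a digit
    if len(digits) != 10:
        return None  # not a phone number this sketch knows how to format
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def merge_sources(crm_rows, sales):
    """Filter out rows without a usable phone, then merge in sales totals."""
    merged = []
    for row in crm_rows:
        phone = standardize_phone(row["phone"])
        if phone is None:
            continue  # filter: skip rows that cannot be standardized
        merged.append(
            {
                "customer_id": row["customer_id"],
                "phone": phone,
                "total_sales": sales.get(row["customer_id"], 0.0),
            }
        )
    return merged


for record in merge_sources(crm, sales_by_id):
    print(record)
```

Whether to drop the unusable row, as done here, or keep it and flag it is a judgment call that depends on what the data will be used for.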

The last task, missing data handling, seems the most intriguing. Like the odd sock sucked into the dryer vortex, or the peekaboo the computer plays with specific emails that used to be there and can’t be found, most data sets have missing data items. That aspect alone could fill another post, but it presents an easy-to-understand concern about Big Data and data lakes. It seems too good to be true – placing all these disparate silos in one place and getting to use them for a collective picture.
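As a small illustration of what handling missing data can look like, the Python sketch below shows two common options: dropping incomplete rows, or filling the gap with the median of the known values and marking the row as filled in. The store names, the field name and the sample numbers are made up, and the sketch assumes at least one known value exists for the field.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
readings = [
    {"store": "A", "daily_sales": 120.0},
    {"store": "B", "daily_sales": None},
    {"store": "C", "daily_sales": 95.0},
    {"store": "D", "daily_sales": 143.0},
]


def drop_missing(rows, field):
    """Option 1: keep only the rows where the field is present."""
    return [r for r in rows if r[field] is not None]


def fill_missing(rows, field):
    """Option 2: replace missing values with the median of the known ones.

    Each row gets an 'imputed' flag so filled-in values stay visible.
    """
    fill = median(r[field] for r in rows if r[field] is not None)
    return [
        {**r, field: fill if r[field] is None else r[field], "imputed": r[field] is None}
        for r in rows
    ]


print(len(drop_missing(readings, "daily_sales")))  # 3
print(fill_missing(readings, "daily_sales")[1])
```

Neither option is right in general: dropping rows loses information, while filling them in adds values that were never observed.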

The answer is yes, it is possible, and the BUT is cleaning data takes resources. The post-script is that it’s not a binary decision. Clean it “enough” to do what you want.

Munging didn’t make the original Glossary of Big Data Terms on What’s The Big Data Idea, so it’s being added with this post.

Big Data Implementation in the Middle



Implementing Big Data in your organization requires good old-fashioned MBA basics – strategic moves need a top-level leadership champion who creates grassroots determination. An article from The Economist Group develops this concept, but it also brings up a more discerning point. Although the schism between generations isn’t new, the digital perspective is pertinent to Big Data implementation.

Middle Ground

Digital age technicians grew up in a world of open source. Leadership and, more discouragingly, middle and top managers most likely did not. That middle stratum is significant to making Big Data (or any data initiative) successful.

That group has most likely developed their careers with the sense that owning information is power and sharing information weakens your position. Even if leadership has a grand plan, addressing that monolithic fence is critical to making the transition happen.

As the Data Goes …

Big Data strategy takes silos and drops them in a lake for better utilization and specific use. Old data isn’t thrown away; it’s stored more efficiently and effectively to give the new data perspective and edge. The new data though is the future and the capability necessary to progress.

… so do the People

The data re-organization exemplifies how the people within can follow suit. The techies, the data creators, the data users, the managers: all still fall within the structure of the organization, but their relationships to each other are going to shift. That’s not at the expense of cutting out the middle man. That’s not pushing the data stuff off to the data people to do. It’s also not everyone holding hands and singing kumbaya every morning. This is everyone doing what they do best … and learning new capabilities as a group.

Stepping UP

This is the stuff of leadership. The shift should be visible and active to the organization as more than a fad. The change is critical to survivability of the entity. This is not a ship taking on water though; it is rigging to go faster – keep up with the competition or leave them sitting on the horizon!

4 Top Reasons Big Data Implementation Projects Fail



Big Data isn’t a magic wand that fixes all of an organization’s data challenges, and Aladdin’s lamp isn’t going to make it happen either. Big Data implementation failures again fall back to MBA basics. Here are the Big Four challenges.

Who’s in Charge Here?

The boss’s pet project? Or worse, the boss’s boss finds Big Data interesting so the boss finds it fascinating. The implementation should begin with the Go Big or Go Safe® decision and stick to it. Overambition is the first reason Big Data or any new technology implementation fails. Without specific direction for the scope, any project is set up for failure. Focus on creating a win for the organization and repeating or growing the capability from the lessons learned in the organization’s frame of reference.

And Why Are We Here Again?

Even if leadership has set a defined course (“You WILL do this.”), without clear goals it’s hard for the team to hit the target. Be realistic about what those first targets are. Like surfing the internet, it’s easy to get lost following threads. Big Data is like a blue ocean of possibility once the new capability is created. Having specific goals also aligns resources to meet those markers.

Everyone’s Doing It (Big Data)

NO jumping off a cliff. Doing Big Data for the sake of getting in on the latest trend is a dangerous move. If everyone’s looking at each other for the right action to take, the project is going to fail. Unrealistic expectations fail. The Big Data implementation needs a champion in leadership to ensure the right resources and motivation are pushing the ball downfield.

Out of Control

Without boundary control, the project easily slips outside the margins. A structured implementation with consistent feedback enables better reaction and adjustment for keeping the project within the lines. Again, make scalable victories. This serves both short-term wins and long-term survivability.



Go BIG – How to Begin using Big Data with a new Data Strategy



How to begin with Big Data continues with Part 3 – Go Big.  The alternative to Go Safe(r) is Go Big.  Going Big is refocusing the entire data strategy of the organization.

Changing the data structure of an organization takes a great dedication of time and resources.  Why would an organization choose to take such risk?

How to Begin with Big Data – Part 3 – Go Big

Go Safe(r) – How to Begin Using Big Data with Part of your Organization



The biggest decision for how to begin with Big Data in your organization is choosing between Go Safe(r) and Go Big.  Going Safe(r) means beginning with a portion of the organization’s data or using a specific business unit within the organization and not the entire data warehouse.

Going Big is a foundational shift in the way the organization stores and utilizes data.

Part 2 of Beginning with Big Data in your Organization – Go Safe(r)

Big Data is not JUST for Data Scientists and Experts



In “bringing Big Data to the people” for this blog, I believe that Big Data is working its way into everyday life and that means working toward everyone utilizing it.  Big Data isn’t just for the cool people (IT geeks).

The future of Big Data belongs to everyone so take advantage of its benefits and realize its shortfalls.

In its Jan 14th 2015 post, Dataconomy reports on the 8 trends in Big Data for 2015. Number Three is what I am explaining.

Self Service Big Data

As the analytics software becomes easier to use, and companies educate their employees on some basic techniques, 2015 will be the year of accessible Big Data. It will no longer be some mysterious activity spitting out conclusions of questionable origin. People will understand where the data has come from, they will understand how to manipulate it and what it means for their particular part of the business. If they have the right tools, anyone will be able to draw their own insights.

That doesn’t mean data scientists aren’t needed. Quite the contrary, data scientists are rare and highly valuable professionals. Finding and keeping data scientists who understand an organization’s values and goals is fast becoming a survival asset.

Check out the rest of the trends at Dataconomy.

Big Data Future and Past Stats



Source: floydworx

In honor of Apple’s achievements noted earlier this week, here’s more on just how much more data capability we have now, looking back on where we were.  (Or The Way We Were if you really want to go back in time.)

How did we make it to the moon and back without digital capability?


Source: floydworx


Through the lens of Big Data, where Data Lakes are less expensive than data warehousing, here are some great data points emphasizing how far prices have dropped.

These AMAZING infographic shots are taken from floydworx. Check out the entire fantastic work.


