What not Why of Big Data
Bringing Big Data to the People (Part 5 of 6)
What Not Why – Not Your Mother’s Scientific Method
What Not Why is a mental shift that accompanies the 3 Vs of Big Data. Big Data consumes great volumes of a variety of data and reports “what” is in that data. It tells you what is happening, but not why. The “answer” Big Data gives is not “why” but “what”.
Walmart, Hurricanes & Pop Tarts
For example, Walmart was a leader in accumulating data well before Big Data truly emerged. Product placement is critical for profit margins. When Walmart began mining that data, one correlation it found was that before a hurricane, people stocked up not only on batteries but also on Pop Tarts.
Unlike this Big Data example, the traditional scientific method would start with a hypothesis, such as “when a hurricane is coming, people buy ________”. A representative data sample would be selected. The test would be run with one product and then repeated with others until a positive result (one supporting the hypothesis) indicated what was bought prior to a hurricane.
Big Data does not need a carefully chosen sample to prove or disprove an idea. As in the Walmart example, studying the entire data set provides a result without a preconceived notion of what the “answer” should be. Big Data scientists look for what the data tells them, not whether or not their hypothesis holds up.
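To make the contrast concrete, here is a minimal Python sketch of the "what, not why" approach: instead of testing one product at a time, it scans every product and ranks them by how much sales rise before a storm. The sales figures, product names, and function names are all made up for illustration; this is not Walmart's data or method.

```python
from collections import defaultdict

# Made-up daily unit sales: (product, is_pre_hurricane_day, units).
# Illustrative numbers only, not real retail data.
SALES = [
    ("batteries", False, 100), ("batteries", False, 120), ("batteries", True, 330),
    ("pop_tarts", False, 50), ("pop_tarts", False, 50), ("pop_tarts", True, 350),
    ("shampoo", False, 80), ("shampoo", False, 80), ("shampoo", True, 80),
]

def pre_storm_lift(records):
    """Return {product: mean pre-hurricane units / mean normal-day units}."""
    # For each product, keep [sum, count] for pre-storm and normal days.
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for product, pre_storm, units in records:
        bucket = totals[product][pre_storm]
        bucket[0] += units
        bucket[1] += 1
    lifts = {}
    for product, by_flag in totals.items():
        pre_sum, pre_n = by_flag[True]
        norm_sum, norm_n = by_flag[False]
        # Skip products that lack data on either kind of day.
        if pre_n and norm_n and norm_sum:
            lifts[product] = (pre_sum / pre_n) / (norm_sum / norm_n)
    return lifts

def rank_products(records):
    """Rank every product by lift, highest first, with no hypothesis up front."""
    lifts = pre_storm_lift(records)
    return sorted(lifts.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for product, lift in rank_products(SALES):
        print(f"{product}: {lift:.1f}x")
```

Nothing in the code names a product in advance. The ranking says what sells more before a storm, and it says nothing about why.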
Does Walmart know “why” people buy Pop Tarts before a hurricane? Maybe or maybe not, but they do make sure to stock them near the front.
The scientific method and hypothesis testing of data sets have always required math. So can we forget about probability and statistics now?