Big Data addresses:
Why should the ancient practice of the scientific method be questioned?
As individuals in society, we hold others in regard for accomplishments that give them authority, such as a doctor for a medical degree. With the internet at our fingertips, we have gained access to ever-greater amounts of information and learned some skepticism along the way, yet we still retain some sheep mentality.
Goldacre points out that we still retain an awe of authority. With a simple example, he explains how an authority can be accepted by a large, popular audience even when that authority is actually less than ideal.
With the ubiquity of the internet, authority will only continue to be an issue for any organization and for society at large. Big Data is more of an open approach, often built on open source platforms, that involves creating data lakes. These pull together what currently sits in an organization's data silos or, in the case of drug efficacy, behind corporate secrets.
Goldacre expounds upon how cause-and-effect studies are “published” with basic flaws in even the simplest cases. The testing environment does not accurately, or sometimes even remotely, reflect the conditions under which the touted results are supposed to hold. In addition, the plethora of factors involved is rarely accounted for. The test sample sets are representative of general or specific populations, but are they representative of YOU?
Because Big Data is able to consume a vast variety of data, not adhering to the strict control methods of the traditional scientific method frees the data to more readily present a viable pattern. Trying to hold all other variables constant in a scientific experiment is challenging at best and completely unrealistic in practice at worst. (In real life, you can’t hold all of a scientific experiment’s environmental factors constant to obtain the same favorable results.)
Goldacre then somberly explains that these simple examples are just that – simple. The drug studies behind doctors’ “knowledge” of treating YOU and society rest upon far more complex … and jaded processes.
Our beliefs and expectations of a drug’s efficacy shape the outcome. He gives several examples of how data is effectively rigged to produce a carefully prepared outcome, making the result look … like what they want you to see.
One of the premises of Big Data is finding patterns in the data, not looking to prove or disprove a theory. Therefore, trying to rig an outcome in one direction or the other is not a Big Data practice.
(…so would a drug company ever want to use it?)
Goldacre’s final, sobering point was actually the jumping-off point for his next TED Talk on how drug trials have dangerously biased results.
Missing data is one of the greater challenges to Big Data execution. Several methods are in practice to compensate for gaps such as null values or incongruous data sets. The difference with Big Data is that it readily addresses missing data, as opposed to discounting it as in the examples Ben Goldacre explains. Because Big Data involves huge volumes of data points, the missing data compensation practices can more readily present an accurate representation of the information.
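To make the contrast between discounting missing data and compensating for it a little more concrete, here is a minimal sketch in Python. It assumes a made-up list of readings in which None marks a gap, and it uses mean imputation as just one simple compensation method; the data and the function names drop_missing and impute_mean are hypothetical and only there for illustration.

```python
from statistics import mean

# Hypothetical readings; None marks a missing value.
readings = [4.0, None, 6.0, 8.0, None, 2.0]


def drop_missing(values):
    """Discard missing entries entirely, so the gaps are simply ignored."""
    return [v for v in values if v is not None]


def impute_mean(values):
    """Fill each missing entry with the mean of the observed entries."""
    observed = drop_missing(values)
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


dropped = drop_missing(readings)
imputed = impute_mean(readings)

# Dropping shrinks the data set; imputing keeps every record in place.
print(len(dropped), len(imputed))
print(imputed)
```

Mean imputation keeps every record but also flattens the variation in the data, so it is a starting point rather than a cure. Whichever method is used, the point is that the gap is handled openly instead of being quietly left out.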