The One Hidden Skill You Need to Unlock the Value of Your Data - the scientific method
2. Computer Engineering
3. Business Acumen
The final group of skills reflects the need to approach problems with critical thinking, creativity and open-mindedness. I grouped this final set of skills under the label “scientific method.” Formally defined, the scientific method is a body of techniques for objectively investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge. The scientific method includes the collection of empirical evidence, subject to specific principles of reasoning. Specifically, the scientific method follows these general steps: 1) formulate a question or problem statement; 2) generate a hypothesis; 3) test the hypothesis through experimentation (when we can’t conduct true experiments, data are obtained through observations and measurements); and 4) analyze the data to draw conclusions.
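As a rough illustration of how those four steps might map onto an everyday analysis, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the business question, the function name `permutation_test`, and the order values, which are made-up numbers rather than real measurements. The permutation test is only one of many ways to test a hypothesis; it is used here because it needs nothing beyond the standard library.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean of b minus mean of a) and an
    estimated p-value under the null hypothesis of no difference.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_b) / n_b - sum(group_a) / n_a

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffle them and recompute the difference.
        rng.shuffle(pooled)
        diff = sum(pooled[n_a:]) / n_b - sum(pooled[:n_a]) / n_a
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# 1) Question (hypothetical): does a new checkout page change order value?
# 2) Hypothesis: mean order value differs between the two pages.
# 3) Test: collect order values for each page (made-up illustrative data).
old_page = [20.1, 22.4, 19.8, 21.0, 20.5, 23.1, 19.9, 21.7]
new_page = [23.0, 24.2, 22.8, 25.1, 23.9, 22.5, 24.8, 23.4]

# 4) Analyze: compare the observed difference with what chance alone produces.
observed_diff, p_value = permutation_test(old_page, new_page)
print(f"observed difference: {observed_diff:.2f}, p-value: {p_value:.4f}")
```

A small p-value here would only say that the difference is unlikely to be due to chance in how the orders were split; deciding what that means for the business still takes the judgment the steps above call for.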
These steps are not meant to imply that science is only a series of activities. Rather than thinking of science as an area of knowledge, it is better to conceptualize it as a way of understanding how the world really works. As Carl Sagan said, “Science is a way of thinking much more than it is a body of knowledge.” The scientific method requires not only adherence to rules but also creativity and imagination in order to find new possibilities, address problems in different ways and apply findings from one setting to another. In separating signal from noise, data scientists are truly engaged in an exercise in uncovering reality.
I believe that the scientific method plays a critical role in understanding any data, irrespective of their size, speed or variety. Despite claims that Big Data will kill the need for theory and the scientific method, the human element is necessarily involved in the generation, collection and interpretation of data. As Kate Crawford points out in a thoughtful article, “The Hidden Biases in Big Data,” data do not speak for themselves; humans give data their voice; people draw inferences from the data and give data their meaning. Unfortunately, people introduce bias, intentional and unintentional, that weakens the quality of the data.
Additionally, I highlighted a few ways that the scientific method can help improve the veracity (validity) of data. To be of real, long-term value to business, Big Data needs to be about understanding the causal links among the variables. Hypothesis testing sheds light on why variables are related to each other and on the underlying processes that drive the observed relationships. Through trial and error, it also improves analytical models by identifying the causal variables, and it helps you generalize your findings across different situations.
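To make the point about causal links concrete, here is a small simulation sketch in Python. It is an illustration under stated assumptions, not an analysis of real data: the variables `x`, `y` and the hidden driver `z` are hypothetical, and all values are randomly generated. In the "observational" data, `x` and `y` are correlated only because `z` drives both; in the "experimental" data, `x` is assigned at random, which is one way an experiment can test whether `x` really influences `y`.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 5000

# Hidden driver z influences both x and y; x itself has NO effect on y.
z = [rng.gauss(0, 1) for _ in range(n)]

# Observational data: x and y are both recorded as they naturally occur.
x_observed = [zi + rng.gauss(0, 1) for zi in z]
y_observed = [2 * zi + rng.gauss(0, 1) for zi in z]
r_observational = pearson(x_observed, y_observed)

# Experimental data: x is assigned at random, independently of z.
x_assigned = [rng.gauss(0, 1) for _ in range(n)]
y_experiment = [2 * zi + rng.gauss(0, 1) for zi in z]
r_experimental = pearson(x_assigned, y_experiment)

print(f"observational correlation: {r_observational:.2f}")
print(f"experimental correlation:  {r_experimental:.2f}")
```

The observational correlation is substantial even though `x` does nothing to `y`, while the correlation under random assignment sits near zero. A model built on the observational relationship alone would fail to generalize to a setting where `x` is changed deliberately.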