Big data works best when small data analysis is its sidekick
Mo Data stashed this in Analysis Tips and Tricks
Big data has the potential to yield business insights that give organizations an edge over their competition. Despite all the positive changes that data-driven decision making has brought, however, not all insights are accurate.
Many have been heard saying, “let the data speak for itself,” but there are sampling biases and causation pitfalls that you should be aware of when you’re working with big data.
Sampling errors reflect the difference between a sample and the population it is drawn from. Through big data we can monitor the habits and actions of those who feed us information. Unfortunately, big data can overlook those who are not sending out information, or who don’t have the resources to send it.
Platforms such as Facebook, Twitter and Google are great sources for gathering potentially high-impact data. However, data pulled from these platforms don’t necessarily represent the entire population, only the smaller fraction of it that has access to these platforms.
This problem was demonstrated in an experiment with drivers and potholes in Boston a few years ago. Apps on smartphones sent immediate information to analysts about pothole locations throughout the city, but the data left out the lower-income areas of the city, which still had potholes but whose residents did not have access to smartphones.
As the example above demonstrates, big data sampling doesn’t take context into account, and can often ignore the bigger picture. The information we receive from big data can be transformative, but it is easy to forget that correlation does not always imply causation.
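To make the sampling problem concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the neighborhood names, the pothole counts, the smartphone rates, and the helper names `expected_reports` and `reweighted_estimate` are hypothetical and do not come from the Boston experiment. It assumes each pothole is reported with a probability equal to the share of drivers who have the app, and shows how two areas with the same number of potholes can look very different in the raw reports.

```python
from dataclasses import dataclass


@dataclass
class Neighborhood:
    name: str
    potholes: int           # true number of potholes (hypothetical)
    smartphone_rate: float  # share of drivers with the reporting app (hypothetical)


def expected_reports(n: Neighborhood) -> float:
    """Expected reports if each pothole is reported with probability smartphone_rate."""
    return n.potholes * n.smartphone_rate


def reweighted_estimate(reports: float, smartphone_rate: float) -> float:
    """Scale reports by 1 / coverage to estimate the true pothole count."""
    return reports / smartphone_rate


# Two areas with the SAME true number of potholes but different app coverage.
neighborhoods = [
    Neighborhood("higher-income", potholes=100, smartphone_rate=0.8),
    Neighborhood("lower-income", potholes=100, smartphone_rate=0.2),
]

for n in neighborhoods:
    reports = expected_reports(n)
    corrected = reweighted_estimate(reports, n.smartphone_rate)
    print(f"{n.name}: reported ~{reports:.0f}, reweighted ~{corrected:.0f}, true {n.potholes}")
```

The reweighting step only works if you already know the coverage rate for each area, and that is exactly the kind of context the raw data stream does not carry on its own.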
Big data deals with the quantity of information. Small data, on the other hand, deals with the qualitative details of the information.
We, as storytellers and data analysts, give big data meaning. We are responsible for correctly interpreting and understanding the story the numbers are telling.
Big data works best when small data analysis is its sidekick.