
When thinking about the big data era, what are some statistical ideas we've already figured out?

10 things statistics taught us about big data analysis | Simply Statistics

If the goal is prediction accuracy, average many prediction models together.

When testing many hypotheses, correct for multiple testing

When you have data measured over space, distance, or time, you should smooth

Before you analyze your data with computers, be sure to plot it 

Interactive analysis is the best way to really figure out what is going on in a data set

Know what your real sample size is.

Unless you ran a randomized trial, potential confounders should keep you up at night

Define a metric for success up front

Make your code and data available and have smart people check it 

Problem first, not solution backward
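To make one of these rules concrete, here is a minimal sketch of the "correct for multiple testing" advice. The list does not say which correction to use; the Benjamini-Hochberg procedure is swapped in here as one common choice. The function name `benjamini_hochberg`, the `alpha` parameter, and the p-values are all made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected
    under the Benjamini-Hochberg procedure at false discovery rate alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject every hypothesis whose rank is at most largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from eight tests (assumed, not real data).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]

# Naive approach: compare each p-value to 0.05 on its own.
naive = [p <= 0.05 for p in p_values]
# Corrected approach: control the false discovery rate across all tests.
corrected = benjamini_hochberg(p_values, alpha=0.05)

print(sum(naive), "rejections without correction")
print(sum(corrected), "rejections after Benjamini-Hochberg")
```

With these made-up numbers, several results that look significant one at a time no longer pass once the whole batch of tests is taken into account, which is the point of the rule.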

Stashed in: Big Data!, Correlation is not causation.


Great use of an animated gif. 

And also, good rules of thumb.
