When data scientists are under pressure to produce results that favor their employer's or client's hypothesis...

Sin #1: Cherry Picking

This is where a data scientist includes only data that confirms a particular position and ignores evidence of a contradictory position. "I see this all the time," Walker said.

Sin #2: Confirmation Bias

This is where researchers favor data that confirms their hypothesis.

"When you're dealing with very large data sets, you're going to find more relationships, more correlations," said Walker. That can lead to mistaking correlation for causation, especially in environments with high causal density, where many interacting factors influence the outcome.
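Walker's point about large data sets can be illustrated with a small simulation. The sketch below is not from the article: it assumes purely random, independent columns, and the function names (`pearson`, `count_spurious`), the 30-row sample size, and the 0.4 threshold are arbitrary choices made for illustration. Because no real relationship exists between any two columns, every "strong" correlation it counts is spurious.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(num_vars, num_rows=30, threshold=0.4, seed=0):
    """Count column pairs whose |r| exceeds the threshold.

    Every column is independent Gaussian noise, so any pair that
    passes the threshold is a coincidence, not a real relationship.
    (threshold and num_rows are illustrative assumptions.)
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0, 1) for _ in range(num_rows)]
        for _ in range(num_vars)
    ]
    return sum(
        1
        for a, b in itertools.combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


# More variables means more pairs to compare, and more chance hits.
for num_vars in (5, 20, 80):
    num_pairs = num_vars * (num_vars - 1) // 2
    hits = count_spurious(num_vars)
    print(f"{num_vars} variables, {num_pairs} pairs, {hits} spurious correlations")
```

The number of pairs grows roughly with the square of the number of variables, so the count of chance correlations grows with it even though the data contains no signal at all.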

Sin #3: Data Selection Bias

"This means the skewing of data sources," Walker said. "A lot of times [data scientists] fool themselves in this regard."

Sin #4: Narrative Fallacy

"A lot of data scientists feel the need to fit a story into connected or disconnected fact," said Walker. "So they come up with a story, and then they go looking for data that they can plausibly interpret to fit that story."

Sin #5: Cognitive Bias

This is where a data scientist skews data to suit prior beliefs rather than relying on the evidence.

Ah, logical fallacies. The bane of every scientist's existence.
