Dealing With Data Garbage

Emre Soyer
2 min read · Nov 18, 2021

More information is better for decision making… ideally. But it can also make things worse.

When dealing with information, a famous notion is garbage in, garbage out: faulty inputs produce misleading outputs. But it’s important to acknowledge that not all garbage is the same. As in the process of recycling, different types require different approaches.

Here are three issues to check and consider:

1. Inaccuracy

Information can involve a variety of subtle reporting or measurement errors. For instance, success stories are often embellished and vary depending on who is telling them. If crucial details are incorrect, the resulting lessons suffer.

It becomes important to assess the nature of these errors and adjust the interpretation of the results accordingly. If the errors are considerable, decision makers might have to look for further details that represent the situation more accurately.

2. Bias

Even if data is accurate, certain parts of the information may be missing. For instance, survivors are visible but failures tend to remain hidden. There’s also plenty of data on outcomes but little on the processes behind them. And some important details are hard to measure in the first place, like customer satisfaction or employee loyalty.

So it becomes important to be aware of such omissions and then estimate their effects on the results. If certain aspects are completely ignored, decision makers need to uncover them to have a good idea about the actual picture.

3. Irrelevance

Even if data is both accurate and complete, the past might not represent the future. If there’s a dramatic change, the past can rapidly become obsolete. If there’s a lot of noise, the findings won’t be much help with predictions. And knowledge gained in hindsight can lead to overconfidence.

It’s important to allow for the possibility of data-based insights being irrelevant. Lessons may be data-approved, but that doesn’t mean that they won’t expire. And being able to analyze a situation in great detail after the fact doesn’t give decision makers oracle-like abilities to foresee upcoming shocks and changes.

To make things more complicated, these three issues — inaccuracy, bias, and irrelevance — aren’t mutually exclusive. They can be present simultaneously and lead to misperceptions while providing an illusion of data-based understanding.

There’s a famous quote: “Data is like garbage. You’d better know what you are going to do with it before you collect it.” But that’s not enough. To make better-informed decisions, we should also check for the types of garbage we are collecting.

We also publicly posted a version of this article on Psychology Today: https://www.psychologytoday.com/intl/blog/experience-studio/202110/3-types-data-garbage

Emre Soyer

behavioral scientist, co-author of The Myth of Experience