The fallacy of data

Photo by Becca McHaffie on Unsplash

I used to be a Product Manager for a Big Data analytics product. It took in a lot of data, analyzed it and showed personalized recommendations. That is, each visitor to the portal saw a set of recommendations hand-picked by the algorithm just for her; and since she saw things that she liked, she was more likely to buy more. We deployed it for one customer and the customer saw a 10% upsurge in sales. We had cracked it, right?

I believed it too, until one day when we were demoing the software to a prospective client. I proudly touted the upsurge in sales seen by the previous client and thought I’d have him eating out of my hands, now that I had floored him with real-life stats. Instead, the client asked me, ‘Couldn’t we get the same upsurge from, say, any set of recommendations? How do you know that the upsurge in sales was because of your “personalized” recommendations?’ Until then, no one had asked me that. I hadn’t asked myself that, and I hadn’t asked my team either. I mean, a 10% upsurge was verifiable data, right?

In other words, the customer was asking us to consider 3 different scenarios:

A: show visitors no recommendations

B: show visitors random recommendations

C: show visitors personalized recommendations

While I was saying that C led to more sales than A, the client was saying that even B could possibly lead to more sales than A. The real question, then, is: is C better than B, and by how much? Only if C is better than B can we say that personalization works. If C isn’t better than B, then our product didn’t really do anything significant.
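To make the client's question concrete, here is a minimal sketch of how the three scenarios could be compared. The visitor and purchase counts are made up for illustration (I never had numbers for scenario B), and the helper names `relative_lift` and `two_proportion_z` are my own. The z-score is a standard two-proportion test; a value above roughly 1.96 is the usual rule of thumb for a difference that is unlikely to be chance.

```python
import math

def relative_lift(rate, baseline):
    """Relative improvement of `rate` over `baseline` (0.10 means +10%)."""
    return (rate - baseline) / baseline

def two_proportion_z(buyers_1, visitors_1, buyers_2, visitors_2):
    """z-score for the difference between two conversion rates."""
    p1 = buyers_1 / visitors_1
    p2 = buyers_2 / visitors_2
    pooled = (buyers_1 + buyers_2) / (visitors_1 + visitors_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_1 + 1 / visitors_2))
    return (p1 - p2) / se

# Hypothetical results: (buyers, visitors) for each scenario.
arms = {
    "A": (1000, 10000),  # no recommendations
    "B": (1080, 10000),  # random recommendations
    "C": (1100, 10000),  # personalized recommendations
}
rates = {name: buyers / visitors for name, (buyers, visitors) in arms.items()}

lift_c_vs_a = relative_lift(rates["C"], rates["A"])  # the number I was touting
lift_c_vs_b = relative_lift(rates["C"], rates["B"])  # the number the client wanted
z_c_vs_a = two_proportion_z(*arms["C"], *arms["A"])
z_c_vs_b = two_proportion_z(*arms["C"], *arms["B"])

print(f"C vs A: lift {lift_c_vs_a:.1%}, z = {z_c_vs_a:.2f}")
print(f"C vs B: lift {lift_c_vs_b:.1%}, z = {z_c_vs_b:.2f}")
```

With these invented numbers, C clearly beats A, but the gap between C and B is too small to tell apart from noise. That is exactly the situation the client was warning about: without the B group, the 10% says nothing about personalization.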

Quite frankly, I was stumped. I had no answer, and the rest of the meeting was a disaster. All my training and experience was in showing the data and stats that were there, not in looking for the data and stats that were not.

---

Photo by Gary Wann on Unsplash

In World War 2, Allied aircraft were getting battered by German guns. The commanders wanted to armor the planes so they would last longer in the war. Most commanders’ first reaction was to look at the planes that came back and armor them where the bullet holes were most numerous (the fuselage and wings). Abraham Wald, a statistician, looked at the same data but said that the planes should be armored where the bullet holes were fewest (the engine and cockpit).

His logic was that everybody else was looking only at the sample of survivors, and the survivors had survived despite the hits to the fuselage and wings. So it’s possible that the ones that didn’t survive were hit in the cockpit and engine, and that is where the armor needed to go.
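Wald’s reasoning can be illustrated with a toy simulation. This is only a sketch under assumptions I made up: every plane takes three hits, each hit lands on a random section, and a hit to the engine or cockpit is far more likely to bring the plane down than a hit to the fuselage or wings. The `LETHALITY` values are invented, not historical figures.

```python
import random

# Hypothetical chance that a single hit in each section downs the plane.
LETHALITY = {"fuselage": 0.05, "wings": 0.05, "engine": 0.60, "cockpit": 0.60}

def simulate(num_planes=10_000, hits_per_plane=3, seed=42):
    """Tally bullet holes by section, separately for returned and lost planes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sections = list(LETHALITY)
    returned = dict.fromkeys(sections, 0)
    lost = dict.fromkeys(sections, 0)
    for _ in range(num_planes):
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane is lost if any one of its hits turns out to be fatal.
        downed = any(rng.random() < LETHALITY[s] for s in hits)
        tally = lost if downed else returned
        for s in hits:
            tally[s] += 1
    return returned, lost

returned, lost = simulate()
print("holes seen on returning planes:", returned)
print("holes on planes that never came back:", lost)
```

Even though hits are spread evenly across the sections, the returning planes show mostly fuselage and wing holes, simply because planes hit in the engine or cockpit rarely make it back to be counted.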

---

The point of this post is that very often we think we are making a rational, data-based decision, but we are blind to data and stats that may be more important and are simply not presented. Here are a few more examples:

> Resumes are always made to impress and to hide the flaws. Therefore, it’s always good to start interviews with the mistakes candidates have made rather than their accomplishments. If I ever take on clients, I plan to show my mistakes first and then my few accomplishments. If somebody still trusts me, I think the advisor-client relationship will be more meaningful. Just like Warren Buffett.

> Sales presentations always, always, always show successful deployments. I know this because I have made a million presentations and in not one of them have I put anything even mildly unflattering to my team or my employer. Not one. If you looked at my presentations, you’d think that we always did a stellar job, on time, with minimal bugs. It was true, but only if you considered just the case studies.

> If you are looking to deploy millions of dollars, always ask to be shown the dirty details before the dazzling ones. If you hire one set of consultants to tell you why you need to outsource more, hire another set to tell you why you shouldn’t. Instead of asking ‘why should I buy from you?’, ask ‘why should I not buy from you?’

> Here is another life saver. You could probably score more in Math or Science if you double-check your workings with a mindset of finding mistakes rather than a mindset of making sure everything is alright.

We have a natural tendency to count on what’s presented and discount what’s absent. So don’t just take data and stats at face value; instead ask, ‘What important data or stat am I not seeing?’ If you are a customer or a recruiter or someone in a bargaining situation, demand to see what is not being shown. If you wish to be a better decision maker, look at the world of data upside down, like Abraham Wald. The insights you get may blow you away!

Happy decision making!
