I work in a fun office full of brilliant people that plays morale games on occasion.  Today, in honor of St. Valentine’s Day, we had a contest to guess how many candy hearts were in a jar:

The guesses were as follows:

Straight away, we see some data quality issues for the purposes of this article:

• My boss gave two irrelevant responses.
• A strategically minded business man came behind me, and knowing I was a data scientist, openly performed a “The Price is Right” tactic and simply rendered my answer plus 1. This is not an independent estimate but rather a derivative based on false confidence in a single guess from a wrongly attributed expert.
• Contestant AD answered twice, contributing two sources of error. For the purposes of this article, we shall presume the second supersedes the first, reducing this individual’s contribution to a single record.

Tidying the data and calculating the average percentage error (APE) results in the following:

The actual number of candies in the jar, after the submission period for entries had closed, was found to be 815. My boss did not actually eat any, he just thought he was being funny.

The winner was Contestant P, who guessed 786, a 3.46% absolute percent error, without going over the actual count. Contestant AF had an even more accurate estimate at 830, representing a 1.84% absolute percent error, but was disqualified for going over the actual count (a rule not explicitly stated).

Now the stunning part:

The majority of records (20 of 39) were at least 40% off. Two were over 100% off (120.86% and 145.40%, respectively). Yet, on average, as a cohort of very wrong individuals, we produced an average that was only 17.01% off. Our collective guess of 676.39 was superior to the guesses of 84.1% of cohort members (including my own).

This is the effect called “The Wisdom of the Crowd”. All things being equal, some of us were under, some of us were over…and thus we cancelled out each other’s contributed error. The remaining residual error then averaged equally over all 39 of us, making each member’s effective error less that the error that 84.1% of us actually contributed.

Here then, is a statistical case for teamwork, and arguably, democracy. Where many voices are equally valid, and independent, treating each as equal will get a group closer to the truth than most of the individuals can by guessing alone. Is it possible for an individual to be closer to right? Sure. 15.9% of ours were, in fact. But before the reveal, we had no way of knowing whose guesses were closest, because all guesses were equally valid.