The Average That Isn't

In 1973, the University of California, Berkeley worried it was exposed to a sex discrimination lawsuit after graduate admissions data showed that men were accepted at a significantly higher rate than women - 44% versus 35%. The numbers seemed damning. Then statistician Peter Bickel and his colleagues Eugene Hammel and J. William O'Connell looked more carefully. When they broke the data down by department, women were admitted at equal or higher rates than men in most departments. The aggregate average had been hiding a completely different reality: women had disproportionately applied to the most competitive programs, while men clustered in the easier-to-enter ones. The apparent bias against women was not discrimination by the departments. It was a statistical artifact.

This is your first lesson in statistical thinking: the average you are handed is almost never the average you need.

Three Ways to Find the Middle

When someone tells you the "average," they are making a choice - usually without telling you they made it. There are three different measures that all answer the question "what is typical," and they can give you wildly different answers from the same dataset.

The mean is the one you learned in school: add everything up, divide by how many things there are. It is powerful because every single value contributes to the result. It is also fragile for exactly the same reason. One extreme value - one CEO, one billionaire, one outlier - can drag the mean far from where most of the data actually lives.

The median is the middle value when you line everything up in order. Half the data is above it, half below it. It ignores how extreme the extremes are. If ten people in a room earn $40,000 a year and one earns $4,000,000, the median salary is $40,000. The mean salary is $400,000, ten times what anyone but the top earner makes. Both numbers are mathematically correct. Only one of them tells you what most people in that room experience.
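The room example is easy to check directly. The following is a minimal sketch using Python's standard statistics module; the variable names are illustrative choices, not part of the original example.

```python
from statistics import mean, median

# Ten people earning $40,000 a year and one earning $4,000,000,
# as in the room described above.
salaries = [40_000] * 10 + [4_000_000]

room_mean = mean(salaries)      # total of all salaries divided by 11 people
room_median = median(salaries)  # the 6th of the 11 salaries once sorted

print(f"mean:   ${room_mean:,.0f}")
print(f"median: ${room_median:,.0f}")
```

Removing the single high earner from the list leaves the median unchanged and drops the mean to $40,000, which is one quick way to see how much one value is driving the result.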

The mode is the value that appears most often. It rarely gets mentioned in journalism or political speeches, but it is often the most honest number for decisions about real people. A clothing retailer does not need to know the mean shirt size or the median shirt size - they need to know the mode, the size most people actually buy.
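To make the retailer example concrete, here is a small sketch with made-up sales data. The sizes and counts are hypothetical; the point is only that the mode works on labels where a mean or median would not be defined.

```python
from collections import Counter
from statistics import mode

# Hypothetical shirt sales for one week; the sizes and counts are invented.
sizes_sold = ["M", "L", "M", "S", "XL", "M", "L", "M", "S", "L"]

# The mode is the most frequently occurring value.
most_common_size = mode(sizes_sold)

# A full tally shows how far ahead the mode is of the other sizes.
counts = Counter(sizes_sold)

print("most common size:", most_common_size)
print("tally:", dict(counts))
```

Shirt sizes are categories, not quantities, so there is nothing to add up or average. Counting is the only summary that applies here.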

What Skew Tells You

The relationship between these three numbers tells you something important about the shape of the data. When mean, median, and mode are roughly equal, the data is symmetric - think of heights or test scores, which cluster around the middle and tail off equally in both directions.

When the mean is higher than the median, the data is right-skewed: a small number of very high values are pulling the mean up, away from where most observations actually sit. Income data is the classic example. Wealth data is even more extreme. When you read that average household wealth in a country is $750,000, check whether that is the mean or the median - those two numbers tell completely different stories about the financial lives of most citizens.

When the mean is lower than the median, the data is left-skewed: a small number of very low values are dragging the average down. Age at retirement often looks like this: most people retire within a few years of each other, while a small number retire much earlier.
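The comparison between mean and median can be turned into a rough check. The sketch below is a rule of thumb under stated assumptions, not a formal skewness test: the function name skew_direction, the 5% tolerance, and all three sample datasets are hypothetical, and the mean-versus-median heuristic can mislead on unusual distributions.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.05):
    """Label data by comparing mean and median (a heuristic, not a formal test).

    The gap between mean and median is judged relative to the range of the
    data; the 5% default tolerance is an arbitrary illustrative choice.
    """
    m, med = mean(values), median(values)
    spread = max(values) - min(values)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "roughly symmetric"
    return "right-skewed" if m > med else "left-skewed"

# Hypothetical datasets, invented for illustration.
incomes_k = [30, 35, 40, 45, 50, 55, 400]        # one very high earner
retirement_ages = [45, 62, 64, 65, 65, 66, 67]   # one very early retiree
heights_cm = [160, 165, 170, 175, 180]           # evenly spread around 170

for name, data in [("incomes", incomes_k),
                   ("retirement ages", retirement_ages),
                   ("heights", heights_cm)]:
    print(f"{name}: mean={mean(data):.1f}, median={median(data)}, "
          f"{skew_direction(data)}")
```

A label like this is only a prompt to look at the full distribution. It does not replace plotting the data.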

Key Point: The word "average" without qualification is an incomplete statement. Before you accept any average as meaningful, ask which measure was used and whether the data is skewed. A mean in skewed data is like a GPS coordinate for a mountain range - technically accurate, practically useless.

The Berkeley Lesson Applied

The Berkeley case illustrates something beyond averages: the problem of aggregation. When you pool data across groups that are behaving differently, the combined number can reverse the pattern visible in each individual group. This is called Simpson's Paradox, and it appears everywhere - in hospital mortality rates, in educational outcome statistics, in sports performance data.

You do not need advanced mathematics to protect yourself from it. You need one habit: whenever you see an aggregate statistic that surprises you, ask whether it changes when broken down by a relevant subgroup. Often, the aggregate is the least informative number available.
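The habit of breaking an aggregate down by subgroup can be sketched in code. The numbers below are hypothetical and chosen to make the reversal obvious; they are not the Berkeley figures, and the department names and helper functions are illustrative.

```python
# Hypothetical admissions data, NOT the Berkeley figures.
# Each entry is (applicants, admitted).
applications = {
    "Dept A (less competitive)": {"men": (80, 48), "women": (20, 13)},
    "Dept B (more competitive)": {"men": (20, 4), "women": (80, 20)},
}

def rate(applied, admitted):
    """Admission rate as a fraction of applicants."""
    return admitted / applied

def aggregate_rate(group):
    """Admission rate for one group pooled across all departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applied, admitted)

# Within each department, women are admitted at the higher rate.
for name, dept in applications.items():
    print(f"{name}: men {rate(*dept['men']):.0%}, "
          f"women {rate(*dept['women']):.0%}")

# Pooled across departments, the pattern reverses.
print(f"Overall: men {aggregate_rate('men'):.0%}, "
      f"women {aggregate_rate('women'):.0%}")
```

In this made-up data the reversal happens because most women applied to the department with the lower admission rate. The pooled figure mixes "which department did you apply to" with "how were you treated once you applied."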
