A list of numbers (like the column of test scores in your spreadsheet) isn’t very useful in itself. Sure, you might be able to pick out some patterns just from scanning the numbers, but that isn’t a very efficient or accurate way to understand what’s going on, and it’s certainly tricky to compare the overall pattern in one list with the pattern in another.
As a teacher, you might want to know how your students (or specific groups of students) scored on a specific homework assignment overall, or how their grades are changing over time. Alternatively, you might want to compare how individual students are performing relative to one another on the whole.
In any of these cases, we’re looking for an “average.” There are three main ways that people find the average of a list of numbers: mean, median, and mode.
- The mean is the most commonly used measure of “average”; in fact, the two terms are often used interchangeably. It is calculated as the sum of all the numbers in the list divided by the length of the list. For example, to find the mean of the list “5, 3, 1, 4”:
  1. First, sum up all values in the list: 5 + 3 + 1 + 4 = 13.
  2. Next, count the number of values in the list: 4.
  3. Finally, divide the result of step 1 (sum) by the result of step 2 (count): 13 / 4 = 3.25.
- The median is simply the middle number in a sorted list. If there are two middle numbers, the median is halfway between those two. For example, to find the median of the list “5, 3, 1, 4” (same as before):
  1. First, sort the list: 1, 3, 4, 5.
  2. Next, identify the middle number or numbers. Since the example list contains an even number of values, there are two numbers in the middle: 3 and 4. (If we had only a single middle number, we could just stop here.)
  3. Since there are two middle values, calculate their mean: (3 + 4) / 2 = 7 / 2 = 3.5.
- The mode is simply the most commonly occurring number in the list. For numeric data like test scores, this measurement is usually far less useful than mean and median, so we won’t spend any more time on it.
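If it helps to see the steps above written out, here is a minimal sketch in Python. The lesson itself works in a spreadsheet, so the language choice and the function names `find_mean` and `find_median` are assumptions made purely for illustration.

```python
import statistics


def find_mean(values):
    """Mean: the sum of the values divided by how many values there are."""
    total = sum(values)   # step 1: add everything up
    count = len(values)   # step 2: count the values
    return total / count  # step 3: divide the sum by the count


def find_median(values):
    """Median: the middle of the sorted list (mean of the two middle values if the count is even)."""
    ordered = sorted(values)  # step 1: sort the list
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is a single middle number, so we can stop here.
        return ordered[mid]
    # Even count: take the mean of the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2


scores = [5, 3, 1, 4]
print(find_mean(scores))    # 3.25
print(find_median(scores))  # 3.5

# Python's standard library offers the same calculations ready-made.
assert find_mean(scores) == statistics.mean(scores)
assert find_median(scores) == statistics.median(scores)
```

Both functions assume the list has at least one value; an empty list would raise an error.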
If you want a wonderful, more rigorous explanation of these three along with some practice exercises, check out the masterful Sal Khan’s lessons on “Measures of Central Tendency”.
But how do I know which one to use?
I mentioned that the mean is the most commonly used way to calculate averages, so why not always use that? In general, that’s not a bad idea, as the mean does a nice job of taking into account the magnitude (or “largeness”) of values. And the mean gives you a very useful result if your data set looks like a bell curve (which it normally does—no pun intended).
Of course, data is often messy, and so-called “outliers” that sit outside the pack can greatly skew the result of a mean calculation, giving you a less intuitive or useful result. But the median is far less likely to be affected. For example, recall our earlier example of the list “5, 3, 1, 4”. The mean and median we calculated were 3.25 and 3.5, respectively—values that are pretty close to one another.
But consider a list of four numbers where one number is far off from the other three, for example “5, 3, 1, 100”. In this case:
- The mean is sum / count, or (5 + 3 + 1 + 100) / 4 = 27.25.
- The median is the mean of the middle two numbers from the sorted list (3 and 5 in the list “1, 3, 5, 100”), or (3 + 5) / 2 = 4.
Unlike the previous example, there’s a pretty big difference between the mean and median in this case. And you’ll see a similar pattern in general when your data set includes outliers. So as a general rule:
Use median instead of mean when the list of values includes a few numbers that are much larger or much smaller than most of the others.
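To see the effect of an outlier for yourself, here is a small Python sketch that runs both example lists through the standard library's `statistics` module. The helper name `summarize` is hypothetical, and this is only one way to lay out the comparison.

```python
import statistics


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)


examples = {
    "no outlier": [5, 3, 1, 4],
    "with outlier": [5, 3, 1, 100],
}

for label, values in examples.items():
    mean, median = summarize(values)
    gap = abs(mean - median)
    print(f"{label}: mean={mean}, median={median}, gap={gap}")

# no outlier: mean=3.25, median=3.5, gap=0.25
# with outlier: mean=27.25, median=4.0, gap=23.25
```

Swapping a single 4 for 100 moves the mean by 24 points but the median by only half a point, which is the reason the median is the safer choice when a few values sit far from the rest.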
Exercise: Mean or median?
Click this link to see two groups of student data. Determine whether calculating the mean or the median test score would be more appropriate for each of the two data sets. To let you practice without being tempted by the correct answers, we’ll start off the next lesson with an answer key.
- Mean and median are the best ways we have to describe the middle of our data.
- Use mean most of the time, but median when your data includes outliers.
- Impress your friends by saying “mean” instead of “average” from now on!
As with any summary statistic, mean, median, and mode don’t tell you everything about a list of numbers, but they do give you a compact way to summarize it. We’ll take a look at doing these calculations in the spreadsheet in the next lesson.