Regression to the mean applies whenever two measurements aren’t perfectly correlated.
What does that mean? It will be clearer with an example.
Example of regression to the mean: Heights of parents and children
Regression to the mean was first noticed by Francis Galton in the late 19th century. He observed that very tall fathers tended to have sons who were shorter than they were. He thought this would lead to mediocrity all around, and used it to buttress his rather horrid ideas on eugenics. His interpretation was all wrong, but his observation was correct. The sons of the tallest men will usually be shorter than their fathers (and the same applies to daughters, and to the average of the parents’ heights). Similarly, the offspring of very short parents will tend to be taller than their parents. But the sons of the tallest men will still be taller, on average, than the sons of the shortest men.
And yet, after many, many generations of humans, we’re still very diverse in height. If regression to the mean happens, generation after generation (and it does), then why aren’t we all the average height? More on this later.
But the reason there is regression to the mean from parents’ height to offspring height is that, while parent height and child height are correlated, they aren’t perfectly correlated.
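The height example can be sketched as a small simulation. This is only an illustration under stated assumptions: the mean of 175 cm, the standard deviation of 7 cm, and the father-son correlation of 0.5 are made-up round numbers, not Galton's data, and the function names are hypothetical.

```python
import random
import statistics


def simulate_heights(n=20000, mean=175.0, sd=7.0, r=0.5, seed=1):
    """Return n (father, son) height pairs in cm whose correlation is about r.

    The mean, standard deviation, and r are illustrative assumptions.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z_father = rng.gauss(0, 1)
        # The son shares a fraction r of the father's deviation from the
        # mean; the remainder is independent of the father.
        z_son = r * z_father + (1 - r * r) ** 0.5 * rng.gauss(0, 1)
        pairs.append((mean + sd * z_father, mean + sd * z_son))
    return pairs


def group_means(pairs, fraction=0.05):
    """Mean (father, son) heights for the shortest and tallest fathers."""
    ordered = sorted(pairs)  # tuples sort by father height first
    k = int(len(ordered) * fraction)

    def summarize(group):
        fathers = [f for f, _ in group]
        sons = [s for _, s in group]
        return statistics.mean(fathers), statistics.mean(sons)

    return {"shortest": summarize(ordered[:k]), "tallest": summarize(ordered[-k:])}


pairs = simulate_heights()
result = group_means(pairs)
for label, (father, son) in result.items():
    print(f"{label} fathers average {father:.1f} cm; their sons average {son:.1f} cm")
```

With a correlation below 1, the sons of the tallest fathers should come out shorter than their fathers on average, yet still taller than the sons of the shortest fathers. Setting r=1.0 should make the regression disappear.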
Example of regression to the mean: Test scores
Regression to the mean can also occur within a single population. In fact, it always does. But sometimes it’s very minor.
When we measure anything, we measure it with error. Some things (such as height) we can measure quite accurately, but not perfectly. If I get myself measured today and tomorrow, the two heights won’t match perfectly. Maybe I stood straighter one day, or maybe the person measuring me slipped, or whatever. But the two measures will be very close. If I measured myself every day for a year, then compared the tallest measures to those on the next day, the ones on the next day would tend to be smaller. But not by much. Regression to the mean will be minimal.
But some things are harder to measure. Take any human ability. If we test an ability (no matter WHAT type of test we use, and no matter WHAT ability) we will measure with error. Usually quite a lot of error. So, if we give a test of (say) spelling to a group of students in September, and another test of spelling in October, the kids who did best in September will tend to not do quite as well in October; and those who did worst in September will tend to do somewhat better in October. That’s regression to the mean, and it doesn’t reflect ability. It reflects error. Not the type of error you make when you misspell a word, but statistical error.
Classical test theory expresses this with a formula:
O = T + E
where O stands for observed score, T stands for true score (that is, real ability) and E stands for error. Error is assumed to be random. If we give the test twice, and true score doesn’t change, then we have two equations
O1 = T + E1
O2 = T + E2
Notice that T doesn’t change, because I said true ability doesn’t change.
So, if E1 is large and positive, then O2 will tend to be lower than O1; and if E1 is large and negative, then O2 will tend to be higher than O1. That’s where regression to the mean comes from in a single population.
The same is true if T varies, but the math gets a bit complex.
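The two equations above can be simulated directly. The sketch below assumes, purely for illustration, that true scores and errors are normally distributed with a standard deviation of 10 points each, and that each student's T is identical on both tests. The function names are hypothetical.

```python
import random
import statistics


def simulate_two_tests(n=10000, true_mean=70.0, true_sd=10.0, error_sd=10.0, seed=2):
    """Return (o1, o2, e1) per student, with O1 = T + E1 and O2 = T + E2."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        t = rng.gauss(true_mean, true_sd)  # T: true score, unchanged between tests
        e1 = rng.gauss(0, error_sd)        # E1: September error
        e2 = rng.gauss(0, error_sd)        # E2: October error, independent of E1
        students.append((t + e1, t + e2, e1))
    return students


def summarize(group):
    """Mean September score, mean October score, and mean September error."""
    return tuple(statistics.mean(column) for column in zip(*group))


students = sorted(simulate_two_tests())  # sorted by September score (O1)
k = len(students) // 10
top = summarize(students[-k:])
bottom = summarize(students[:k])

print(f"top 10% in September:    O1={top[0]:.1f}  O2={top[1]:.1f}  mean E1={top[2]:+.1f}")
print(f"bottom 10% in September: O1={bottom[0]:.1f}  O2={bottom[1]:.1f}  mean E1={bottom[2]:+.1f}")
```

The students selected for high September scores should turn out to have had, on average, a positive E1, and those selected for low scores a negative E1. Since E2 is unrelated to E1, their October scores drift back toward the overall mean even though nobody's true score changed.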
What is E? E is error. Maybe the night before the September test, Johnny got in a big fight with his mom, slept badly, skipped breakfast, and took the test sleep-deprived and hungry. So, Johnny tested badly in September, and E1 was large and negative. In October, he didn’t have these problems, and E2 was close to 0. So Johnny’s O2 will be higher than O1, if T stayed the same.
This happens on ANY type of measurement, unless measurement is perfect. And, with humans, measurement is never perfect. It happens even if the test is absolutely fantastic, but it’s worse if the test is poor, because then there is more error. That is, if there were some measurement that had zero error, there would be no regression to the mean, and the higher the error, the greater the regression to the mean.
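The link between the amount of error and the amount of regression can be checked numerically. Under the O = T + E model with independent errors, the slope of the second score on the first should be close to var(T) / (var(T) + var(E)): a slope of 1 means no regression to the mean, and smaller slopes mean more. The sketch below is only an illustration; the true-score standard deviation of 10 and the list of error sizes are assumptions.

```python
import random


def retest_slope(error_sd, n=20000, true_sd=10.0, seed=3):
    """Least-squares slope of the second score on the first, O = T + E."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        t = rng.gauss(0, true_sd)  # true score, same for both measurements
        first.append(t + rng.gauss(0, error_sd))
        second.append(t + rng.gauss(0, error_sd))
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    covariance = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    variance = sum((a - mean_first) ** 2 for a in first)
    return covariance / variance


slopes = {}
for error_sd in (0.0, 2.0, 10.0, 20.0):
    slopes[error_sd] = retest_slope(error_sd)
    expected = 10.0 ** 2 / (10.0 ** 2 + error_sd ** 2)  # var(T) / (var(T) + var(E))
    print(f"error sd {error_sd:>4}: slope {slopes[error_sd]:.3f} (theory {expected:.3f})")
```

An error standard deviation of 0 corresponds to a perfect measurement, and the slope is 1. As the error grows relative to the spread of true scores, the slope shrinks and extreme first scores are followed by second scores that sit closer to the mean.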
Solution to puzzle about height and regression to the mean
So, why isn’t everyone the same height? Why aren’t October scores all closer together than September scores?
Because, in addition to regression to the mean, there is also what might be called regression to the extreme. The children of people of average height tend to be farther from average. And if your error in September is 0, it probably won’t be 0 in October.
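Both halves of this answer can be seen in a simulation of the test-score model. The sketch below assumes the same illustrative numbers as before (normal true scores and errors, each with a standard deviation of 10 points); the function name and the one-point "middle" band are hypothetical choices.

```python
import random
import statistics


def spread_check(n=20000, true_sd=10.0, error_sd=10.0, band=1.0, seed=4):
    """Compare overall spread, and follow the students nearest the September mean."""
    rng = random.Random(seed)
    september, october = [], []
    for _ in range(n):
        t = rng.gauss(0, true_sd)  # true score, same in both months
        september.append(t + rng.gauss(0, error_sd))
        october.append(t + rng.gauss(0, error_sd))
    sep_mean = statistics.mean(september)
    oct_mean = statistics.mean(october)
    # Students who scored within `band` points of the September mean.
    middle = [i for i, s in enumerate(september) if abs(s - sep_mean) < band]
    return {
        "sd_september": statistics.stdev(september),
        "sd_october": statistics.stdev(october),
        "middle_september": statistics.mean([abs(september[i] - sep_mean) for i in middle]),
        "middle_october": statistics.mean([abs(october[i] - oct_mean) for i in middle]),
    }


stats = spread_check()
print(f"spread of all scores: September {stats['sd_september']:.2f}, October {stats['sd_october']:.2f}")
print(f"students near the September mean, average distance from the mean: "
      f"September {stats['middle_september']:.2f}, October {stats['middle_october']:.2f}")
```

The overall spread should be about the same in both months, while the students who sat almost exactly at the mean in September end up, on average, noticeably farther from it in October. The movement away from the middle balances the movement toward it.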
Comparison of the two examples of regression to the mean
In the first example (height) we looked at two generations, and at a very precise measure. But while height can be measured quite accurately, height of parents is not a perfect predictor of the height of children. Here, the “true score” could be called “familial height” and parental and child height would each be measurements of it, and each would have error. In the second example we looked at the same people twice, and the observed score was a measurement of the true score, but it had error.