Why does regression to the mean happen?

To illustrate the statistical methods used to detect and control for RTM, we used a random subset of measurements of serum betacarotene from the Nambour Skin Cancer Prevention Trial. The effect of the betacarotene supplement on serum levels was investigated in a random sub-sample of trial participants, who provided a blood sample at the start of the trial in February and another at the end of the supplementation period in July (unpublished data). Because the raw measurements were skewed, we log-transformed the data to make them approximately Normally distributed.

In the analyses presented here we are interested in whether the supplements increased betacarotene levels. One should assume that RTM has taken place unless the data show otherwise. The initial examination of the data should include a scatterplot of change (follow-up minus baseline) against baseline measurements, which can help identify the magnitude of the RTM effect. An example scatterplot is shown in Figure 3 for the log-transformed betacarotene data from the Nambour Skin Cancer Prevention Trial.

The solid line represents perfect agreement (i.e. no change). The dotted lines were obtained by linear regression of the change values on the baseline values, including a group covariate; the higher line is for the treatment group, and the distance between the regression lines indicates a possible treatment effect. Some RTM is apparent in the plots: subjects whose baseline results were unusually low have tended to increase (so that their change values lie above the solid line), and subjects whose baseline results were unusually high have tended to decrease (so that their change values lie below the solid line).

This pattern is clearer in the placebo group, where there was less change in the group mean between the measurement times. The solid line represents perfect agreement (no change) and the dotted lines are fitted regression lines for the treatment and placebo groups. The effect of RTM can be reduced by good study design.

We describe two such designs below. If subjects are randomly allocated to comparison groups, the responses from all groups should be equally affected by RTM. With two groups, placebo and treatment, the mean change in the placebo group provides an estimate of the change caused by RTM plus any placebo effect. The difference between the mean change in the treatment group and the mean change in the placebo group is then the estimate of the treatment effect after adjusting for RTM.
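The logic of the placebo-adjusted estimate can be sketched in a small simulation. All numbers here (the 0.5 treatment effect, the cut-off, the variances) are illustrative assumptions, not trial data:

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.5   # assumed treatment effect (illustrative)
CUTOFF = 1.0        # subjects enrolled only if baseline exceeds this

def subject():
    true_mean = random.gauss(0, 1)                # between-subject variation
    baseline = true_mean + random.gauss(0, 1)     # within-subject (measurement) error
    follow_up = true_mean + random.gauss(0, 1)
    return baseline, follow_up

# Enrol subjects whose baseline exceeds the cut-off, then randomise to groups.
placebo, treatment = [], []
while len(placebo) < 5000 or len(treatment) < 5000:
    b, f = subject()
    if b <= CUTOFF:
        continue
    if random.random() < 0.5:
        if len(placebo) < 5000:
            placebo.append(f - b)
    else:
        if len(treatment) < 5000:
            treatment.append(f - b + TRUE_EFFECT)

mean_placebo = statistics.mean(placebo)    # RTM (plus any placebo effect): clearly negative
mean_treat = statistics.mean(treatment)    # RTM plus the treatment effect
adjusted = mean_treat - mean_placebo       # RTM cancels, leaving the treatment effect
print(round(mean_placebo, 2), round(adjusted, 2))
```

Because both groups were selected with the same cut-off, the apparent fall in the placebo group estimates the RTM effect, and subtracting it recovers something close to the assumed 0.5.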

The effect of RTM increases with larger measurement variability (see Equation 1). To reduce this variability we can select subjects using two or more baseline measurements.

The study selection criterion can then be based on the mean of each subject's baseline measurements rather than on a single measurement. This method can be thought of as an attempt to get a better estimate of each subject's true mean before the intervention. The advantages of taking extra measurements are that they give better estimates of both the mean and the within-subject variation. The reduction in the RTM effect is biggest between the first and second measurements; the benefit of each additional baseline measurement after that decreases. Figure: an example of the reduction in the regression to the mean (RTM) effect due to taking multiple baseline measurements and using each subject's mean as the selection variable.
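A quick simulation (illustrative values, not the trial data) shows the shrinking RTM effect as more baseline measurements enter the selection mean:

```python
import random
import statistics

random.seed(2)

def rtm_effect(k, n=20000, cutoff=1.0):
    """Mean apparent fall at follow-up when selection uses the mean of k baselines."""
    drops = []
    for _ in range(n):
        true_mean = random.gauss(0, 1)
        baselines = [true_mean + random.gauss(0, 1) for _ in range(k)]
        sel = statistics.mean(baselines)            # selection variable
        if sel > cutoff:
            follow_up = true_mean + random.gauss(0, 1)
            drops.append(sel - follow_up)           # apparent fall = RTM effect
    return statistics.mean(drops)

one, two, four = rtm_effect(1), rtm_effect(2), rtm_effect(4)
print(one > two > four)   # the RTM effect shrinks as baselines are added
```

The biggest drop comes from adding the second baseline, matching the diminishing returns described above.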

Many different methods have been proposed to estimate the size of the RTM effect and to adjust observed measurements for RTM. If we know, or can estimate, the mean and standard deviation of the population distribution and the within-subject standard deviation, then we can estimate the RTM effect using Equation 1 (or Equation 2 for multiple baseline measurements).

This value can then be subtracted from the observed change to give an adjusted estimate. ANCOVA can also be used with the change between baseline and follow-up as the outcome variable; the only difference from Equation 3 is that the regression coefficient, a, for the centred baseline value is decreased by one unit.
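As a sketch of the ANCOVA adjustment (not the authors' actual commands, which are on their web site), the pooled within-group baseline slope can be computed by hand; the data and the 0.5 effect are simulated assumptions:

```python
import random
import statistics

random.seed(3)

def simulate(effect, n=4000):
    """Baseline and follow-up, each = true value + measurement error (+ effect)."""
    base, foll = [], []
    for _ in range(n):
        true = random.gauss(0, 1)
        base.append(true + random.gauss(0, 1))
        foll.append(true + random.gauss(0, 1) + effect)
    return base, foll

placebo_b, placebo_f = simulate(0.0)
treat_b, treat_f = simulate(0.5)          # 0.5 is an assumed true effect

def sums(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, sxy

# Pooled within-group slope of follow-up on (centred) baseline.
sxx_p, sxy_p = sums(placebo_b, placebo_f)
sxx_t, sxy_t = sums(treat_b, treat_f)
slope = (sxy_p + sxy_t) / (sxx_p + sxx_t)

# ANCOVA-adjusted treatment effect: difference in follow-up means,
# corrected for any chance difference in baseline means.
adjusted = (statistics.mean(treat_f) - statistics.mean(placebo_f)
            - slope * (statistics.mean(treat_b) - statistics.mean(placebo_b)))

# With change (follow-up minus baseline) as the outcome, the baseline
# coefficient simply drops by one unit, as the text notes:
slope_t = sxy_t / sxx_t
change = [f - b for b, f in zip(treat_b, treat_f)]
sxx_c, sxy_c = sums(treat_b, change)
slope_change = sxy_c / sxx_c              # equals slope_t - 1 algebraically
print(round(adjusted, 2))
```

The identity slope_change = slope_t - 1 holds exactly, which is why modelling change instead of follow-up makes no difference to the adjusted treatment effect.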

As ANCOVA is a special case of the general linear model, it can be performed in most statistical software packages used in epidemiological research. The commands to perform ANCOVA and check the model's adequacy in a number of statistical packages are given on our web site,16 and can also be obtained by contacting the authors. In this example, treatment allocation was random, and hence the study was protected against RTM at the design stage.

Table 1 shows an analysis of the serum betacarotene data from the example data set. Using the full data set (no cut-off), the results show a significant increase in betacarotene in the treatment group and no apparent change in the placebo group. The increase in the treatment group, compared with the placebo group, is 0. For the data with a baseline cut-off, there appears to be a possible increase in the placebo group of 0. This apparent change in the placebo group is, however, consistent with RTM, and hence we conclude that there was no real change in the betacarotene levels in the placebo group.

Analysis of change (follow-up result minus baseline) in log-transformed betacarotene measurements. Table 2 shows the results using ANCOVA: the estimated treatment effect is similar to the paired t-test results, but with narrower confidence intervals, particularly for the subset of data using the cut-off. The narrower intervals are due to the baseline term explaining more of the variance in the outcome in the ANCOVA model. We have highlighted the problem of regression to the mean (RTM) using some simple biological examples where the variable was approximately Normally distributed.

However, RTM is not restricted to biological variables. It will occur in any measurement (biological, psychometric, anthropometric, etc.) that is observed with error. Nor is it restricted to distributions that are Normal, or even to distributions that are continuous. RTM can occur in binary data, where it would cause subjects to change categories without any true change in their underlying response. Using data from a study in which subjects were randomly allocated to groups, t-tests and ANCOVA gave results that were the same when there was no baseline cut-off.

When a cut-off was used, ANCOVA gave narrower confidence intervals for the treatment effect, and the paired t-test showed a change in the placebo group consistent with RTM. RTM occurs in any variable that is subject to random error, and therefore it needs to be ruled out as a cause of an observed change before any other explanation is sought.

It has already caught out many researchers;21 we hope that people who read this article will avoid this mistake. Reduce regression to the mean (RTM) at the design stage: (1) include a randomly allocated placebo group; (2) take multiple baseline measurements, although this is unlikely to completely eliminate the problem. Identify RTM at the analysis stage: (1) examine a scatterplot of change against baseline: is there more change at the tails of the baseline measurements?

References:

Biological variability of cholesterol, triglyceride, low- and high-density lipoprotein cholesterol, lipoprotein(a), and apolipoproteins A-I and B. Clin Chem;40.
Stigler SM. Regression towards the mean, historically considered. Statist Meth Med Res;6.
Chesher A. Non-normal variation and regression to the mean.
Effectiveness and tolerability of a new lipid-altering agent, Gemcabene, in patients with low levels of high-density lipoprotein cholesterol. Am J Cardiol;92.
Blood pressure, stroke, and coronary heart disease. Lancet.
Some effects of within-person variability in epidemiological studies. J Chron Dis;26.
Davis CE. The effect of regression to the mean in epidemiologic and clinical studies. Am J Epidemiol.

The other major factor that affects the amount of regression to the mean is the correlation between the two variables. If the two variables are perfectly correlated (the highest scorer on one is the highest on the other, the next highest on one is the next highest on the other, and so on), there will be no regression to the mean. But this is unlikely ever to occur in practice.

It is only when a measure has no random error (i.e. is perfectly reliable) that we can expect it to correlate perfectly with another measure. You can estimate exactly the percent of regression to the mean in any given situation.

The formula is Prm = 100(1 - r), where Prm is the percent of regression to the mean and r is the correlation between the two measures. When the two variables are perfectly correlated (r = 1), there is no regression to the mean. With a correlation between 0 and 1 there is partial regression: a correlation of .5, for example, implies 50% regression to the mean, and a small correlation implies nearly complete regression. With zero correlation, knowing a score on one measure gives you absolutely no information about the likely score for that person on the other measure. In that case, your best guess for how any person would perform on the second measure is the mean of that second measure. Given this percentage formula, for any given situation we can estimate the regression to the mean.
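The percentage formula is a one-liner; this minimal sketch (the function name is ours) checks the endpoint cases:

```python
def percent_regression(r):
    """Percent of regression to the mean implied by correlation r: Prm = 100(1 - r)."""
    return 100 * (1 - r)

print(percent_regression(1.0))   # 0.0: perfect correlation, no regression
print(percent_regression(0.5))   # 50.0: scores recover half the distance to the mean
print(percent_regression(0.0))   # 100.0: best guess is simply the mean
```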

All we need to know is the mean of the sample on the first measure, the population mean on both measures, and the correlation between the measures. Consider a simple example. Suppose a selected sample has a pretest mean of 30, drawn from a population whose mean on both measures is 50. If there were no regression to the mean, we would predict a posttest score of 30 for the sample. Now assume that the correlation is .5. In this case we would observe a posttest score of 40 for the sampled group (50 + .5 x (30 - 50)), which constitutes a 10-point pseudo-effect, or regression artifact.

Now suppose instead that the program has a true effect, raising scores by 15 points on average. In this case, a sample that had a pretest mean of 30 would be expected to get a posttest mean of 45 (i.e. the pretest mean plus the 15-point effect) if there were no regression to the mean. But here the correlation between pretest and posttest is again .5, and the population posttest mean is 65. That is, we would observe a posttest average of 55 for our sample (65 + .5 x (30 - 50)), again a pseudo-effect of 10 points. Regression to the mean is one of the trickiest threats to validity.
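The worked examples above can be reproduced directly. Note that the shifted population posttest mean of 65 in the second case is our inference from the numbers given, and the function name is ours:

```python
def predicted_posttest(sample_pre_mean, pop_pre_mean, pop_post_mean, r):
    """Expected posttest mean for a selected sample under correlation r."""
    return pop_post_mean + r * (sample_pre_mean - pop_pre_mean)

# No true effect: population mean 50 on both measures, sample pretest mean 30, r = .5.
observed = predicted_posttest(30, 50, 50, 0.5)
print(observed)    # 40.0 -> a 10-point regression artifact

# A real 15-point effect shifts the population posttest mean to 65.
observed2 = predicted_posttest(30, 50, 65, 0.5)
print(observed2)   # 55.0, versus the naive expectation of 30 + 15 = 45
```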

It is subtle in its effects, and even excellent researchers sometimes fail to catch a potential regression artifact. You might want to learn more about the regression to the mean phenomenon. One good way to do that would be to simulate the phenomenon. If you already understand the basic idea of simulation, you can do a manual dice rolling simulation of regression artifacts or a computerized simulation of regression artifacts.

Unless we account for this natural variation, we are prone to be disappointed. When Kahneman was giving a lecture to the Israeli Air Force about the psychology of effective training, one of the officers shared his experience that extending praise to his subordinates led to worse performance, whereas scolding led to an improvement in subsequent efforts.

As a consequence, he had grown to be generous with negative feedback and had become rather wary of giving too much praise. Kahneman immediately spotted that it was regression to the mean at work.

He illustrated the misconception with a simple exercise you may want to try yourself. He drew a circle on a blackboard and then asked the officers, one by one, to throw a piece of chalk at the center of the circle with their backs to the blackboard. Naturally, those who did incredibly well on the first try tended to do worse on their second try, and vice versa. The fallacy immediately became clear: the change in performance occurred naturally. That again is not to say that feedback does not matter at all; maybe it does, but the officer had no evidence to conclude it did.
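The chalk exercise is easy to simulate when performance is pure luck (an extreme assumption, chosen for illustration):

```python
import random
import statistics

random.seed(4)

N = 10000
# Pure-luck scores: no skill component at all.
first = [random.gauss(0, 1) for _ in range(N)]
second = [random.gauss(0, 1) for _ in range(N)]

# The "stars": top 10% on the first throw.
cutoff = sorted(first)[-N // 10]
stars_first = [f for f in first if f >= cutoff]
stars_second = [s for f, s in zip(first, second) if f >= cutoff]

print(round(statistics.mean(stars_first), 2))   # well above average
print(round(statistics.mean(stars_second), 2))  # back near the overall mean of 0
```

Praised after the first throw or not, the stars regress toward the mean on the second, which is exactly what the officer misread as the effect of feedback.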

At this point, you might be wondering why regression to the mean happens and how we can make sure we are aware of it when it occurs. The correlation coefficient between two measures, which varies between -1 and 1, is a measure of the relative weight of the factors they share. For example, two phenomena with few shared factors, such as bottled-water consumption versus suicide rate, should have a correlation coefficient close to 0.

That is to say, if we looked at all countries in the world and plotted the suicide rate of a specific year against per capita consumption of bottled water, the plot would show no pattern at all. By contrast, there are measures that depend solely on the same factor. A good example of this is temperature. The only factor determining temperature (the velocity of molecules) is shared by all scales, hence each degree in Celsius has exactly one corresponding value in Fahrenheit.

Therefore temperature in Celsius and Fahrenheit will have a correlation coefficient of 1 and the plot will be a straight line. There are few, if any, phenomena in the human sciences with a correlation coefficient of 1. There are, however, plenty where the association is weak to moderate and there is some explanatory power between the two phenomena.

Consider the correlation between height and weight, which would land somewhere between 0 and 1. While virtually every three-year-old will be lighter and shorter than every grown man, not all grown men or three-year-olds of the same height will weigh the same.

This variation, and the correspondingly lower degree of correlation, implies that while height is generally speaking a good predictor, there clearly are factors other than height at play. When the correlation between two measures is less than perfect, we must watch out for the effects of regression to the mean. Kahneman observed a general rule: whenever the correlation between two scores is imperfect, there will be regression to the mean.

This at first might seem confusing and not very intuitive, but the degree of regression to the mean is directly related to the degree of correlation of the variables.

This effect can be illustrated with a simple example. Assume you are at a party and ask why it is that highly intelligent women tend to marry men who are less intelligent than they are. Most people, even those with some training in statistics, will quickly jump in with a variety of causal explanations, ranging from avoidance of competition to the fears of loneliness that these women face.

A topic of such controversy is likely to stir up a great debate. Now, what if we asked why the correlation between the intelligence scores of spouses is less than perfect? This question is hardly as interesting, and there is little to guess: we all know it to be true. The paradox lies in the fact that the two questions are algebraically equivalent. Kahneman explains: "The observed regression to the mean cannot be more interesting or more explainable than the imperfect correlation."


