I. What it is and where to find it
A. Variance in Y changes with levels of one or more independent variables.
B. It is often a problem in time series data and when a measure is aggregated over individuals.
1) Example: average college expenses measured by sampling 1% of the students at each of several institutions that differ in size. Because the number of students sampled, n, grows with institution size, and because the average of n observations has variance $\sigma^2/n$, the variance of average college expenses shrinks as institution size grows.
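The shrinking variance can be checked with a small simulation. The sketch below is illustrative only: the population mean and standard deviation of expenses, the institution sizes, and the helper name `variance_of_sample_mean` are assumptions, not values from the example.

```python
import random
import statistics


def variance_of_sample_mean(n, sigma=2000.0, mu=10000.0, reps=2000, seed=0):
    """Estimate Var(sample mean) for samples of size n by repeated sampling."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.pvariance(means)


# Hypothetical institution sizes; a 1% sample gives n = size / 100.
for size in (1000, 5000, 20000):
    n = size // 100
    simulated = variance_of_sample_mean(n)
    theoretical = 2000.0 ** 2 / n  # sigma^2 / n
    print(f"size={size:6d}  n={n:4d}  simulated={simulated:10.0f}  theory={theoretical:10.0f}")
```

The simulated variance of the mean should track $\sigma^2/n$ closely, so larger institutions yield less variable averages.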
II. How to know you have it
A. Plot the data
B. Plot the residuals
C. With a categorical independent variable, one can perform a test for the homogeneity of variance (e.g., Box’s test; cf. Winer, 1971).
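When the predictor is continuous, a rough numerical companion to the residual plot is to compare the spread of the OLS residuals at low and high values of X. The sketch below is not Box's test; it is a simple split-sample variance ratio on simulated data, and the data-generating values and function names are assumptions made for illustration.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) of the ordinary least squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_variance_ratio(x, y):
    """Variance of OLS residuals in the upper half of x over the lower half."""
    a, b = ols(x, y)
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    ordered = [r for _, r in sorted(zip(x, residuals))]  # residuals sorted by x
    half = len(ordered) // 2
    return statistics.variance(ordered[half:]) / statistics.variance(ordered[:half])


rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(400)]
# Simulated heteroscedastic errors: standard deviation proportional to x.
y_het = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
# Simulated homoscedastic errors for comparison.
y_hom = [2 + 3 * xi + rng.gauss(0, 2.0) for xi in x]

print("heteroscedastic ratio:", round(residual_variance_ratio(x, y_het), 2))
print("homoscedastic ratio:  ", round(residual_variance_ratio(x, y_hom), 2))
```

A ratio well above 1 suggests that the residual variance grows with X; a ratio near 1 is consistent with homoscedasticity. This is a screening aid, not a formal test.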
III. What to do about it
A. Conceptually, one might want to give observations with greater variance less weight because they give a less precise indication of where the regression line lies.
B. Instead of minimizing $\sum_i (y_i - a - b x_i)^2$, minimize
$\sum_i (y_i - a - b x_i)^2 / \sigma_i^2$
This is called weighted least squares because the ordinary least squares (OLS) expression is “weighted” (by the inverse of the variance). Note that when $\sigma_i^2 = \sigma^2$, that is, when the variances are all equal (homoscedastic), this expression gives the OLS solution for a and b. In the heteroscedastic case, with normally distributed errors, minimizing it gives the maximum likelihood estimates (MLE) of a and b.
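Both claims can be seen in a line each, writing $\sigma_i^2$ for the variance of observation i. The sketch below assumes independent, normally distributed errors with known $\sigma_i$; the second term of the log-likelihood does not involve a or b, so maximizing the likelihood is the same as minimizing the weighted sum of squares.

```latex
% Equal variances: the weight factors out, so the minimizer is the OLS one.
\sum_i \frac{(y_i - a - b x_i)^2}{\sigma^2}
  = \frac{1}{\sigma^2} \sum_i (y_i - a - b x_i)^2

% Normal errors with known \sigma_i: the log-likelihood is
\log L(a, b)
  = -\frac{1}{2} \sum_i \frac{(y_i - a - b x_i)^2}{\sigma_i^2}
    - \sum_i \log\!\left(\sigma_i \sqrt{2\pi}\right)
```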
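C. In general (for example, when the $\sigma_i$ are unknown and must be estimated along with a and b), it is not possible to solve this minimization directly, and one must rely on computer programs that find the minimum by iterative fitting algorithms.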
D. However, there is a simple solution whenever $\sigma_i$ is proportional to the values of a variable (e.g., $x_i$), i.e., whenever $\sigma_i = k x_i$. In this case, one can obtain the weighted least squares solution by minimizing
$\sum_i (y_i - a - b x_i)^2 / (k^2 x_i^2)$
Because the constant multiplier $1/k^2$ does not affect the location of the minimum, one can find the appropriate estimates of a and b by minimizing:
$\sum_i (y_i - a - b x_i)^2 / x_i^2 = \sum_i (y_i/x_i - a/x_i - b)^2$
Therefore, weighted least squares estimates of the regression parameters can be obtained by performing an ordinary least squares regression on the transformed variables obtained by dividing the original variables by $x_i$:
$y_i/x_i = a(1/x_i) + b + e_i/x_i$
Note that the constant in this equation (b) corresponds to the regression coefficient for $x_i$ in the original model, and that the regression coefficient for the new independent variable ($1/x_i$) corresponds to the constant term in the original equation. Also, note that since the residuals are in effect also divided by $x_i$, the transformed residuals $e_i/x_i$ will be normally distributed with constant variance if the original $e_i$ are normal with standard deviations proportional to $x_i$, as assumed.
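The equivalence between the transformed OLS regression and weighted least squares with weights $1/x_i^2$ can be checked numerically. The following is a minimal sketch on simulated data; the true coefficients, the error scale, and the helper names `ols` and `wls_direct` are assumptions made for illustration.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) of the ordinary least squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def wls_direct(x, y, w):
    """Minimize sum w_i (y_i - a - b x_i)^2 using weighted means and sums."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(2)
x = [rng.uniform(1, 10) for _ in range(200)]
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]  # error sd proportional to x

# OLS on the transformed variables: regress y/x on 1/x.
inv_x = [1 / xi for xi in x]
y_over_x = [yi / xi for xi, yi in zip(x, y)]
const_t, coef_t = ols(inv_x, y_over_x)
a_hat, b_hat = coef_t, const_t  # roles are switched relative to the original model

# Direct WLS with weights 1 / x_i^2 (the constant 1/k^2 factor is dropped).
a_wls, b_wls = wls_direct(x, y, [1 / xi ** 2 for xi in x])

print("transformed OLS:", round(a_hat, 4), round(b_hat, 4))
print("direct WLS:     ", round(a_wls, 4), round(b_wls, 4))
```

The two routes should agree to floating-point precision, because both minimize the same weighted sum of squares.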
IV. Example: Airline transport accidents predicted by proportion of all flights flown by airline.
This gives the WLS solution:
Number of incidents = -.883 + 73.122 × p(total flights)
Recall that the coefficients for the constant and the predictor are switched in the transformed regression. The $R^2$ for this model can be obtained by squaring the correlation between the estimated and actual number of incidents: $(.698)^2 = .487$. The variable statistics can be obtained from the above results (remembering that the coefficient labeled constant is the coefficient for the independent variable). Notice that the t value for the independent variable has increased slightly, reflecting the added precision in this model.
D. The plot of the residuals indicates that the heteroscedasticity problem has disappeared.
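The bookkeeping in this example (switching the two coefficients and squaring the correlation) can be written out as a short sketch. The coefficients and the correlation are the values reported above; the assignment of 73.122 to the transformed constant follows from the switch described in the text, and the proportion 0.10 is a hypothetical input, not a data point from the example.

```python
# Values implied by the reported WLS equation for the transformed regression.
constant_transformed = 73.122  # constant term of the regression of Y/X on 1/X
coef_inverse_x = -0.883        # coefficient on the new predictor 1/X

# Switch roles to recover the equation on the original scale.
intercept, slope = coef_inverse_x, constant_transformed


def predicted_incidents(p_flights):
    """Predicted number of incidents for an airline's proportion of all flights."""
    return intercept + slope * p_flights


r = 0.698                   # correlation between estimated and actual incidents
r_squared = r ** 2
print(round(r_squared, 3))  # 0.487
print(round(predicted_incidents(0.10), 3))  # 0.10 is a hypothetical proportion
```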
V. Multivariate Weighted Least Squares
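A. Recall that the ordinary least squares solution is:
$B = (X'X)^{-1}X'Y$
The WLS solution is $B = (X'U^{-1}X)^{-1}X'U^{-1}Y$, where
$U = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2)$ and $U^{-1} = \mathrm{diag}(1/\sigma_1^2, 1/\sigma_2^2, \ldots, 1/\sigma_n^2)$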
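That is, the ordinary least squares solution is weighted by the inverse of the variances. Premultiplying the regression equation by $U^{-1}$ gives $U^{-1}Y = U^{-1}XB + U^{-1}e$; premultiplying again by $X'$ and setting $X'U^{-1}e = 0$ yields the normal equations $X'U^{-1}Y = X'U^{-1}XB$ behind this solution.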
Note that one would obtain the same result if one multiplied the original regression equation by D, where
$D = \mathrm{diag}(1/\sigma_1, 1/\sigma_2, \ldots, 1/\sigma_n)$
This would yield the solution $B = [(DX)'(DX)]^{-1}(DX)'(DY)$
Because $D'D = U^{-1}$, this solution is identical to the one above.
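The identity can be written out step by step, taking D to be the diagonal matrix with entries $1/\sigma_i$ (so that $D' = D$ and $D'D$ has diagonal entries $1/\sigma_i^2$):

```latex
\begin{aligned}
B &= [(DX)'(DX)]^{-1}(DX)'(DY) \\
  &= (X'D'DX)^{-1}X'D'DY \\
  &= (X'U^{-1}X)^{-1}X'U^{-1}Y,
  \qquad \text{since } D'D = \operatorname{diag}(1/\sigma_1^2, \ldots, 1/\sigma_n^2) = U^{-1}.
\end{aligned}
```

In the one-predictor case with $\sigma_i = k x_i$, each row $(1, x_i)$ of X becomes $(1/(k x_i), 1/k)$ in DX, which is the divide-by-$x_i$ transformation of Section III up to the constant factor $1/k$.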