Frisch-Waugh-Lovell Theorem Guide: Two-Stage Regression & Measurement Error

Introduction to the Frisch-Waugh-Lovell Theorem

The Frisch-Waugh-Lovell (FWL) theorem is a cornerstone of econometric theory, especially in multiple regression analysis. It states that the coefficient on a regressor in a multiple regression can be obtained by first regressing that regressor on all the other regressors, taking the residuals, and then regressing the dependent variable on those residuals. This tutorial walks through the FWL theorem with two regressors, as outlined in the assignment. We'll cover matrix inversion, partialling out, and the impact of measurement error, with analogies drawn from AI model training and fantasy sports analytics.

Setting Up the Two-Regressor Model

Consider the regression model: yᵢ = β₁x₁ᵢ + β₂x₂ᵢ + εᵢ. In vector notation: Y = β₁X₁ + β₂X₂ + ε. The design matrix X has columns X₁ and X₂. The OLS estimator is β̂ = (X'X)⁻¹X'Y. For two regressors, the matrix X'X is 2×2:

X'X = [[X₁'X₁, X₁'X₂],
       [X₁'X₂, X₂'X₂]]

Each entry is a scalar inner product. To find β̂, we invert this matrix. For simplicity, assume units are chosen so that X₁'X₁ = 1 and X₂'X₂ = 1. This normalization is without loss of generality: rescaling a regressor by a constant rescales its coefficient by the inverse of that constant, but leaves the fitted values and residuals unchanged. The inverse of X'X is:

(X'X)⁻¹ = (1/(1 - r²)) [[1, -r], [-r, 1]]

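where r = X₁'X₂ is the correlation between the regressors (the sample correlation if both regressors have also been demeaned, and the uncentered correlation otherwise). We need |r| < 1 for the inverse to exist. Then: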

β̂₁ = (X₁'Y - r X₂'Y) / (1 - r²)
β̂₂ = (X₂'Y - r X₁'Y) / (1 - r²)

This is the direct OLS solution.
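
As a quick numerical check of the closed-form expressions above, the sketch below compares them with a generic least-squares solve. The data are simulated purely for illustration (the coefficients 1.5 and -0.8, the sample size, and the variable names are assumptions, not part of the assignment).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical simulated data; the regressors are correlated by construction.
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.5 * x1 - 0.8 * x2 + rng.normal(size=n)

# Rescale each regressor to unit length so that X1'X1 = X2'X2 = 1.
x1 = x1 / np.linalg.norm(x1)
x2 = x2 / np.linalg.norm(x2)
r = x1 @ x2  # off-diagonal entry of X'X

# Closed-form solution from inverting the 2x2 matrix X'X.
b1 = (x1 @ y - r * (x2 @ y)) / (1 - r**2)
b2 = (x2 @ y - r * (x1 @ y)) / (1 - r**2)
closed_form = np.array([b1, b2])

# Generic least-squares solution for comparison.
X = np.column_stack([x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(closed_form, beta_hat))
```

Because the regressors were rescaled after y was generated, the estimates are on the rescaled units and will not be close to 1.5 and -0.8; the point is only that the two computations agree.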

The FWL Theorem: Partialling Out X₂

Now apply the FWL theorem. First, regress X₂ on X₁: X₂ = X₁ξ + error. The OLS estimate is ξ̂ = (X₁'X₁)⁻¹ X₁'X₂ = r (since X₁'X₁ = 1). The fitted values are P₁X₂ = rX₁, and the residuals are M₁X₂ = X₂ - P₁X₂ = X₂ - rX₁. Next, regress Y on these residuals: Y = (M₁X₂)β₂ + error. The coefficient is:

β̂₂* = ((M₁X₂)' (M₁X₂))⁻¹ (M₁X₂)'Y

Expanding M₁X₂ = X₂ - rX₁ and using X₁'X₁ = X₂'X₂ = 1 and X₁'X₂ = r, we have (M₁X₂)'(M₁X₂) = 1 - 2r² + r² = 1 - r² and (M₁X₂)'Y = X₂'Y - rX₁'Y. Thus:

β̂₂* = (X₂'Y - rX₁'Y) / (1 - r²)

This is exactly the same as β̂₂ from the original regression. The FWL theorem is verified.
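
The two-stage calculation can also be checked numerically. The following is a minimal sketch on simulated data (all numbers and variable names are illustrative assumptions): it builds the residual M₁X₂ = X₂ - rX₁, regresses Y on it, and compares the result with the X₂ coefficient from the full regression.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical simulated data.
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Normalize so that X1'X1 = X2'X2 = 1, as in the text.
x1 = x1 / np.linalg.norm(x1)
x2 = x2 / np.linalg.norm(x2)
r = x1 @ x2

# Stage 1: residual of X2 after projecting on X1, i.e. M1 X2 = X2 - r X1.
m1x2 = x2 - r * x1

# Stage 2: regress Y on that residual (single regressor, no intercept).
b2_fwl = (m1x2 @ y) / (m1x2 @ m1x2)

# Full two-regressor OLS for comparison.
X = np.column_stack([x1, x2])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.isclose(b2_fwl, b_full[1]))         # same coefficient on X2
print(np.isclose(m1x2 @ m1x2, 1 - r**2))     # denominator equals 1 - r^2
```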

Application to Real Data: Consumption and Income Growth

In the computer exercise, you regress consumption growth on income growth and the interest rate. Let's denote Y = consumption growth, X₁ = income growth, X₂ = interest rate. The steps are:

Step a: Regress income growth on the interest rate: X₁ = X₂δ + error. Obtain residuals MrY = X₁ - X₂δ̂, that is, income growth with the interest rate partialled out.
Step b: Regress consumption growth on MrY. The coefficient on MrY equals the coefficient on income growth from the original multiple regression.
Step c: Regress consumption growth on the interest rate: Y = X₂γ + error. Obtain residuals MrC = Y - X₂γ̂, that is, consumption growth with the interest rate partialled out.
Step d: Regress MrC on MrY. The coefficient here also equals the original income coefficient. This demonstrates that partialling out the interest rate from X₁ alone, or from both Y and X₁, yields the same coefficient.
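
Steps a through d can be sketched in Python as follows. This is not the exercise's dataset: the series are simulated stand-ins, the helper functions `ols` and `residuals` are hypothetical names introduced here, and no intercept is included, to match the model written above.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients of y on the columns of X (no intercept is added)."""
    X = np.asarray(X, dtype=float)
    X = X.reshape(len(X), -1)  # treat a 1-D regressor as a single column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def residuals(X, y):
    """Residuals from regressing y on the columns of X."""
    X = np.asarray(X, dtype=float)
    X = X.reshape(len(X), -1)
    return y - X @ ols(X, y)

rng = np.random.default_rng(2)
n = 300

# Simulated stand-ins for the exercise's series (illustrative values only).
interest = rng.normal(size=n)                      # X2
income = 0.4 * interest + rng.normal(size=n)       # X1
consumption = 0.7 * income - 0.2 * interest + rng.normal(scale=0.5, size=n)  # Y

# Original multiple regression: coefficient on income growth.
b_income_full = ols(np.column_stack([income, interest]), consumption)[0]

mr_y = residuals(interest, income)        # Step a: income with interest partialled out
b_step_b = ols(mr_y, consumption)[0]      # Step b: consumption on MrY
mr_c = residuals(interest, consumption)   # Step c: consumption with interest partialled out
b_step_d = ols(mr_y, mr_c)[0]             # Step d: MrC on MrY

print(b_income_full, b_step_b, b_step_d)  # the three values coincide up to rounding
```

If the original regression includes a constant, the same identity holds provided the constant is included in each auxiliary regression as well.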

This is analogous to how fantasy sports analysts isolate a player's performance by controlling for team strength or opponent quality. For example, to find a quarterback's true contribution, you might regress team points on opponent defense strength first, then use the residuals.

Measurement Error and Attenuation Bias

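Classical measurement error in a regressor causes attenuation bias, pushing its coefficient toward zero. In the exercise, you add iid mean-zero noise to income growth: Y* = X₁ + v, where v is the measurement error and Y* denotes mismeasured income growth (not the dependent variable Y). Then regress consumption growth on Y* and the interest rate. The coefficient on the mismeasured income growth shrinks toward zero. The interest rate coefficient generally changes as well: when the interest rate is correlated with true income growth, it absorbs part of the income effect that the noisy regressor no longer captures, so it becomes biased too, even though the measurement error itself is uncorrelated with the interest rate. It is unaffected in large samples only when the interest rate and true income growth are uncorrelated.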
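
For reference, the large-sample behavior of the two coefficients can be written out explicitly. The expressions below are a sketch under assumptions that go beyond the text: classical measurement error (v uncorrelated with X₁, X₂, and ε), variables measured as deviations from their means, and population moments in place of sample ones. The symbols δ, λ, and x̃₁ are introduced here for illustration.

```latex
\begin{aligned}
\tilde{x}_1 &= x_1 - \delta x_2,
  \qquad \delta = \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_2)}, \\
\lambda &= \frac{\operatorname{Var}(\tilde{x}_1)}{\operatorname{Var}(\tilde{x}_1) + \sigma_v^2}, \\
\operatorname{plim}\, \hat{\beta}_1 &= \lambda\, \beta_1, \\
\operatorname{plim}\, \hat{\beta}_2 &= \beta_2 + (1 - \lambda)\, \delta\, \beta_1 .
\end{aligned}
```

Since 0 < λ ≤ 1, the income coefficient is scaled toward zero, and the interest rate coefficient is shifted unless δ = 0 or σ_v² = 0. The first result follows from FWL: partialling X₂ out of the noisy regressor leaves x̃₁ + v, whose variance is inflated by σ_v² while its covariance with Y is not.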

With larger measurement error variance, the attenuation becomes more severe. This is similar to using noisy sensor data in AI training: if input features are corrupted, the model's learned coefficients are less reliable. In sports analytics, if a player's stats are poorly recorded, their estimated impact on wins diminishes.

Practical Implementation Tips

When implementing in MATLAB or Python, use matrix operations to verify the FWL theorem. For the measurement error part, generate random normal errors with increasing variance and observe the changes in coefficients. This reinforces the concept of consistency and bias in OLS.
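
A minimal Python sketch of the measurement error experiment described above follows. The data-generating process (true coefficients 0.7 and -0.2, the correlation between the regressors, and the grid of noise standard deviations) is an assumption chosen for illustration, not the exercise's data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
beta1, beta2 = 0.7, -0.2  # assumed true coefficients for the simulation

# Simulated stand-ins: interest rate (X2) and income growth (X1), correlated.
interest = rng.normal(size=n)
income = 0.5 * interest + rng.normal(size=n)
consumption = beta1 * income + beta2 * interest + rng.normal(scale=0.5, size=n)

results = {}
for noise_sd in [0.0, 0.5, 1.0, 2.0]:
    # Add iid mean-zero normal noise to income growth.
    noisy_income = income + rng.normal(scale=noise_sd, size=n)
    X = np.column_stack([noisy_income, interest])
    coef, *_ = np.linalg.lstsq(X, consumption, rcond=None)
    results[noise_sd] = coef
    print(f"noise sd {noise_sd:.1f}: income coef {coef[0]:.3f}, "
          f"interest coef {coef[1]:.3f}")
```

In this simulated design, income growth has unit variance once the interest rate is partialled out, so the income coefficient should be close to 0.7 / (1 + noise_sd²) in a large sample. The interest rate coefficient drifts away from -0.2 as the noise grows, because the two regressors are correlated.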

Conclusion

The Frisch-Waugh-Lovell theorem is not just a theoretical curiosity; it underpins many econometric techniques, such as the within transformation in fixed effects estimation and the partialling out of exogenous controls in instrumental variables estimation. By mastering this two-regressor case, you build intuition for more complex models. Whether you're analyzing economic data, training machine learning models, or evaluating player performance, the ability to partial out control variables is indispensable.