3.3 — Omitted Variable Bias — Class Notes

Contents

Tuesday, October 13, 2020

Overview

Today we return to our regression models, now knowing something about identifying causal effects. We know from DAGs that we often need to “adjust for” or “control for variables” in order to identify the causal effect we are interested in. Now we give a particular name and set of conditions for when we need to control a variable: .b[“omitted variable bias”], where some variable both causes $$Y$$ (is in $$u)$$, and is correlated with $$X$$. To avoid introducing the bias, we now include it as an additional independent variable in our regression.

Thus, we now begin exploring multivariate regression with multiple regressors:

$Y_i=\beta_0+\beta_1 X_{1i}+ \beta_2 X_{2i} + u_i$

We continue the extended example about class sizes and test scores, which comes from a (Stata) dataset from an old textbook that I used to use, Stock and Watson, 2007. Download and follow along with the data from today’s example:Note this is a .dta Stata file. You will need to (install and) load the package haven to read_dta() Stata files into a dataframe.

I have also made a RStudio Cloud project documenting all of the things we have been doing with this data that may help you when you start working with regressions:

Midterm exam corrections are due to me by an emailed PDF by 11:59 PM Sunday October 18. You may redo any question you did not get full points on (do not do questions you did not lose points on), including bonuses. Write the correct answer and explain why it’s the right answer (i.e. show your work, don’t just write $$\hat{\beta_1}$$ when you wrote $$\hat{\beta_1}$$ on the exam.) I want you to demonstrate you are internalizing the answers and learning, not just comparing with your friends to get the correct answer. You can talk to each othjer now, and are welcome to come to my (and the TAs’) office hours to go over the exam together.
The live class Zoom meeting link can be found on Blackboard (see LIVE ZOOM MEETINGS on the left navigation menu), starting at 11:30 AM.
If you are unable to join today’s live session, or if you want to review, you can find the recording stored on Blackboard via Panopto (see Class Recordings on the left navigation menu).