In the previous note, we discussed the notion of causality, gave a quick refresher on functions, and then introduced the potential outcome function, which gave us a mathematically precise way to express our typical quantity of interest, the average treatment effect:
$$ \mathbb{E}[\tilde{Y}_i(1) - \tilde{Y}_i(0)] $$
Now we want to turn our attention to how we might estimate or approximate this term from data. To begin, let’s consider the simplest idea: given the entire population, where some people are treated (i.e. have a housing voucher) and the rest aren’t, compare the average earnings of the two groups.
The first question to ask is: is this a good idea? Does the difference in average earnings between these groups approximate the average treatment effect well?
Now, we probably have an intuitive sense that vouchers aren’t “randomly assigned” but are typically targeted to individuals and families with very low incomes. So we probably also sense that this comparison won’t be a great idea: it’s going to capture both the effect of a voucher and the underlying differences in earnings between the groups.
We can make this more precise by re-expressing this difference-in-means using the potential outcome notation defined previously. The key idea is to recognize that the outcome that we observe in the data $Y_i$ for the group that received the voucher is equivalent to $\tilde{Y}_i(1)$. And likewise, the outcome that we observe in the data for the group without the voucher is $\tilde{Y}_i(0)$.
$$ \mathbb{E}[Y_i \vert D_i=1] - \mathbb{E}[Y_i \vert D_i=0] = \mathbb{E}[\tilde{Y}_i(1) \vert D_i=1] - \mathbb{E}[\tilde{Y}_i(0) \vert D_i=0] $$
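To make the link between observed and potential outcomes concrete, here is a minimal sketch on a toy population. The six rows and their earnings figures are made up for illustration, and the names `population` and `observed` are hypothetical. Each person has both potential outcomes written down (something we never get in real data), and the observed outcome simply picks one of them depending on the treatment indicator.

```python
from statistics import mean

# Hypothetical toy population (all numbers are made up):
# (y0, y1, d) = earnings without voucher, earnings with voucher, treated?
population = [
    (10, 12, 1),
    (12, 15, 1),
    (14, 15, 1),
    (30, 31, 0),
    (28, 30, 0),
    (32, 35, 0),
]

def observed(y0, y1, d):
    # We see y1 for treated people (d == 1) and y0 for everyone else.
    return d * y1 + (1 - d) * y0

treated_observed = [observed(y0, y1, d) for y0, y1, d in population if d == 1]
control_observed = [observed(y0, y1, d) for y0, y1, d in population if d == 0]

# The observed group means coincide with the potential outcome means
# E[Y(1) | D=1] and E[Y(0) | D=0].
treated_y1 = [y1 for y0, y1, d in population if d == 1]
control_y0 = [y0 for y0, y1, d in population if d == 0]

diff_in_means = mean(treated_observed) - mean(control_observed)
true_ate = mean(y1 - y0 for y0, y1, d in population)

print("difference in means:", diff_in_means)
print("true average treatment effect:", true_ate)
```

In this toy example the voucher raises everyone's earnings (the average effect is 2), yet the difference in means is -16, because the treated group started from much lower earnings.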
We can then add and subtract the same term, $\mathbb{E}[\tilde{Y}_i(0) \vert D_i=1]$, which leaves us with the following.
$$ \underbrace{\mathbb{E}[\tilde{Y}_i(1) \vert D_i=1] - \mathbb{E}[\tilde{Y}_i(0) \vert D_i=1]}_{\text{ATT}} + \underbrace{\mathbb{E}[\tilde{Y}_i(0) \vert D_i=1] - \mathbb{E}[\tilde{Y}_i(0) \vert D_i=0]}_{\text{Selection Bias}} $$
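The decomposition can be checked numerically. The following is a rough simulation sketch, not an analysis of real voucher data. The data-generating process is entirely assumed: baseline earnings are drawn from a normal distribution, the voucher adds a constant 2.0, and people with low baseline earnings are more likely to be treated. Because the simulation records both potential outcomes for every person, we can compute the ATT and the selection bias directly and confirm that they sum to the difference in means.

```python
import random
from statistics import mean

random.seed(0)

# Assumed data-generating process (every number here is hypothetical).
people = []
for _ in range(10_000):
    y0 = random.gauss(30.0, 8.0)           # earnings without a voucher
    y1 = y0 + 2.0                          # earnings with a voucher
    p_treat = 0.8 if y0 < 25.0 else 0.1    # vouchers target low earners
    d = 1 if random.random() < p_treat else 0
    people.append((y0, y1, d))

treated = [(y0, y1) for y0, y1, d in people if d == 1]
control = [(y0, y1) for y0, y1, d in people if d == 0]

# Observed comparison: E[Y | D=1] - E[Y | D=0].
diff_in_means = mean([y1 for _, y1 in treated]) - mean([y0 for y0, _ in control])

# ATT: E[Y(1) | D=1] - E[Y(0) | D=1], computable only in a simulation.
att = mean([y1 - y0 for y0, y1 in treated])

# Selection bias: E[Y(0) | D=1] - E[Y(0) | D=0].
selection_bias = mean([y0 for y0, _ in treated]) - mean([y0 for y0, _ in control])

print("difference in means:", round(diff_in_means, 3))
print("ATT:", round(att, 3))
print("selection bias:", round(selection_bias, 3))
print("ATT + selection bias:", round(att + selection_bias, 3))
```

Under these assumptions the selection bias is negative, since the treated group would have earned less even without a voucher, so the raw difference in means understates the effect of the voucher.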