Why don’t we frame everything in terms of a linear regression model?

$$ y_i = \alpha + \beta_1 D_i + \beta_2X_i + \varepsilon_i $$

Identification in this model depends on $\text{Cov}(X_i, \varepsilon_i)$.

It’s not clear how $\varepsilon_i$ relates to the potential outcomes and therefore it’s hard to think about $\varepsilon$ and therefore $\text{Cov}(X_i, \varepsilon_i)$