This is all covered in Section 5.4.2 of Examiner and Judge Designs in Economics: A Practitioner's Guide.

Motivation

The idea that we can use differences between judges in housing court to learn about the impact of evictions seems pretty neat. Some judges appear to have a higher tendency to “evict”, while others are more tenant-friendly and more likely to arrange stipulation agreements. The idea behind Judge-IV is that by comparing tenants who draw a lenient judge with tenants who draw a stricter judge, we can learn something about the costs of evictions.

At a high level, though, it’s an open question whether this variation across judges in housing court is credible for this kind of analysis. As we’ll highlight below, Judge-IV can generate a form of selection bias that would undermine the motivation for an instrumental variable design.

Background

In Judge-IV, both the treatment and the instrument are categorical. A tenant (defendant) in housing court is assigned to one judge, who then “rules” on the case. For tenant $i$, the first stage is a function from judges to housing court outcomes:

$$ \tilde{D}_i: \{\text{Judges}\} \longrightarrow \{\text{Housing Court Outcomes}\} $$

We can visualize this first stage function for a single individual with the following diagram. The “y-axis” corresponds to the various judges and the “x-axis” shows the possible housing court outcomes.

[Figure: first stage for a single tenant, with judges on the y-axis and possible housing court outcomes on the x-axis]

A typical Judge-IV paper, such as Eviction and Poverty in American Cities, binarizes the first stage, replacing the set of possible outcomes with just $\{0, 1\}$, an indicator for whether the tenant was evicted.

$$ \tilde{D}_{1i}: \{\text{Judges}\} \longrightarrow \{0, 1\} $$

For our hypothetical individual, the first stage would look something like this.

[Figure: binarized first stage for the hypothetical tenant, with outcomes collapsed to $\{0, 1\}$]
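To make the binarization concrete, here is a minimal sketch for one hypothetical tenant. The judge names and outcome labels are made up for illustration and are not taken from any paper or dataset.

```python
# Hypothetical first stage for one tenant: the outcome they would receive
# under each judge. Judge names and outcome labels are illustrative only.
potential_outcomes = {
    "judge_a": "eviction",
    "judge_b": "stipulation",
    "judge_c": "dismissal",
    "judge_d": "eviction",
}


def binarize(first_stage, treated_label="eviction"):
    """Collapse a categorical first stage to {0, 1}: 1 if evicted, else 0."""
    return {
        judge: int(outcome == treated_label)
        for judge, outcome in first_stage.items()
    }


binary_first_stage = binarize(potential_outcomes)
print(binary_first_stage)
```

Note that `judge_b` and `judge_c` both map to 0 even though they hand down different outcomes. That lost distinction is what the rest of this post is about.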

The question is: what problems does this binarization (an approximation, or simplification, of the true first stage) create?

Analysis

To tackle this question, let’s revisit the essence of instrumental variables. From a partially linear perspective, we can understand IV with a binary treatment and a binary instrument as exploiting the variation in the probability of treatment generated by the instrument.

$$ \mathbb{E}[D_1 \vert Z] - \mathbb{E}[D_1] $$
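As a rough illustration of this quantity, the sketch below simulates random judge assignment and computes the sample analogue of $\mathbb{E}[D_1 \vert Z] - \mathbb{E}[D_1]$ for each judge. The number of judges and their eviction propensities are assumptions chosen for the example, and it uses several judges rather than a binary instrument.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_judges = 20_000, 5

# Assumed judge-specific eviction propensities (made up for illustration).
evict_prob = np.array([0.30, 0.40, 0.50, 0.60, 0.70])

z = rng.integers(0, n_judges, size=n_cases)  # randomly assigned judge
d1 = rng.random(n_cases) < evict_prob[z]     # binarized treatment: evicted?

overall = d1.mean()                          # sample analogue of E[D_1]
by_judge = np.array([d1[z == j].mean() for j in range(n_judges)])  # E[D_1 | Z = j]
variation = by_judge - overall               # E[D_1 | Z = j] - E[D_1]

for j in range(n_judges):
    print(f"judge {j}: E[D1|Z]={by_judge[j]:.3f}, deviation={variation[j]:+.3f}")
```

Lenient judges show negative deviations and strict judges positive ones. Weighted by caseload, the deviations average to zero by construction.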

The challenge of “binarization” is that while the instrument generates variation in the treatment of interest, whether you are evicted ($D_1$), it might also generate variation in another housing court outcome, such as a non-final stay agreement ($D_2$), which differs from an eviction because the tenant can continue to reside in their unit as long as they stick to a payment plan. This additional variation is a form of selection bias that is (1) observable but (2) cannot be directly controlled for. For the method to “work”, we’d need to impose an exclusion restriction: judge assignment affects a tenant’s later outcomes only through $D_1$.
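A small simulation can illustrate the concern. This is a sketch under strong assumptions, not a model of any real court: two judge types (strict and lenient), three court outcomes, homogeneous effects, and made-up probabilities and effect sizes. The function name `wald_estimate` and its parameters are hypothetical.

```python
import numpy as np


def wald_estimate(beta_stay, beta_evict=-1.0, n=200_000, seed=0):
    """Wald/IV estimate of the eviction effect using the binarized treatment D1.

    All probabilities and effect sizes are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)  # 1 = strict judge, 0 = lenient judge

    # Assumed outcome probabilities, indexed by judge type (lenient, strict).
    p_evict = np.array([0.3, 0.6])[z]
    p_stay = np.array([0.5, 0.2])[z]

    u = rng.random(n)  # tenant-level latent draw
    d1 = (u < p_evict).astype(float)                              # evicted
    d2 = ((u >= p_evict) & (u < p_evict + p_stay)).astype(float)  # stay agreement

    # Outcome depends on eviction and, if beta_stay != 0, on stay agreements.
    y = beta_evict * d1 + beta_stay * d2 + rng.normal(size=n)

    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d1[z == 1].mean() - d1[z == 0].mean()
    return reduced_form / first_stage


print("exclusion holds (beta_stay = 0):    ", wald_estimate(beta_stay=0.0))
print("exclusion fails (beta_stay = -0.4): ", wald_estimate(beta_stay=-0.4))
```

With homogeneous effects, the Wald ratio equals `beta_evict + beta_stay * ΔD2 / ΔD1`, where `ΔD1` and `ΔD2` are the strict-minus-lenient differences in the probabilities of eviction and of a stay agreement. Under the assumed numbers, `ΔD1 = 0.3` and `ΔD2 = -0.3`, so the first estimate should land near the true value of -1.0 and the second near -0.6. The gap is the bias from the instrument moving $D_2$ as well as $D_1$.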