DAGs

POL51

Juan Tellez

UC Davis

November 13, 2024

Plan for today

Why DAG?

Identifying effects

ggdag()

Why DAG

We want to identify the effect of X (waffles) on Y (divorce)

We can use our model to identify that effect, BUT:

We also know that lurking variables might make things go awry (the South)

Why DAG

We know that the DAG on the left will produce the spurious correlation on the right

Regardless of whether or not waffles cause divorce

Why not control for everything?

Controlling for the wrong thing can close a perplexing pipe – this erases part or all of the effect that X has on Y

Or open up an exploding collider – creates weird correlation between X and Y

Example: Bias in police use of force

Are the police more likely to use deadly force against people of color?

Black Americans are 3.23 times more likely than white Americans to be killed by police (Schwartz and Jahn, 2020)

Yet there are big debates about how exactly to estimate this bias (and the extent to which it exists)

Fryer (2019) finds that Blacks and Hispanics are 50% more likely to be stopped by police, but that conditional on being stopped by the police, there are no racial differences in officer-involved shootings

Bias in use of force

Fryer used extensive controls about the nature of the interaction, time of day, and hundreds of factors that I’ve captured with Confounds

Bias in use of force

Fryer shows that once you account for the indirect effect, the direct effect is basically not there – once the police has stopped someone, they do not use deadly force more often against Minorities than Whites

Bias in police use of force

But what if police are more likely to stop people they believe are “suspicious” AND use force against people they find “suspicious”? THEN conditioning on the stop is equivalent to conditioning on a collider

Tough!

We’d like to know if Minorities are killed more than Whites in police interactions once they are stopped

But controlling for being stopped creates collider bias

Super tough to estimate the effect of race ➡️ police abuse with observational data!

What do we do?

We have to be careful and slow

Think carefully about what the DAG probably looks like

Use the DAG to figure out what we need to control

(and what must be left alone)

Next time: how to actually control for stuff

Why experiments work

DAGs can also help us see why experiments “work”:

Person Shown an ad? Democrats thermometer
1 Yes 58.3
2 No 12.05
3 Yes 57.82
4 No 90.98
5 No 94.64

Why experiments work

Experiments seem simple…

Why experiments work

But the outcome can be very complex …

And yet we can still identify the effect because nothing causes you to receive the experimental treatment; it is random!

When experiments go wrong

Say the ad experiment was implemented on TikTok, and younger people are more likely to use TikTok than older people

This means Age is now a fork

Identifying effects

Front-doors and back-doors

  • Judea Pearl’s back-door criterion ties this all together

  • Confounding caused by existence of an open “back door” path from X to Y

  • A backdoor path is a non-causal path from X to Y

  • Need to close back-doors and keep front-doors open

Backdoor paths

A backdoor path can involve a chain of variables – like the fork, but with more steps

Here we have a backdoor path between X and Y that runs through a, b, and m

Breaking the path

We can identify X \(\rightarrow\) Y by controlling for any variable in the backdoor path to break the chain: m, a, or b

Solve the DAG

Solve the DAG

Solve the DAG

Solve the DAG

Solve the DAG