Causal Inference

PSYC 573

Causation

Data are profoundly dumb about causal relationships

Pearl and Mackenzie (2020)

Outline

  • Thought experiments
  • Potential outcomes
  • Causal diagrams
  • Mediation

Causal Inference

Obtaining an estimate of the causal effect of one variable on another

E.g., one more hour of exercise per day causes happiness to increase by 0.1 to 0.2 points

  • Intervention: if I exercise one hour more, my happiness will increase by 0.1 to 0.2 points
  • Counterfactual: had I exercised one hour less, my happiness would have been 0.1 to 0.2 points lower

Potential Outcomes

\(T\) is the binary treatment variable (e.g., new drug for boosting stat knowledge)

Person    Math Attitude   Y (if T = 1)   Y (if T = 0)   Y(1) - Y(0)
1         4               75             70              5
2         7               80             88             -8
3         3               70             75             -5
4         9               90             92             -2
5         5               85             82              3
6         6               82             85             -3
7         8               95             90              5
8         2               78             78              0
Average   5.5             81.875         82.5           -0.625

Average Treatment Effect (ATE)

\[ \text{ATE} = \bar Y(1) - \bar Y(0) \]
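As a quick check of the table above, the ATE can be computed in R from the two potential-outcome columns. This is only a sketch; the names `y1`, `y0`, and `ice` are illustrative and not from the course materials.

```r
# Potential outcomes for the 8 persons in the table
y1 <- c(75, 80, 70, 90, 85, 82, 95, 78)  # Y if T = 1
y0 <- c(70, 88, 75, 92, 82, 85, 90, 78)  # Y if T = 0

# Individual causal effects, Y(1) - Y(0)
ice <- y1 - y0

# ATE: difference of the average potential outcomes,
# which equals the average of the individual effects
ate <- mean(y1) - mean(y0)
ate
#> [1] -0.625
```

In real data this calculation is not available, because both potential outcomes are never observed for the same person.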

Observed Outcomes

Only one potential outcome is observed for each person

E.g., Persons 2, 4, 6, 7 take the drug

Person    T   Math Attitude   Y (if T = 1)   Y (if T = 0)
1         0   4                              70
2         1   7               80
3         0   3                              75
4         1   9               90
5         0   5                              82
6         1   6               82
7         1   8               95
8         0   2                              78
Average       5.5             86.75          76.25
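Because only one potential outcome is seen per person, a naive comparison of observed group means can be far from the ATE. A minimal sketch using the observed table (the variable names are illustrative):

```r
# Observed data: persons 2, 4, 6, 7 took the drug
treat    <- c(0, 1, 0, 1, 0, 1, 1, 0)
math_att <- c(4, 7, 3, 9, 5, 6, 8, 2)
y_obs    <- c(70, 80, 75, 90, 82, 82, 95, 78)

# Naive estimate: difference in observed group means
naive <- mean(y_obs[treat == 1]) - mean(y_obs[treat == 0])
naive
#> [1] 10.5

# Average math attitude by group: 3.5 for T = 0, 7.5 for T = 1
att_by_group <- tapply(math_att, treat, mean)
att_by_group
```

The naive difference (10.5) has the opposite sign from the ATE of -0.625 in the potential-outcomes table. In this example the treated group also had a higher math attitude on average, which is one plausible source of the gap.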

Directed Acyclic Graph

Allows researchers to encode causal assumptions about the data

  • Based on knowledge of the data and the variables

Data from the 2009 American Community Survey (ACS)

Does marriage cause divorce?

  • A = Median age of marriage
  • M = Marriage rate
  • D = Divorce rate

“Weak” assumptions

  • A may directly influence M
  • A may directly influence D
  • M may directly influence D

“Strong” assumptions

Absence of a link

  • E.g., M does not directly influence A
  • E.g., A is the only relevant variable in the causal pathway M → D

Basic Types of Junctions

Fork: A ← B → C

Chain/Pipe: A → B → C

Collider: A → B ← C

Fork

aka Classic confounding

  • Confound: something that misleads us about a causal influence

M ← A → D

Assuming the DAG is correct,

  • the causal effect of M → D can be obtained by holding constant A
    • stratifying by A; “controlling” for A
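A small simulation can make the fork concrete. The sketch below assumes a made-up data-generating process in which A causes both M and D and M has no effect on D; the coefficients are arbitrary and are not estimates from the ACS data.

```r
set.seed(573)
n <- 5000

# Fork: A is a common cause of M and D; M has no effect on D here
A <- rnorm(n)
M <- -0.7 * A + rnorm(n, sd = sqrt(0.51))
D <- -0.6 * A + rnorm(n, sd = 0.8)
dat <- data.frame(A = A, M = M, D = D)

# Ignoring A: M and D are associated through the fork
b_unadj <- unname(coef(lm(D ~ M, data = dat))["M"])

# Holding A constant: the spurious association should be near zero
b_adj <- unname(coef(lm(D ~ M + A, data = dat))["M"])

round(c(unadjusted = b_unadj, adjusted = b_adj), 2)
```

The adjusted coefficient recovers the (zero) effect only because the simulated DAG really is a fork; with a different DAG, adjusting for A may not be enough.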

Predicting an Intervention

What would happen to the divorce rate if we encourage more people to get married, so that marriage rate increases by 1 per 10 adults?

Based on our DAG, this should not change the median marriage age

Randomization

Removing incoming path to the “causal” variable

Framing Experiment

  • X: exposure to a negatively framed news story about immigrants
  • Y: anti-immigration political action

[Figure: two diagrams of X and Y, one labeled "No Randomization" and one labeled "Randomization"]

Back-Door Criterion

The causal effect of X → Y can be obtained by blocking all back-door paths (paths between X and Y that start with an arrow pointing into X), using a set of variables that contains no descendants of X

  • Randomization: (when done successfully) eliminates all paths entering X
  • Conditioning (holding constant)
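When a set of variables \(Z\) satisfies the back-door criterion, conditioning can be written as the standard adjustment formula. It is stated here for discrete \(Z\); the symbol \(Z\) and the do-notation are additions for illustration and are not used elsewhere in these slides.

\[ P(Y = y \mid \text{do}(X = x)) = \sum_z P(Y = y \mid X = x, Z = z) \, P(Z = z) \]

In words: estimate the relationship between X and Y within each stratum of \(Z\), then average over the population distribution of \(Z\).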

Dagitty

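library(dagitty)
# DAG with the effect X -> Y and two back-door paths:
#   X <- W1 -> Y  and  X <- W2 <- U -> Y
dag4 <- dagitty("dag{
  X -> Y; W1 -> X; U -> W2; W2 -> X; W1 -> Y; U -> Y
}")
# U is unobserved, so it cannot be used for adjustment
latents(dag4) <- "U"
# Minimal sufficient adjustment set(s) for the effect of X on Y
adjustmentSets(dag4, exposure = "X", outcome = "Y",
               effect = "direct")
#> { W1, W2 }
# Conditional independencies implied by the DAG (can be checked against data)
impliedConditionalIndependencies(dag4)
#> W1 _||_ W2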

Post-Treatment Bias

Adjusting/“controlling” for covariates implies a causal interpretation

Please do not simply adjust for a variable without thinking about it (especially variables that may be impacted by the treatment)

Data for Framing Experiment

  • cong_mesg: binary variable indicating whether or not the participant agreed to send a letter about immigration policy to his or her member of Congress

  • emo: post-test anxiety about increased immigration (0-9)

  • tone: framing of news story (0 = positive, 1 = negative)

Results

              No adjustment           Adjusting for feeling
b_Intercept   −0.81 [−1.17, −0.44]    −2.01 [−2.64, −1.45]
b_tone        0.21 [−0.30, 0.73]      −0.14 [−0.70, 0.43]
b_emo                                 0.32 [0.21, 0.43]
R2            0.003                   0.143

Which one estimates the causal effect of tone?
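A small simulation can illustrate why the two columns answer different questions. The sketch below assumes a hypothetical data-generating process in which tone affects the outcome only through emo. The variable names mirror the study, but the numbers are made up, and a continuous outcome with a linear model is used for simplicity even though cong_mesg is binary.

```r
set.seed(573)
n <- 20000

# Randomized treatment (0 = positive, 1 = negative framing)
tone <- rbinom(n, size = 1, prob = 0.5)
# Post-treatment variable: raised by 1 point under negative framing
emo <- 3 + 1 * tone + rnorm(n)
# Outcome depends on emo only; tone has no direct effect
y <- 0.5 * emo + rnorm(n)

# Without adjustment: total effect of tone (true value 0.5)
b_total <- unname(coef(lm(y ~ tone))["tone"])
# Adjusting for emo: what is left after blocking the path through emo
# (true value 0)
b_adjusted <- unname(coef(lm(y ~ tone + emo))["tone"])

round(c(total = b_total, adjusted_for_emo = b_adjusted), 2)
```

Under these assumptions, the unadjusted model targets the total effect of the randomized treatment, while the adjusted model removes the part of the effect that runs through the post-treatment variable.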

Mediation

Mediation is a causal analysis, by definition

Mediation

In the DAG, E is a post-treatment variable potentially influenced by T

  • E is a potential mediator

Important

A mediator is very different from a confounder

Direct Effect

Causal effect when holding mediator at a specific level (e.g., T → C when E = 5)

Controlled direct effect

tone   emo   Estimate   Est.Error   Q2.5    Q97.5
0      0     0.121      0.032       0.066   0.191
1      0     0.108      0.033       0.054   0.183
0      9     0.698      0.070       0.556   0.823
1      9     0.670      0.063       0.542   0.786
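Reading the Estimate column as the predicted probability of cong_mesg (an interpretation, since the table does not label it), a rough controlled direct effect at each emo level is the difference between the tone = 1 and tone = 0 rows. The sketch below uses only the point estimates; a proper interval would need the posterior draws, which are not shown here.

```r
# Point estimates copied from the table above
pred <- data.frame(
  tone     = c(0, 1, 0, 1),
  emo      = c(0, 0, 9, 9),
  estimate = c(0.121, 0.108, 0.698, 0.670)
)

# Controlled direct effect of tone with emo held at 0
cde_emo0 <- pred$estimate[pred$tone == 1 & pred$emo == 0] -
  pred$estimate[pred$tone == 0 & pred$emo == 0]

# Controlled direct effect of tone with emo held at 9
cde_emo9 <- pred$estimate[pred$tone == 1 & pred$emo == 9] -
  pred$estimate[pred$tone == 0 & pred$emo == 9]

# emo = 0: -0.013; emo = 9: -0.028
round(c(emo_0 = cde_emo0, emo_9 = cde_emo9), 3)
```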

Natural Direct Effect

Comparing two potential outcomes: (a) Y(T = 1, M = M[0]) and (b) Y(T = 0, M = M[0])

E.g., What would the effect of a negatively framed story be had it not elicited negative emotions?

Natural Indirect Effect

Change in \(Y\) of the control group if their mediator level changes to what it would have been under treatment

i.e., Y(T = 0, M = M[1]) - Y(T = 0, M = M[0])

E.g., What would the effect of a negatively framed story be had it elicited negative emotions but not affected anything else?

Notes on Mediation

  • When the effects of T → M (usually called the a path) and M → Y (usually called the b path) are assumed linear, the indirect effect equals the product of the paths (ab)
    • When a and/or b are not linear, the indirect effect is not constant across different levels of T and M
    • Also the case when there is interaction between the treatment and the mediator
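For the linear case, the product rule follows in two lines. A sketch, assuming linear models with no interaction between T and M; the intercepts \(i_M\) and \(i_Y\), the direct path \(c'\), and the error terms \(e_M\) and \(e_Y\) are notation added here.

\[ M = i_M + a T + e_M, \qquad Y = i_Y + c' T + b M + e_Y \]

Changing T from 0 to 1 shifts M by a, and each unit of M shifts Y by b, so

\[ \text{NIE} = b \, [M(1) - M(0)] = ab, \qquad \text{NDE} = c', \qquad \text{total effect} = c' + ab \]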

Assumptions of Mediation

  • No unmeasured treatment-outcome confounding
  • No unmeasured mediator-outcome confounding
  • No unmeasured treatment-mediator confounding
  • The mediator-outcome path is not moderated by the treatment

Note: randomization of the treatment rules out confounding for T → M and T → Y, but not for M → Y

Sensitivity Analysis

Assign priors representing plausible magnitude of confounding (see notes for an example)

Collider Bias

E.g., Is the most newsworthy research the least trustworthy?
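A simulated sketch of how selection can create this pattern. It assumes, hypothetically, that newsworthiness and trustworthiness are independent and that only studies with a high combined score get published; publication is then a collider, and looking only at published studies conditions on it.

```r
set.seed(573)
n <- 10000

# Hypothetical setup: the two qualities are independent by construction
newsworthy <- rnorm(n)
trustworthy <- rnorm(n)

# Selection (the collider): only the top 10% on the combined score
# get published
score <- newsworthy + trustworthy
published <- score > quantile(score, 0.9)

# Correlation among all studies vs. among published studies only
cor_all <- cor(newsworthy, trustworthy)
cor_published <- cor(newsworthy[published], trustworthy[published])

round(c(all = cor_all, published_only = cor_published), 2)
```

Among all simulated studies the correlation should be close to zero, while among the published ones it should be clearly negative, even though neither quality affects the other.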

Collider Bias in Real Research/Real Life

  • Adjusting for current neighborhood when estimating effect of schooling on earnings
  • Studying the link between impulsivity and delinquency among high-risk youth
  • Estimating associations with standardized test scores using only admitted students
  • Studying infant mortality and maternal smoking among infants with low birth weight

Instrumental Variables

  • X = Career Adaptability
  • Y = Job Satisfaction
  • Z (instrument) = Conscientiousness
  • U = Confounding

Instrument

  • Plausible cause of X
  • Can only affect Y through X
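A simulated sketch of why an instrument helps. It assumes a made-up linear data-generating process in which U confounds X and Y and Z satisfies both conditions above; the coefficients are arbitrary and do not come from any career adaptability study.

```r
set.seed(573)
n <- 20000

U <- rnorm(n)                # unmeasured confounder
Z <- rnorm(n)                # instrument: affects Y only through X
X <- 0.5 * Z + U + rnorm(n)  # exposure
Y <- 0.3 * X + U + rnorm(n)  # outcome; true effect of X is 0.3

# Naive regression of Y on X is biased upward by U
b_ols <- unname(coef(lm(Y ~ X))["X"])

# IV (Wald) estimator: cov(Z, Y) / cov(Z, X)
b_iv <- cov(Z, Y) / cov(Z, X)

round(c(ols = b_ols, iv = b_iv), 2)
```

The ratio works here because Z is independent of U and has no path to Y other than through X; if either condition fails, the IV estimate is biased as well.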

Other Examples of Instrumental Variables

  • Distance to the nearest college, for the effect of education on earnings
  • Hospital’s encouragement to breastfeed, for the effect of breastfeeding on weight outcomes

Some Other Topics for Causal Inference

  • Propensity score analysis
    • Estimate probability of being in the treatment for each participant based on other covariates
    • Balancing out pre-treatment covariates so that the comparison more closely resembles a randomized experiment
  • Regression discontinuity
    • E.g., Assigning to treatment only when pre-test score is below a cutoff
    • Pre-test score becomes the only confounding variable to be adjusted for
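A minimal simulated sketch of the regression discontinuity idea. It assumes a hypothetical linear outcome model and a cutoff of 0 on a standardized pre-test; applied analyses often fit the model only near the cutoff, which this sketch does not do.

```r
set.seed(573)
n <- 20000

pretest <- rnorm(n)
# Deterministic assignment: treated only if the pre-test is below the cutoff
treat <- as.numeric(pretest < 0)
# Hypothetical outcome model: true treatment effect is 2
y <- 2 * treat + 0.5 * pretest + rnorm(n)

# Naive group comparison mixes the treatment effect with the
# pre-test difference between groups
b_naive <- mean(y[treat == 1]) - mean(y[treat == 0])

# Adjusting for the pre-test (linear here by construction)
b_rd <- unname(coef(lm(y ~ treat + pretest))["treat"])

round(c(naive = b_naive, adjusted = b_rd), 2)
```

Because assignment depends on the pre-test alone, adjusting for it is enough in this simulation, provided the functional form of the pre-test is modeled correctly.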

Remarks

  • Causal inference requires causal assumptions
    • You need a DAG
  • Blindly adjusting for covariates does not give better results
    • post-treatment bias, collider bias, etc.
  • Think carefully about what causal quantity is of interest
    • E.g., direct, indirect, total
  • Causal inferences are possible with both experimental and non-experimental data

References

Pearl, Judea, and Dana Mackenzie. 2020. The book of why: The new science of cause and effect. First trade paperback edition. New York: Basic Books.