Today
- Using structural models to perform causal inference
- Represent structural relationships between variables using
- Systems of equations
- Causal Diagrams
- Structural models have implications for conditional independence
- Use these to justify causal interpretation of regression
Structural Equations
- A structural equation represents the effect of variables that directly cause an outcome
- Describes a model of how a variable is generated, and how it would change if other variables were changed
- \(Y=f(X,U)\), \(U\perp X\) is structural if \[P(Y|do(X=x))=P(f(x,U))\]
- Assuming linear structural form, i.e., for \(U\perp X\), \[Y=\beta_0 + \beta_1X_1+\beta_2X_2+\beta_3X_3+\ldots+U\]
- Each coefficient \(\beta_j\) gives the direct causal effect of \(X_j\) on \(Y\)
- A structural equation is distinct from a regression equation
- Regression equation is what you estimate
- Can always run, given data
- Structural equation is model of the world
- May or may not be same as what you run
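A minimal simulation sketch of this distinction, assuming a linear structural equation with made-up coefficients (the values 1 and 2, the 0.8 dependence, and all variable names are illustrative assumptions, not from the lecture). When \(U\perp X\), the regression slope should be close to the structural coefficient; when \(X\) depends on \(U\), the same regression can still be run but estimates something else.

```r
# Hypothetical structural equation Y = beta0 + beta1 * X + U (illustrative values)
set.seed(1)
n <- 10000
beta0 <- 1
beta1 <- 2

# Case 1: U independent of X, so the regression matches the structural equation
x_a <- rnorm(n)
u_a <- rnorm(n)
y_a <- beta0 + beta1 * x_a + u_a
slope_independent <- unname(coef(lm(y_a ~ x_a))["x_a"])

# Case 2: same equation generates Y, but X depends on U (U is not independent of X)
u_b <- rnorm(n)
x_b <- 0.8 * u_b + rnorm(n)
y_b <- beta0 + beta1 * x_b + u_b
slope_dependent <- unname(coef(lm(y_b ~ x_b))["x_b"])

print(c(structural = beta1,
        independent = slope_independent,
        dependent = slope_dependent))
```

In the second case the regression slope picks up the dependence between \(X\) and \(U\), so it is no longer the direct causal effect \(\beta_1\).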
Structural Equation Models
- A structural equation model (SEM) describes full set of causal relationships between variables in a system
- Each endogenous variable \(Y_1,Y_2,\ldots,Y_p\) described by structural equation \[Y_1=f_1(Y_2,\ldots,Y_p,U_1)\] \[Y_2=f_2(Y_1,Y_3,\ldots,Y_p,U_2)\] \[\vdots\] \[Y_p=f_p(Y_1,Y_2,\ldots,Y_{p-1},U_p)\]
- Each structural equation contains only direct causes
- Variables \(Y_j\) with no direct effect are excluded
- Variables \(U\) are exogenous
- Not caused by any variable \(Y\) in system
Causal Graphs (Code)
#Library to create and analyze causal graphs
library(dagitty)
library(ggdag) #library to plot causal graphs
yxdag<-dagify(Y~X) #create graph with arrow from X to Y
#Set position of nodes so they lie on a straight line
coords<-list(x=c(X = 0, Y = 1),y=c(X = 0, Y = 0))
coords_df<-coords2df(coords)
coordinates(yxdag)<-coords2list(coords_df)
#Plot causal graph
ggdag(yxdag)+theme_dag_blank()+
labs(title="X causes Y",
subtitle="X is a parent of Y, Y is a child of X")
Causal Graphs
- A structural equation model can be represented as a graph
- Variables are nodes: box or circle
- Direct causal effects are (directed) edges: lines with arrows indicating direction
- Nodes at source of arrow are called parents, nodes at ends are children
- Simplest example: structural equation model
- \(X=f_1(U_1)\), \(Y=f_2(X,U_2)\)
- Graph drawn with the R packages dagitty and ggdag
- \(U_1,U_2\) not shown explicitly: \(X\) and \(Y\) are variables of interest, with exogenous random variation caused by \(U_1,U_2\)
Implications of structural equation models
- To go from an SEM to distribution of data, solve the system of equations
- Find endogenous variables \(Y\) as function of only exogenous variables \(U\)
- Known distribution of \(U\) then gives joint distribution of \(Y\)
- Today, restrict interest to acyclic structural models
- Causal graph does not contain path along directed edges from a variable to itself
- Variables do not cause themselves
- Further, assume \((U_1,\ldots,U_p)\) are mutually independent
- For acyclic models, solve by
- Start with variables with no incoming arrows
- Moving in the direction of the arrows, substitute each solved variable into the structural equations of its children
- Repeat until no endogenous variable appears on the right-hand side
- Resulting model is called the reduced form
Example: Solving a structural model (Code)
examplegraph<-dagify(Y3~Y2,Y4~Y3+Y2,Y2~Y1) #create graph
#Set position of nodes
coords<-list(x=c(Y1 = 0, Y2 = 1, Y3 = 2, Y4 = 3),
y=c(Y1 = 0, Y2 = 0, Y3 = -0.1, Y4 = 0))
coords_df<-coords2df(coords)
coordinates(examplegraph)<-coords2list(coords_df)
#Plot causal graph
ggdag(examplegraph)+theme_dag_blank()+
labs(title="Y1 causes Y2, Y2 causes Y3, Y3 and Y2 cause Y4")
Example: Solving a structural model
\(Y_1=f_1(U_1)\), \(Y_2=f_2(Y_1,U_2)\), \(Y_3=f_3(Y_2,U_3)\), \(Y_4=f_4(Y_3,Y_2,U_4)\)
- To solve, start at \(Y_1\), replace in \(Y_2\) to get
- \(Y_1=f_1(U_1)\), \(Y_2=f_2(f_1(U_1),U_2)\)
- Then substitute out for \(Y_2\)
- \(Y_3=f_3(f_2(f_1(U_1),U_2),U_3)\), \(Y_4=f_4(Y_3,f_2(f_1(U_1),U_2),U_4)\)
- Substitute out for \(Y_3\) to get reduced form
- \(Y_4=f_4(f_3(f_2(f_1(U_1),U_2),U_3),f_2(f_1(U_1),U_2),U_4)\)
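The substitution steps above can be sketched in base R. The linear forms chosen for \(f_1,\ldots,f_4\) and their coefficients are hypothetical, picked only to make the example concrete; the lecture leaves these functions general.

```r
set.seed(2)
n <- 5000

# Mutually independent exogenous variables U1..U4
u1 <- rnorm(n); u2 <- rnorm(n); u3 <- rnorm(n); u4 <- rnorm(n)

# Hypothetical structural functions (illustrative coefficients only)
f1 <- function(u1) u1
f2 <- function(y1, u2) 0.5 * y1 + u2
f3 <- function(y2, u3) 0.7 * y2 + u3
f4 <- function(y3, y2, u4) 0.3 * y3 + 0.4 * y2 + u4

# Solve in the direction of the arrows, starting from the node with no parents
y1 <- f1(u1)
y2 <- f2(y1, u2)
y3 <- f3(y2, u3)
y4 <- f4(y3, y2, u4)

# Reduced form for Y4: a function of the exogenous variables only
y4_reduced <- f4(f3(f2(f1(u1), u2), u3), f2(f1(u1), u2), u4)

# Both routes should produce the same draws of Y4
print(all.equal(y4, y4_reduced))
```

Drawing the \(U\) variables from a known distribution and pushing them through the reduced form is what turns the SEM into a joint distribution for \((Y_1,\ldots,Y_4)\).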
Interpreting a structural equation model
- Solving an acyclic structural model gives joint distribution of endogenous variables
- What properties does the joint distribution have?
- Causal Markov property:
- Conditional on its parents, a variable is independent of any variable that is not a descendant
- \(Y_k\) is a descendant of \(Y_j\) if there is a path along directed edges from \(Y_j\) to \(Y_k\)
- Causal Markov Property completely defines implications of causal graph
- Absence of an edge means conditional independence
- In above example, applying this rule gives the following conditional independencies
impliedConditionalIndependencies(examplegraph)
## Y1 _||_ Y3 | Y2
## Y1 _||_ Y4 | Y2
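These two implied independencies can be checked on simulated data. The sketch below assumes a linear Gaussian version of the example graph with made-up coefficients; in that special case, a zero coefficient on \(Y_1\) in a regression that also includes \(Y_2\) corresponds to conditional independence given \(Y_2\).

```r
set.seed(3)
n <- 20000

# Hypothetical linear Gaussian SEM for the example graph
y1 <- rnorm(n)
y2 <- 0.5 * y1 + rnorm(n)
y3 <- 0.7 * y2 + rnorm(n)
y4 <- 0.3 * y3 + 0.4 * y2 + rnorm(n)

# Marginally, Y1 and Y3 are dependent (Y1 affects Y3 through Y2)
cor_marginal <- cor(y1, y3)

# Conditional on Y2, the coefficient on Y1 should be near zero
coef_y3_on_y1 <- unname(coef(lm(y3 ~ y2 + y1))["y1"])
coef_y4_on_y1 <- unname(coef(lm(y4 ~ y2 + y1))["y1"])

print(c(marginal_cor = cor_marginal,
        y3_given_y2 = coef_y3_on_y1,
        y4_given_y2 = coef_y4_on_y1))
```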
Every SEM tells a story
- Omitted edges imply conditional independence
- Only remove edge if you know there is no direct effect
- Direction of edges describes direction of effect
- Total effect includes indirect effect through other variables
- In linear additive case, effect along directed path from X to Y is product of SEM coefficients
- Total effect is sum of effects along all directed paths
- Example fitting above graph
- \(Y_4\) is wages, \(Y_3\) is social connections, \(Y_2\) is education, \(Y_1\) is a randomly assigned admissions decision
- Story says wages are determined by social connections and education, education affects social connections, and admission affects education but affects the other variables only through education
- Use model to make assumptions clear, and if we believe them, determine which effects we can estimate
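The path rule for the linear additive case can be illustrated on the same graph. The coefficient names (`b21`, `b32`, `b43`, `b42`) and their values are assumptions for this sketch. Education \(Y_2\) reaches wages \(Y_4\) along two directed paths: directly, and through social connections \(Y_3\).

```r
set.seed(4)
n <- 20000

# Hypothetical SEM coefficients: bjk is the direct effect of Yk on Yj
b21 <- 0.5; b32 <- 0.7; b43 <- 0.3; b42 <- 0.4

y1 <- rnorm(n)
y2 <- b21 * y1 + rnorm(n)
y3 <- b32 * y2 + rnorm(n)
y4 <- b43 * y3 + b42 * y2 + rnorm(n)

# Total effect of Y2 on Y4: sum over directed paths of the product of coefficients
total_y2_y4 <- b42 + b32 * b43
# Total effect of Y1 on Y4: every directed path passes through Y2
total_y1_y4 <- b21 * total_y2_y4

# In this graph, simple regressions should recover these total effects
slope_y2 <- unname(coef(lm(y4 ~ y2))["y2"])
slope_y1 <- unname(coef(lm(y4 ~ y1))["y1"])

print(c(total_y2_y4 = total_y2_y4, slope_y2 = slope_y2,
        total_y1_y4 = total_y1_y4, slope_y1 = slope_y1))
```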
“Do” operation and causal interventions
- The effect of a causal intervention can be explicitly calculated in an SEM using the Do operation
- Do \(Y_j=y\) describes action of setting a variable \(Y_j\) to a specific value \(y\)
- Remove equation \(Y_j=f_j(...)\) and replace by \(Y_j=y\)
- Solve new SEM to get distribution of any variable \(P(Y_k|do(Y_j=y))\)
- Equivalently, on causal graph
- Delete all directed edges pointing in to node \(Y_j\)
- New graph describes all relationships in world where \(Y_j\) has been changed
- In potential outcomes notation, \(Y_k^{Y_j=y}\) is the value of \(Y_k\) in the perturbed SEM, so its distribution is \(P(Y_k|do(Y_j=y))\)
- A “Causal effect” describes what world would be like if instead of its usual value, some variable were changed
- SEM allows calculating distribution of both observed and potential outcomes
- Can use relationship to identify causal effects
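A sketch of the Do operation as code, again assuming a hypothetical linear version of the four-variable example. The helper `simulate_sem` and its argument `do_y2` are names invented for this illustration: passing a value replaces the structural equation for \(Y_2\) with \(Y_2=y\), and the rest of the system is solved as before.

```r
# Hypothetical simulator: do_y2 = NULL uses the original SEM,
# do_y2 = y replaces the equation for Y2 by the constant y
simulate_sem <- function(n, do_y2 = NULL) {
  y1 <- rnorm(n)
  y2 <- if (is.null(do_y2)) 0.5 * y1 + rnorm(n) else rep(do_y2, n)
  y3 <- 0.7 * y2 + rnorm(n)
  y4 <- 0.3 * y3 + 0.4 * y2 + rnorm(n)
  data.frame(y1 = y1, y2 = y2, y3 = y3, y4 = y4)
}

set.seed(5)
d0 <- simulate_sem(50000, do_y2 = 0)  # world where Y2 is set to 0
d1 <- simulate_sem(50000, do_y2 = 1)  # world where Y2 is set to 1

# Difference in mean of Y4 between the two perturbed SEMs
effect_on_y4 <- mean(d1$y4) - mean(d0$y4)
# Y1 is not a descendant of Y2, so the intervention should leave it unchanged
effect_on_y1 <- mean(d1$y1) - mean(d0$y1)

print(c(effect_on_y4 = effect_on_y4, effect_on_y1 = effect_on_y1))
```

Under these assumed coefficients the effect on \(Y_4\) should be close to \(0.4+0.7\times 0.3\), while \(Y_1\) is unaffected because the edge from \(Y_1\) into \(Y_2\) has been deleted.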
Example: Experiments (Code)
#Set position of nodes so they lie on a straight line
coords<-list(x=c(X = 0, Y = 1),y=c(X = 0, Y = 0))
coords_df<-coords2df(coords)
coordinates(yxdag)<-coords2list(coords_df)
ggdag(yxdag)+theme_dag_blank() #Plot causal graph
Example: Experiments
- In an experiment, \(X\) is set (randomly) by experimenter, so has no incoming edges
- Outcome of interest \(Y\) has only \(X\) and \(U_2\) as inputs, where \(U_2\) is independent of \(X\)
- Causal graph is therefore \(X\rightarrow Y\)
- \(Do(X=x)\) deletes all incoming edges to X
- There are none, so graph is unchanged
- Using perturbed graph \(P(Y|do(X=x))=P(f_2(x,U_2))\)
- In original graph \(P(Y|X=x)=P(f_2(X,U_2)|X=x)=P(f_2(x,U_2))\), since \(U_1\perp U_2\) implies \(X\perp U_2\)
- Result: \(P(Y|do(X=x))=P(Y|X=x)\)
- Exactly as derived in potential outcomes framework
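A small simulation of the experimental case, assuming a binary randomly assigned \(X\) and a made-up linear \(f_2\) (both are illustrative choices). It compares the mean of \(Y\) among units observed with \(X=1\) to the mean of \(Y\) when \(X\) is set to 1 for everyone.

```r
set.seed(6)
n <- 50000

# Hypothetical structural function for the outcome
f2 <- function(x, u2) 1 + 2 * x + u2

u1 <- runif(n)              # exogenous randomization device
u2 <- rnorm(n)              # exogenous variation in the outcome
x <- as.numeric(u1 < 0.5)   # X = f_1(U_1): assigned at random
y <- f2(x, u2)

# Observational quantity: mean of Y among units with X = 1
observed_mean <- mean(y[x == 1])
# Interventional quantity: mean of Y in the SEM where X is set to 1
do_mean <- mean(f2(1, u2))

print(c(observed = observed_mean, do = do_mean))
```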
Example 2: Confounding (Code 1)
confoundgraph<-dagify(Y~X+W,X~W) #create graph
#Set position of nodes
coords<-list(x=c(X = 0, W = 1, Y = 2),
y=c(X = 0, W = -0.1, Y = 0))
coords_df<-coords2df(coords)
coordinates(confoundgraph)<-coords2list(coords_df)
#Plot causal graph
ggdag(confoundgraph)+theme_dag_blank()+
labs(title="Confounding of effect of X on Y by W")
Example 2: Confounding (Code 2)
perturbedgraph<-dagify(Y~x+W) #graph after do(X=x): edge from W into X deleted, x is the fixed value
#Set position of nodes
coords<-list(x=c(x = 0, W = 1, Y = 2),
y=c(x = 0, W = -0.1, Y = 0))
coords_df<-coords2df(coords)
coordinates(perturbedgraph)<-coords2list(coords_df)
#Plot causal graph
ggdag(perturbedgraph)+theme_dag_blank()+
labs(title="Perturbed Graph")
Example 2: Confounding
- Suppose we care about effect of X on Y
- But W causes both X and Y
- Recover \(P(Y=y|do(X=x))\) by constructing perturbed graph
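A simulation sketch of the confounded graph and its perturbed version, assuming linear structural equations with invented coefficients. It contrasts three quantities: the regression of \(Y\) on \(X\) alone, the regression that also adjusts for \(W\), and the effect computed directly in the perturbed SEM where \(X\) is set by hand.

```r
set.seed(7)
n <- 20000

# Hypothetical linear SEM for the graph W -> X, W -> Y, X -> Y
w <- rnorm(n)
u_y <- rnorm(n)
x <- 0.8 * w + rnorm(n)
y <- 1.5 * x + 2 * w + u_y

# Regression ignoring the confounder mixes the effect of X with that of W
naive_slope <- unname(coef(lm(y ~ x))["x"])
# Adjusting for W should recover the structural coefficient on X
adjusted_slope <- unname(coef(lm(y ~ x + w))["x"])

# Perturbed SEM: X no longer depends on W, it is fixed at a chosen value
y_do1 <- 1.5 * 1 + 2 * w + u_y
y_do0 <- 1.5 * 0 + 2 * w + u_y
do_effect <- mean(y_do1) - mean(y_do0)

print(c(naive = naive_slope, adjusted = adjusted_slope, do = do_effect))
```

In this linear setting the adjusted regression and the perturbed graph agree, which is the adjustment result from the potential outcomes framework in different notation.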
What have we learned so far?
- Introduced notation for Structural Equations, causal graphs
- Recover same results as in potential outcomes framework for experiments, adjustment
- This is general: just different notation for same model
- So why bother, aside from pretty pictures?
- Sometimes, relationship between variables takes different form
- Rules of structural equation models can handle many new results
- May require complicated algebra, but can be automated
- Next, show a few graphs representing alternative situations
Colliders (Code)
collidergraph<-dagify(W~Y,W~X) #create graph
#Set position of nodes
coords<-list(x=c(X = 0, W = 1, Y = 2),
y=c(X = 0, W = 0, Y = 0))
coords_df<-coords2df(coords)
coordinates(collidergraph)<-coords2list(coords_df)
#Plot causal graph
ggdag(collidergraph)+theme_dag_blank()+
labs(title="Collider structure")
Colliders
- A collider \(W\) is caused by both \(X\) and \(Y\)
- Grades \(X\) and wealth \(Y\) both help college admission \(W\)
- Unconditionally, \(X\perp Y\)
- But, conditional on \(W\), \(X\) and \(Y\) are dependent
- Knowing \(W\), \(X\) is informative about \(Y\)
- Among college students, those with rich family did not need high grades to get in.
- Observing rich family, can infer that likelihood of high grades was lower
- True even if in full population, grades and wealth independent
- Controlling for collider in regression of \(Y\) on \(X\) causes bias
- \(P(Y|do(X=x))=P(Y|X=x)=P(Y)\) not equal to \(\int P(Y|X=x,W=w)P(w)dw\)
- Commonality with mediator case: \(W\) is a descendant of \(X\)
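A sketch of collider bias on simulated data, assuming grades \(X\) and wealth \(Y\) are independent standard normals and an admission score \(W\) that adds them plus noise (the functional form and the admission cutoff of 1 are illustrative assumptions).

```r
set.seed(8)
n <- 20000

# Independent causes of the collider
x <- rnorm(n)             # grades
y <- rnorm(n)             # wealth
w <- x + y + rnorm(n)     # hypothetical admission score, caused by both

# Without conditioning on W, X carries no information about Y
slope_unadjusted <- unname(coef(lm(y ~ x))["x"])
# Controlling for the collider induces a spurious negative relationship
slope_collider <- unname(coef(lm(y ~ x + w))["x"])
# Selecting on the collider (only admitted students) does the same
admitted <- w > 1
cor_admitted <- cor(x[admitted], y[admitted])

print(c(unadjusted = slope_unadjusted,
        with_collider = slope_collider,
        cor_admitted = cor_admitted))
```

Since \(X\) has no causal effect on \(Y\) here, the unadjusted slope near zero is the correct answer, and the negative coefficient after controlling for \(W\) is pure bias.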
Summary
- Structural equation models describe full set of causal relationships between variables in a system
- Causal graphs represent structural equation models and the conditional independencies they imply
- Causal effects represented by “Do” operation, which asks what happens in a new graph where treatment is fixed rather than assigned by existing mechanism
- SEMs can describe situations like experiments and control, but also mediation, colliders, and more complicated structures
- Working through the implications of causal graphs yields rules to describe when adjustment by regression measures a causal effect
References
- Short introduction
- Longer introduction
- Monograph: original primary source for this material.
- Pearl, Judea. (2009) Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge UP.
- Software: R packages to analyze and display causal graphs
install.packages("dagitty")
install.packages("ggdag")