Causal Inference
[…] Process of drawing a conclusion between Cause and Effect
— Vogt (2011)
Think pharmacovigilance, comparative effectiveness, health policy, causes of diseases and disease progression. Informatics is uniquely suited to examine causal inference.
You need to pick a single theoretical philosophy. This is known as an identification strategy. There are several! INUS Conditions is one of them. Counterfactual Reasoning is another (very important in biomedical reasoning.)
If you have a confounder, it distorts the estimated effect. A study may be statistically significant and still not reflect a causal relationship.
INUS Conditions
- Insufficient: A single factor doesn’t cause the outcome. Need other factors. Lit match is Insufficient for a fire.
- Necessary: Required within a specific causal combination! Match + gasoline.
- Unnecessary: The combo is not the only one. Match + Curtains.
- Sufficient: The combo is enough. Match + Paper + Gasoline.
A lit match is an INUS condition for fire.
Think of a pie chart with Paper, Match, Fire. If you remove any one condition, the pie chart is incomplete and no longer a sufficient condition for fire.
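A minimal sketch of the INUS check, assuming we can list the minimal sufficient combinations for fire. The combinations below (including the lightning one) and the helper names `is_sufficient` and `is_inus` are made up for illustration.

```python
# Hypothetical minimal sufficient combinations for fire (illustrative only).
SUFFICIENT_SETS = [
    frozenset({"match", "gasoline"}),
    frozenset({"match", "curtains"}),
    frozenset({"lightning", "dry_grass"}),
]

def is_sufficient(combo, sufficient_sets):
    """A combination is sufficient if it contains some minimal sufficient set."""
    return any(s <= combo for s in sufficient_sets)

def is_inus(factor, sufficient_sets):
    """Insufficient but Necessary part of an Unnecessary but Sufficient combo."""
    if is_sufficient(frozenset({factor}), sufficient_sets):
        return False  # the factor alone is enough, so it is not Insufficient
    for combo in sufficient_sets:
        if factor not in combo:
            continue
        # Necessary: the combo stops being sufficient once the factor is removed.
        necessary_part = not is_sufficient(combo - {factor}, sufficient_sets)
        # Unnecessary: some other combo also produces the outcome.
        unnecessary_combo = any(other != combo for other in sufficient_sets)
        if necessary_part and unnecessary_combo:
            return True
    return False

match_is_inus = is_inus("match", SUFFICIENT_SETS)
water_is_inus = is_inus("water", SUFFICIENT_SETS)
print(match_is_inus, water_is_inus)
```

A factor that appears in no sufficient combination (here `"water"`) fails the check, since it is not a necessary part of anything.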
Criteria Based
Hill’s (prominent, 1965) and Susser’s. Structured guidance to assess whether an observed relationship is likely to be causal. Hill’s is empirical and this is very theoretical (can happen a priori or post-hoc). Again, guidelines.
Hill
- Strength of association: strong association is less likely due to bias.
- Consistency: Across populations
- Specificity: Single cause → Single Effect (HIV → AIDS, simple stuff)
- Temporality: cause must precede effect
- Biological Gradient/Dose-Response: increased exposure → increased risk
- Plausibility: Specifically biologic Plausibility. Lit search.
- Coherence: A causal claim cannot contradict knowledge
- Experiment: Conduct a prospective study and the effect should follow the exposure (also remove the exposure and see whether the effect goes away)
- Analogy: Similar agents can be expected to produce similar effects
Guidelines and not rigid rules. “None can be regarded as a sine qua non.”
Susser (1973)
Core Criteria
- Association: statistics, quantifiable. W/o association, no causation.
- Temporality
- Direction: It must go from cause to effect. This is more related to multifactorial causality.
Additional Supporting Criteria
- Mechanism: Plausibility really
- Consistency with existing knowledge
- Predictive Performance
Multifactorial causality and contextual thinking
- Sufficiency and Necessity: Kinda like INUS Conditions
- Interaction and Synergism: Causes can interact in non-additive ways
- Causal Chains and Webs: Causes can interact in complex ways
Probabilistic Causality
Probabilistic Causation
A causes B if A happening increases the probability of B happening. Exposure increases risk but not certainty.
A makes B more likely! Booze and car crashes. Some people crash without drinking.
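A small sketch of the "raises the probability" idea, using invented counts (not real crash data). The `risk` helper is hypothetical. Note that a raised probability by itself says nothing about confounding.

```python
# Hypothetical counts, invented for illustration only.
drinkers = {"crashes": 30, "total": 1000}
sober = {"crashes": 10, "total": 1000}

def risk(group):
    """Probability of a crash within one exposure group."""
    return group["crashes"] / group["total"]

# A "causes" B in the probabilistic sense if P(B | A) > P(B | not A).
raises_probability = risk(drinkers) > risk(sober)
risk_difference = risk(drinkers) - risk(sober)
risk_ratio = risk(drinkers) / risk(sober)

print(raises_probability, risk_difference, risk_ratio)
```

Sober drivers still crash (risk above zero) and most drinkers do not: the exposure shifts the risk, not the certainty.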
Singular Causation
Individual effects: did a specific exposure cause a specific outcome in an individual case?
Counterfactual Reasoning
“What if X had happened instead under different conditions?” This is what models claim to do if you think about it.
Individual → Pill → Outcome (function of treatment and covariates)
Now what if the Individual did not take the drug? If you could observe both, you would compute the individual treatment effect (ITE): outcome with drug minus outcome without drug. The ATE is the ITE averaged across people.
You cannot do this! The Fundamental Problem of Causal Inference. You cannot observe the counterfactual.
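To make the problem concrete, here is a toy sketch with four hypothetical people where we pretend to know both potential outcomes (`y1` with the drug, `y0` without). All numbers are invented. In real data only one of the two is ever observed per person.

```python
from statistics import mean

# Hypothetical potential outcomes; in reality only one per person is seen.
people = [
    {"id": "A", "y1": 5.0, "y0": 3.0, "treated": True},
    {"id": "B", "y1": 4.0, "y0": 4.0, "treated": False},
    {"id": "C", "y1": 7.0, "y0": 2.0, "treated": True},
    {"id": "D", "y1": 3.0, "y0": 1.0, "treated": False},
]

# God's-eye view: ITE per person and the ATE across people.
ite = {p["id"]: p["y1"] - p["y0"] for p in people}
ate = mean(list(ite.values()))

# What we actually observe: the outcome under the treatment received.
observed = [(p["treated"], p["y1"] if p["treated"] else p["y0"]) for p in people]
naive = mean([y for t, y in observed if t]) - mean([y for t, y in observed if not t])

print(ite, ate, naive)
```

The naive treated-minus-control difference need not equal the ATE, because the treated and untreated people here differ in their potential outcomes.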
You approximate it using the Rubin Causal Model.
Now the Counterfactual Treatment Effect = Observational Treatment Effect when two assumptions are met:
- SUTVA — Stable Unit Treatment Value Assumption (violation: one patient’s treatment/exposure may affect another’s outcome).
- Strong Ignorability: Treatment assignment is unconfounded and completely ignorable. Think RCTs! Other features about you do not dictate your exposure.
So Randomized Experimentation is the key here. This is why it’s a “Gold Standard” for very practical reasons. RCTs satisfy the assumptions of the Rubin Causal Model (SUTVA is assumed, and strong ignorability holds in theory because of randomization.)
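A rough simulation sketch of why randomization helps, assuming a constant treatment effect of 1.0 and a single hypothetical covariate (`sick`) that changes the baseline outcome. The data-generating numbers and the `simulate` helper are invented.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable
TRUE_EFFECT = 1.0       # assumed constant treatment effect

def simulate(n, assign):
    """Return the treated-minus-control difference in mean outcomes."""
    treated, control = [], []
    for _ in range(n):
        sick = rng.random() < 0.5            # hypothetical covariate
        baseline = 2.0 if sick else 6.0      # sicker people do worse untreated
        t = assign(sick)                     # treatment decision for this person
        y = baseline + TRUE_EFFECT * t + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return mean(treated) - mean(control)

# Coin-flip assignment ignores `sick`, so both arms have a similar case mix.
randomized_estimate = simulate(20_000, lambda sick: rng.random() < 0.5)
print(randomized_estimate)
```

Because the coin flip does not look at `sick`, the difference in means should land close to the assumed effect, up to sampling noise.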
Problems with RCTs:
- Unethical to deny treatment!
- Expensive
- Blind to “Subpopulation Heterogeneity of Treatment Effect” (what it sounds like): bias and poor external validity.
Think of a ‘true’ distribution of ATE that has two humps and you only randomize across just one ‘hump’. You get Local ATE (LATE). It is internally valid but not externally valid!
So to prevent this you would have to claim that (a) the treatment effect is homogeneous or (b) you know all the Subpopulations, OR that you have full coverage (all Subpopulations are in the sample.)
Think about who enrolls: trials are expensive, so they skew toward richer populations, and the people who sign up may be sicker or more desperate. The box the sample is drawn from keeps getting smaller…
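A back-of-the-envelope sketch of the two-hump picture, assuming two hypothetical subpopulations with made-up shares and effects. The `average_effect` helper is for illustration only.

```python
# Hypothetical subpopulations: share of the target population and true effect.
subpopulations = {
    "responders": {"share": 0.6, "effect": 2.0},
    "non_responders": {"share": 0.4, "effect": 0.0},
}

def average_effect(enrolled):
    """Average effect over the enrolled subpopulations, weighted by share."""
    total_share = sum(subpopulations[name]["share"] for name in enrolled)
    weighted = sum(
        subpopulations[name]["share"] * subpopulations[name]["effect"]
        for name in enrolled
    )
    return weighted / total_share

# Effect in the full target population versus a trial that only samples one hump.
population_ate = average_effect(["responders", "non_responders"])
trial_estimate = average_effect(["responders"])
print(population_ate, trial_estimate)
```

The trial estimate is correct for the people it enrolled (internally valid) but overstates the effect in the full population (not externally valid).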
Observational Data
Passively collected. Lots of advantages: large, cheaper (the data collection is already done), and broader (LATE is more externally valid.)
Internal Validity → We are measuring what we think we are measuring.
Now, it is the lack of randomization that threatens the validity of Observational studies. What if the doctor, and their treatment assignment based on who you are, is the confounder? LATE would be less internally valid but more externally valid.
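A deterministic sketch of the doctor-as-confounder problem, assuming a hypothetical cohort where the doctor treats 80% of sick patients and 20% of healthy ones, and the true effect is +1.0 for everyone. All counts and outcomes are invented.

```python
TRUE_EFFECT = 1.0  # assumed: treatment helps everyone by the same amount

# (group, treated, n_patients, untreated_baseline_outcome)
cells = [
    ("sick", 1, 400, 2.0),     # 400 of 500 sick patients are treated
    ("sick", 0, 100, 2.0),
    ("healthy", 1, 100, 6.0),  # 100 of 500 healthy patients are treated
    ("healthy", 0, 400, 6.0),
]

def arm_mean(arm):
    """Mean observed outcome in one treatment arm."""
    rows = [(n, base + TRUE_EFFECT * t) for _, t, n, base in cells if t == arm]
    return sum(n * y for n, y in rows) / sum(n for n, _ in rows)

# Naive comparison that ignores who the doctor chose to treat.
naive_difference = arm_mean(1) - arm_mean(0)
print(naive_difference)
```

The treated arm is mostly sick patients, so the naive difference comes out negative even though the treatment helps every individual.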
The Heart
The real heart is: How can you make Observational studies behave like RCTs?
Matching
See other notes
Weighting
A generalization of matching. Roots in survey sampling. Larger weights go to underrepresented groups and smaller weights to overrepresented groups.
| | Type 1 | Type 2 |
|---|---|---|
| T = 1 | 80% | 20% |
| T = 0 | 20% | 80% |
The lower left cell (T = 0, Type 1) needs to be upweighted. Weight the populations so that Ignorability holds, which is important for counterfactual reasoning!
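A sketch of the reweighting idea for the table above, assuming the percentages describe the mix of patient types within each arm and that the target mix is 50/50 (both are assumptions, as is the `weighted_composition` helper).

```python
# Share of each patient type within each arm, read off the table.
composition = {
    1: {"type_1": 0.8, "type_2": 0.2},  # T = 1
    0: {"type_1": 0.2, "type_2": 0.8},  # T = 0
}
# Assumed target: the pooled population is half Type 1 and half Type 2.
target = {"type_1": 0.5, "type_2": 0.5}

# Weight = target share / observed share, so rare cells get large weights.
weights = {
    arm: {kind: target[kind] / share for kind, share in shares.items()}
    for arm, shares in composition.items()
}

def weighted_composition(arm):
    """Type mix of one arm after applying the weights."""
    raw = {k: composition[arm][k] * weights[arm][k] for k in composition[arm]}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}

print(weights[0]["type_1"], weighted_composition(0), weighted_composition(1))
```

After weighting, both arms have the same type mix, so type no longer predicts which arm you are in.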
It’s not as simple as you think and there are several methods. E.g. Weighting by Odds, Kernel Weighting, etc.
Inverse Prob of Treatment (IPTW) / Inverse Propensity Weighting
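The probability of being assigned to the treatment given covariates $X$ is the propensity score, $e(X) = P(T = 1 \mid X)$. Each unit is weighted by the inverse of the probability of the treatment it actually received:

$$w = \frac{T}{e(X)} + \frac{1 - T}{1 - e(X)}$$

Stabilized weights

$$w_{\text{stab}} = \frac{T \, P(T = 1)}{e(X)} + \frac{(1 - T) \, P(T = 0)}{1 - e(X)}$$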
There’s another called AIPW (Augmented Inverse Propensity Weighting)…
NOTE The formulas above are for two but there may be more than two groups!
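A minimal two-group IPTW sketch on a hypothetical cohort (true effect assumed to be +1.0, one binary covariate). Here the propensity score is estimated from raw frequencies within each stratum; in practice it would usually come from a fitted model such as logistic regression. All names and numbers are invented.

```python
TRUE_EFFECT = 1.0
counts = {("sick", 1): 400, ("sick", 0): 100, ("healthy", 1): 100, ("healthy", 0): 400}
baseline = {"sick": 2.0, "healthy": 6.0}

# One row per patient: (covariate x, treatment t, outcome y).
rows = [
    (x, t, baseline[x] + TRUE_EFFECT * t)
    for (x, t), n in counts.items()
    for _ in range(n)
]

def propensity(x):
    """Estimated P(T = 1 | X = x) from frequencies within the stratum."""
    in_stratum = [t for xi, t, _ in rows if xi == x]
    return sum(in_stratum) / len(in_stratum)

e = {x: propensity(x) for x in baseline}
p_treated = sum(t for _, t, _ in rows) / len(rows)  # marginal P(T = 1)

def weight(x, t, stabilized=False):
    """Inverse probability of the treatment actually received."""
    w = 1 / e[x] if t == 1 else 1 / (1 - e[x])
    if stabilized:
        w *= p_treated if t == 1 else 1 - p_treated
    return w

def iptw_ate(stabilized=False):
    """Difference of weighted outcome means between the two arms."""
    def arm_mean(arm):
        pairs = [(weight(x, t, stabilized), y) for x, t, y in rows if t == arm]
        return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)
    return arm_mean(1) - arm_mean(0)

print(e, iptw_ate(), iptw_ate(stabilized=True))
```

Both versions give the same point estimate here because the weighted means are normalized within each arm; stabilization mainly shrinks extreme weights.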
Adjustment
It can mean a lot of things, but we’re talking about statistical adjustment. Use with matching methods!
Support → Overlapping covariates in treatment and control arms. TODO:
You can misspecify… what if the relationship (exposure, outcome, covariates) is non-linear and you’re using linear models?
Sample size is another problem that limits DOF for covariate adjustment.
Stratification
Problem here is you’ll get high variance with small sample size.
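A sketch of a stratified estimate on hypothetical summary data, plus a rough standard error for one stratum to show the variance problem. It assumes equal outcome spread `sd` in both arms; the helper names and numbers are invented.

```python
import math

# Hypothetical summary data: (n_patients, mean_outcome) per stratum and arm.
strata = {
    "sick": {"treated": (400, 3.0), "control": (100, 2.0)},
    "healthy": {"treated": (100, 7.0), "control": (400, 6.0)},
}

def stratified_ate(strata):
    """Size-weighted average of the within-stratum treated-minus-control gaps."""
    total = sum(n for arms in strata.values() for n, _ in arms.values())
    estimate = 0.0
    for arms in strata.values():
        size = arms["treated"][0] + arms["control"][0]
        effect = arms["treated"][1] - arms["control"][1]
        estimate += (size / total) * effect
    return estimate

def stratum_se(sd, n_treated, n_control):
    """Approximate standard error of a difference in means within one stratum."""
    return sd * math.sqrt(1 / n_treated + 1 / n_control)

estimate = stratified_ate(strata)
se_large = stratum_se(1.0, 400, 100)
se_small = stratum_se(1.0, 4, 1)
print(estimate, se_large, se_small)
```

With the same case mix but 100 times fewer patients per cell, the stratum-level standard error is 10 times larger, and adding more stratifying variables shrinks cells quickly.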
Multivariate Modeling Methods
AKA Response Surface Modeling. You explicitly model the relationship between treatment, covariates, and outcome. Logistic/Linear Regression. You can use with more complex ML methods.
Think about this and how you add .
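A small sketch of regression adjustment with a linear outcome model, using only the standard library. The toy data are generated to be exactly linear, which is an assumption that flatters the method; if the true relationship were non-linear this model would be misspecified. The coefficient on treatment is obtained by residualizing on the covariate (Frisch-Waugh-Lovell) instead of solving the full regression. All names and numbers are invented.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Part of y not explained by a straight line in x."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def adjusted_effect(x, t, y):
    """Coefficient on t in y ~ 1 + t + x, via Frisch-Waugh-Lovell."""
    slope, _ = fit_line(residuals(x, t), residuals(x, y))
    return slope

# Hypothetical data: treatment is more common at high x, true effect is 1.0.
x = [i / 10 for i in range(100)]
t = [int((i % 5 == 0) != (i >= 50)) for i in range(100)]
y = [2.0 + 0.5 * xi + 1.0 * ti for xi, ti in zip(x, t)]

naive = mean([yi for yi, ti in zip(y, t) if ti]) - mean(
    [yi for yi, ti in zip(y, t) if not ti]
)
adjusted = adjusted_effect(x, t, y)
print(naive, adjusted)
```

The naive gap mixes the treatment effect with the covariate imbalance, while the adjusted coefficient recovers the assumed effect because the model matches how the data were generated.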