
Types of Studies

Not meant to be exhaustive.

From various class notes. There will be a lot of stuff missing here. I’ll finish this one day lol.


Broad Classes​

Quantitative and Qualitative​

This is about what data is collected.

Quantitative focuses on the measurable (statistics involved, of course).

Qualitative is about observation (description, explanation) that produces artifacts (e.g. interviews, photos, sketches of spaces, notes), and it may use some quantitative data. It can be exploratory research (see and refine questions, understand opportunities). It can also be used for evaluation (why are informatics interventions used or not used?).

Prospective and Retrospective​

This is about the temporality of the data that is collected: look forward or backward in time. Retrospective is cheaper because things have already happened. Prospective gives you more control over what gets measured and how.

Observational and Experimental​

Did you change anything in the world, or just observe it? If you only observed, it's an Observational Study. If you assigned an exposure, it's Experimental.

Measurement and Demonstration​

A Measurement Study is where you evaluate the properties of a measurement instrument or method itself: the instrument is the object of investigation.

You typically use these instruments later in Demonstration Studies, where you want to show that an intervention, system, or technology works. The instrument or system is now the tool, and you’re investigating its impact.

Experimental and Quasi-Experimental Studies​

The key distinction is randomization. Experimental studies randomly allocate subjects to exposure groups, which (with sufficient N) balances confounders across groups on average. Quasi-experimental studies assign exposures without randomization, which is common when randomization is infeasible or unethical.


Observational​

No exposure assigned (i.e. no influence on subjects). Descriptive designs describe what is there. Analytical designs give associations through comparisons. Cause-effect claims are possible but need stronger assumptions than in a randomized design. Less common in interventions research unless you’re studying existing technologies/interventions.

Cross-Sectional Studies​

Slice in time. There is no temporality in cross-sectional studies: exposure and outcome are measured at the same moment. Because of this, there cannot be causal assertions. You can only speak of associations.

You can measure prevalence. You cannot measure incidence (incidence is about new cases, which implies temporality). Think of how timelines (e)------(o) get smushed into a spreadsheet: you just get (e)s and (o)s, and the lines vanish.

                    ┌─── Single Time Point ───┐
                    │                         │
                    ▼                         ▼

Population ──► Sample ──► Measure simultaneously:
                          ├── Exposure status (E+ / E−)
                          └── Outcome status  (O+ / O−)

Resulting 2×2:
             O+       O−
         ┌────────┬────────┐
     E+  │   a    │   b    │
         ├────────┼────────┤
     E−  │   c    │   d    │
         └────────┴────────┘

├──────────────── time = t₀ ────────────────┤
                 (no follow-up)

Cohort Studies​

Select on exposure, follow for outcome. The key move: enforce temporality by design. These have the most information about time of any observational design.

                               ┌──► Outcome+
                 ┌─ Exposed ───┤
                 │             └──► Outcome−
Population ──► Cohort
                 │             ┌──► Outcome+
                 └─ Unexposed ─┤
                               └──► Outcome−

t₀ ─────────────────────── follow-up ───────────────────► t₁
(define exposure)                          (ascertain outcome)

Risk in exposed   = a / (a+b)
Risk in unexposed = c / (c+d)
Relative Risk     = [a/(a+b)] / [c/(c+d)]
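
The three formulas above are easy to check by hand. Here is a minimal sketch in Python; the function name `cohort_measures` and the counts are made up for illustration and are not from any real study.

```python
def cohort_measures(a, b, c, d):
    """Risks and relative risk from a cohort 2x2 table.

    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed
    return risk_exposed, risk_unexposed, relative_risk


# Hypothetical counts: 100 exposed and 100 unexposed people followed up.
r1, r0, rr = cohort_measures(a=30, b=70, c=10, d=90)
print(f"risk exposed = {r1:.2f}, risk unexposed = {r0:.2f}, RR = {rr:.1f}")
```

This only works because a cohort fixes the row totals (a+b and c+d) by design, so the risks are meaningful.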

Prospective: Study begins at the point of exposure, before outcomes have occurred. You have a lot of control over what gets measured. You cannot always assume causality. Kinda costly for rare outcomes (gotta recruit a lot of people!). Loss-to-follow-up is a big problem.

Retrospective: Study begins after the outcomes have already occurred, working from existing records. No control (lol). Inexpensive (already happened, someone else paid the bill). But you’re limited in terms of sampling and data quality (already happened). And you cannot always assume causality. More prone to selection, information, and recall bias.

Ambidirectional (“Callback Studies”): Rare. “Weld” two timelines together: retrospective data combined with prospective follow-up. Data inconsistency is a headache. Inherits the bummers of both designs.

Multiple/Double Cohort: You have several cohorts with different levels of the same exposure. Lots of potential confounding risk here.

Info: Cohort studies can start with cross-sectional studies where you can establish prevalence.

Case-Control Studies​

Select on outcome, look back at exposure. The logical inversion of a cohort study. Always retrospective.

                  ┌── Exposed (a)   ◄──┐
Cases (O+) ───────┤                    │
                  └── Unexposed (c) ◄──┤
                                       │  recall /
                                       │  records
                  ┌── Exposed (b)   ◄──┤
Controls (O−) ────┤                    │
                  └── Unexposed (d) ◄──┘

◄──────── look backward in time ──────── t₀ (sampling)

Odds Ratio = (a·d) / (b·c)

You can study just one outcome. The big challenge is bias: you recruit your cases separately from your controls (matching is one way to get over this problem). Doesn’t have the temporal ordering that cohort studies do.

Controls approximate what the cases would have looked like had they not developed the disease; it’s a bit counterfactual-y. Cases and controls must come from the same source population (don’t break links). This is why matching matters: cases are straightforward but controls are hard, and you don’t want confounding by covariates or time.

When to use: Rare diseases, long latency periods. Think glioblastoma: a cohort study would require following participants for decades.

Metrics: You cannot compute Risk Ratios! You’re fixing the outcome distribution (setting the case/control ratio), so you don’t know the true $a + b$ and $c + d$. You can compute the Odds Ratio. For rare diseases, $RR \approx OR$.
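
To see why the rare-disease approximation holds, here is a small sketch that computes both measures on two made-up full-population tables, one with a rare outcome and one with a common outcome. All counts and names are hypothetical.

```python
def risk_ratio(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)]; only valid when row totals are real."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (a*d) / (b*c)."""
    return (a * d) / (b * c)


# Hypothetical tables (a, b = exposed with/without outcome; c, d = unexposed).
rare = dict(a=20, b=9980, c=10, d=9990)    # outcome in roughly 0.1-0.2% of people
common = dict(a=400, b=600, c=200, d=800)  # outcome in 20-40% of people

for label, table in [("rare", rare), ("common", common)]:
    print(label, "RR =", round(risk_ratio(**table), 3),
          "OR =", round(odds_ratio(**table), 3))
```

When the outcome is rare, b is close to a+b and d is close to c+d, so the odds and the risks nearly coincide. With a common outcome, the OR drifts away from the RR.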

Nested Case-Control​

Say you follow 50,000 people over 20 years to see if an exposure caused a disease. Testing all of them would be expensive. What if you only tested the people who got sick, plus a small sample of those who didn’t?

   THE COHORT (already exists, already followed)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
                  50,000 people
                        │
                        ▼
             follow them for 20 years
                        │
                        ▼
┌──────────────────────────────────────────────────┐
│  500 people develop the disease = CASES          │
│  49,500 people don't            = potential      │
│                                   controls       │
└──────────────────────────────────────────────────┘
                        │
                        ▼
    For each case, randomly pick a few controls
    from people who were still disease-free at
    the moment that case got sick.
                        │
                        ▼
┌──────────────────────────────────────────────────┐
│  500 cases + ~2,000 controls (4 per case)        │
│                                                  │
│  NOW measure exposure on these 2,500 people.     │
│  Skip the other 47,500.                          │
└──────────────────────────────────────────────────┘

Sampling Strategies​

The hardest part. The control group should be a snapshot of the baseline population.

| Strategy | When Sampled | Notes |
|---|---|---|
| Case-Base / Case-Cohort | At start | Controls = random sample from source population at beginning. Use for very rare outcomes. Controls may develop the outcome. |
| Cumulative Density / Survivor | At end | Controls = people who survived without outcome. Problem: survivor bias. |
| Incidence Density / Risk Set | Each time a case occurs | For every incident case, randomly sample from risk set at that time. Preserves underlying risk process. Good for short-term or fluctuating exposures. |
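
The incidence density / risk set row can be sketched in a few lines. This is a toy illustration under simplifying assumptions (no matching on covariates, sampling without replacement within a risk set). The function name and the input layout are hypothetical, not a standard API.

```python
import random


def incidence_density_sample(event_time, end_of_followup, k, seed=0):
    """For each case, draw up to k controls from its risk set.

    event_time:      dict person_id -> time of disease onset (cases only)
    end_of_followup: dict person_id -> time the person left follow-up
                     (onset, dropout, or study end), for everyone
    """
    rng = random.Random(seed)
    matched = {}
    for case_id, t in sorted(event_time.items(), key=lambda kv: kv[1]):
        # Risk set: everyone else still followed and disease-free at time t.
        risk_set = [p for p, end in end_of_followup.items()
                    if p != case_id and end > t]
        matched[case_id] = rng.sample(risk_set, min(k, len(risk_set)))
    return matched


# Tiny hypothetical cohort: persons 3 and 1 become cases at t=3 and t=5.
end_of_followup = {1: 5, 2: 10, 3: 3, 4: 10, 5: 10}
event_time = {3: 3, 1: 5}
print(incidence_density_sample(event_time, end_of_followup, k=2))
```

Note that person 1 is eligible as a control for the case at t=3 and becomes a case later at t=5. That is the sense in which risk set sampling preserves the underlying risk process.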

Crossover Studies​

Cases serve as their own controls (the case-crossover idea). Reduces between-person variability and confounding by characteristics that stay fixed within a person. Think seasonal allergies: a within-person comparison across time periods.


Experimental​

Exposure is assigned by the researcher. Randomization is the dividing line between experimental and quasi-experimental.

Randomized Controlled Trial (RCT)​

The gold standard. You assign exposure randomly and follow for outcome. Establishes temporality and (with sufficient N) balances confounders across arms.

Key features:

  • Stringent inclusion criteria
  • Typically one primary predictor and one primary outcome: keep it simple
  • Randomization balances confounders only on average, so it needs a large N. Small samples can still end up imbalanced by chance.

                  ╔═══════════════╗
                  ║ RANDOMIZATION ║
                  ╚═══════╤═══════╝
                          │
        ┌─────────────────┴─────────────────┐
        ▼                                   ▼
  ┌────────────┐                      ┌────────────┐
  │ Treatment  │                      │  Control   │
  │  (n=N/2)   │                      │  (n=N/2)   │
  └─────┬──────┘                      └─────┬──────┘
        │                                   │
        ▼                                   ▼
  ┌────────────┐                      ┌────────────┐
  │  Outcome   │  ◄──── compare ────► │  Outcome   │
  └────────────┘                      └────────────┘

t₀ ─── allocation ─── follow-up ─── outcome assessment ── t₁

Factorial Design​

Assess the impact of multiple interventions (all combinations) within a single trial instead of running a separate trial for each. More efficient for multi-intervention testing.

                      Factor B: Behavioral coaching
                         B−                B+
                 ┌────────────────┬────────────────┐
Factor A:   A−   │    Group 1     │    Group 2     │
Drug             │   Placebo +    │   Placebo +    │
                 │  No coaching   │    Coaching    │
                 ├────────────────┼────────────────┤
            A+   │    Group 3     │    Group 4     │
                 │     Drug +     │     Drug +     │
                 │  No coaching   │    Coaching    │
                 └────────────────┴────────────────┘

Main effect of A = (G3+G4)/2 − (G1+G2)/2
Main effect of B = (G2+G4)/2 − (G1+G3)/2
A×B interaction  = (G4−G3) − (G2−G1)
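
A quick numeric check of the three contrasts above. This is a sketch where G1..G4 are hypothetical group mean outcomes, and the function name is made up for illustration.

```python
def factorial_effects(g1, g2, g3, g4):
    """Main effects and interaction for a 2x2 factorial design.

    g1 = A-/B-, g2 = A-/B+, g3 = A+/B-, g4 = A+/B+ (group mean outcomes).
    """
    main_a = (g3 + g4) / 2 - (g1 + g2) / 2
    main_b = (g2 + g4) / 2 - (g1 + g3) / 2
    # Effect of B when A is present minus effect of B when A is absent.
    interaction = (g4 - g3) - (g2 - g1)
    return main_a, main_b, interaction


# Hypothetical group means.
print(factorial_effects(g1=10, g2=12, g3=15, g4=20))  # (6.5, 3.5, 3)
```

A nonzero interaction here means coaching adds more on top of the drug (5 points) than it does on top of placebo (2 points).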

Crossover Design​

Each subject receives both treatment and control in sequence (order randomized). Subjects serve as their own controls. Reduces inter-subject variability. Washout period between treatments needed.

               ┌── RANDOMIZE SEQUENCE ──┐
               ▼                        ▼
Sequence 1:    A ──────► [washout] ──────► B
               │                           │
               └── outcomes Y₁ ────────────┴── outcomes Y₂

Sequence 2:    B ──────► [washout] ──────► A
               │                           │
               └── outcomes Y₁ ────────────┴── outcomes Y₂

   Period 1        Period 2        Period 3
├──── Tx ─────├── washout ───├──── Tx ─────┤
             (clears carryover)

Within-subject contrast: Y_A − Y_B for each participant

Cluster Randomized Trial​

Randomize at the level of groups (clinics, schools, communities) rather than individuals. Use when individual randomization is infeasible or risks contamination.

Population of clusters:  ◯ ◯ ◯ ◯ ◯ ◯ ◯ ◯
                                │
                                ▼
                      ╔════════════════════╗
                      ║ RANDOMIZE CLUSTERS ║
                      ╚═════════╤══════════╝
                                │
                 ┌──────────────┴──────────────┐
                 ▼                             ▼
        Treatment clusters             Control clusters
         ┌───────────────┐             ┌───────────────┐
         │ ◯ Clinic A    │             │ ◯ Clinic C    │
         │  ├ pt pt pt   │             │  ├ pt pt pt   │
         │ ◯ Clinic B    │             │ ◯ Clinic D    │
         │  ├ pt pt pt   │             │  ├ pt pt pt   │
         └───────────────┘             └───────────────┘
                 │                             │
                 ▼                             ▼
         Outcomes in pts  ◄── compare ──►  Outcomes in pts

Note: individuals within a cluster are correlated → ICC
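
One way to see what the ICC costs you is the standard design effect, DEFF = 1 + (m − 1) × ICC, where m is the cluster size. The sketch below assumes equal-sized clusters; the function names and numbers are made up for illustration.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical trial: 20 clinics x 20 patients, ICC = 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
n_eff = effective_sample_size(n_total=400, cluster_size=20, icc=0.05)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.0f}")
```

Even a small ICC matters when clusters are large: here 400 patients carry roughly the information of about 205 independent ones.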

Microrandomized Trials (MRT)​

Used a lot in behavioural change interventions. The idea is to not split participants into intervention and control groups. Instead, each participant is randomized every time there is an opportunity for an intervention.

Smaller sample sizes (think half) than RCTs. Learning effects mess this up, perhaps. However, in clinical decision support (CDS), you either receive the information or you don’t, so this effect is not as profound there.

  Participant i:

    t₁    t₂    t₃    t₄    t₅    t₆    t₇   ...
    │     │     │     │     │     │     │
    ▼     ▼     ▼     ▼     ▼     ▼     ▼
   [R]   [R]   [R]   [R]   [R]   [R]   [R]   ← randomize at each
    │     │     │     │     │     │     │      decision point
    A     Ø     A     A     Ø     A     Ø    ← prompt / no prompt
    │     │     │     │     │     │     │
    Y₁    Y₂    Y₃    Y₄    Y₅    Y₆    Y₇   ← proximal outcome

  Estimand: causal excursion effect
    = E[Yₜ | Aₜ=1, Hₜ] − E[Yₜ | Aₜ=0, Hₜ]
  averaged over decision points and participants.
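
A toy simulation of the picture above, as a sketch only. It assumes a constant treatment effect, a fixed randomization probability, and no carryover, and it uses a plain pooled difference in means rather than an estimator that conditions on the history Hₜ. All names and parameter values are hypothetical.

```python
import random


def simulate_mrt(n_participants=50, n_points=100, true_effect=0.5,
                 p_treat=0.5, seed=1):
    """Randomize at every decision point; return pooled mean difference."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n_participants):
        baseline = rng.gauss(0, 1)  # person-specific level
        for _ in range(n_points):
            a = 1 if rng.random() < p_treat else 0  # [R] at this decision point
            y = baseline + true_effect * a + rng.gauss(0, 1)  # proximal outcome
            (treated if a else untreated).append(y)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


print(round(simulate_mrt(), 2))  # should land near the true effect of 0.5
```

Because every participant contributes both treated and untreated decision points, the person-specific baseline mostly cancels out of the comparison.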

Sequential, Multiple Assignment, Randomized Trial (SMART)​

Adaptive treatment design: multiple decision points with re-randomization based on response. Useful for building adaptive treatment strategies.

Here’s a more colorful picture.

                ╔═══════════════╗
                ║ RANDOMIZE R₁  ║
                ╚═══════╤═══════╝
                        │
            ┌───────────┴───────────┐
            ▼                       ▼
       Treatment A             Treatment B
            │                       │
    evaluate response       evaluate response
        at week 8               at week 8
            │                       │
       ┌────┴────┐             ┌────┴────┐
       ▼         ▼             ▼         ▼
   Responder  Non-resp.    Responder  Non-resp.
       │         │             │         │
   Continue  ╔══════╗      Continue  ╔══════╗
      A      ║  R₂  ║         B      ║  R₂  ║
             ╚══╤═══╝                ╚══╤═══╝
                │                       │
             ┌──┴──┐                 ┌──┴──┐
             ▼     ▼                 ▼     ▼
           Aug.  Switch            Aug.  Switch
            A     to B              B     to A

Builds dynamic treatment regimes:
  "Start A; if non-responder by wk 8, switch to B"

Quasi-Experimental​

Exposure is assigned but not randomly. Common when randomization is infeasible or unethical. Ranked roughly from weakest to strongest:

Pre-Post Design​

Weakest. Single measurement before and after intervention. Major threat: regression to the mean.

┌────────────┐         ┌────────────┐
│  Pretest   │         │  Posttest  │
│    O₁      │   ──►   │    O₂      │
└────────────┘         └────────────┘
      │                      ▲
      │    Intervention X    │
      └──────────►───────────┘

Estimate of effect = O₂ − O₁

⚠ Threats: history, maturation, regression to mean,
           testing effects (no counterfactual)
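
Regression to the mean is easy to underestimate, so here is a small simulation sketch. It assumes each observed score is a stable true score plus independent measurement noise, selects the top scorers on the pretest, and applies no intervention at all. The function name and all parameters are hypothetical.

```python
import random


def regression_to_mean_demo(n=10000, top_fraction=0.1, seed=0):
    """Mean pretest and posttest among people selected for extreme pretests."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pre = [t + rng.gauss(0, 10) for t in true_scores]
    post = [t + rng.gauss(0, 10) for t in true_scores]  # no intervention
    # Keep only people whose pretest falls in the top fraction.
    cutoff = sorted(pre, reverse=True)[int(n * top_fraction) - 1]
    chosen = [i for i in range(n) if pre[i] >= cutoff]
    mean_pre = sum(pre[i] for i in chosen) / len(chosen)
    mean_post = sum(post[i] for i in chosen) / len(chosen)
    return mean_pre, mean_post


mean_pre, mean_post = regression_to_mean_demo()
print(f"selected group: pretest {mean_pre:.1f} -> posttest {mean_post:.1f}")
```

The selected group's posttest mean drops toward the population average even though nothing was done to them, which is exactly the change a naive O₂ − O₁ would credit to X.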

Posttest Only​

Measure only after intervention. No baseline, so it is hard to attribute change to the intervention.

  Group ──► Intervention X ──► Posttest O₁

  t₀ ─────────── X ─────────────────────────► t₁

  ⚠ No baseline, no comparison group.
    Cannot establish change or causation; descriptive only.

Posttest Only with Controls​

Add a concurrent control group but still no pre-test. Better than posttest only but you can’t verify group equivalence at baseline.

  Group A (self-selected) ──► X ──► O₁
                                    │
                                    │ compare
                                    ▼
  Group B (self-selected) ────────► O₂

  ⚠ Selection bias is the dominant threat: groups may have
    differed before X. No pretest to verify equivalence.

Pre-Post with Concurrent Controls​

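Very common. Also called a controlled before-after design; with many repeated measurements before and after X it becomes a controlled Interrupted Time Series. Sample → Measure both groups → Treat one group → Measure both groups again.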

  Treatment grp:   O₁ ──── X ────► O₂
                   │               │
                   │               │   Δ_T = O₂ − O₁
                   │               │
  Control grp:     O₃ ───────────► O₄
                                   │
                                   │   Δ_C = O₄ − O₃

  Difference-in-differences:
    DiD = Δ_T − Δ_C
        = (O₂−O₁) − (O₄−O₃)

  t₀ ──── pretest ──── X ──── posttest ──► t₁

  Controls for: secular trends, history, maturation
  Does NOT control for: differential selection on trajectory

Innate (time-invariant) characteristics are eliminated as confounders, because each group is compared with its own baseline. But regression to the mean and learning effects remain. Not appropriate for drug efficacy.
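
The difference-in-differences arithmetic for this design, as a minimal sketch. The O values are hypothetical group means and the function name is made up for illustration.

```python
def diff_in_diff(o1, o2, o3, o4):
    """DiD = (O2 - O1) - (O4 - O3).

    o1, o2 = treatment group before / after X
    o3, o4 = control group before / after (no X)
    """
    delta_t = o2 - o1  # change in the treatment group
    delta_c = o4 - o3  # change in the control group (background trend)
    return delta_t - delta_c


# Hypothetical means: treatment goes 50 -> 62, control goes 48 -> 53.
print(diff_in_diff(o1=50, o2=62, o3=48, o4=53))  # 7
```

The control group's change of 5 stands in for what would have happened to the treatment group without X, so only 7 of the 12-point improvement is attributed to the intervention. This rests on the assumption that both groups would otherwise have followed the same trend.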

Removed Treatment Design​

Single group acts as its own control. Over equal time periods, you add and remove the intervention. Stronger than pre-post because the pattern of improvement/reversal supports causal inference.

  Phase:      Baseline      Treatment      Removal
                 │              │             │
  Time:    t₀ ────► t₁ ────────► t₂ ────────► t₃

  Group:   O₁ ─────► O₂ ──X──► O₃ ──×──► O₄
                     ▲         ▲         ▲
                     │         │         │
                  pre-Tx   during Tx  after Tx
                                      withdrawn

  Expected pattern if X works: O₁ and O₂ sit at a stable baseline,
  O₃ shifts during treatment, and O₄ moves back toward baseline
  after removal.

  Effect = (O₃ − O₂) and reversal at (O₄ − O₃)

Double Pretest Pre-Post Design​

Measure → wait → Measure again → Introduce intervention → Measure. Measurements must be equally spaced. The double pretest lets you detect pre-existing trends (“something funky going on”) before attributing change to the intervention.

            ┌──── pretests ────┐
Group:      O₁ ────────► O₂ ──── X ────► O₃
            │            │               │
            │  baseline  │  intervention │
            │   trend    │    effect     │
            └────────────┘               │
                         └───────────────┘

t₋₁ ─────── t₀ ─────────── X ─────────► t₁

Uses (O₂ − O₁) to estimate the underlying trend,
then asks: does (O₃ − O₂) exceed that trend?

Helps rule out: maturation, regression to the mean,
                pre-existing time trends
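
The trend comparison above is a two-line calculation. This sketch relies on the equal spacing of measurements that the design requires; the function name and numbers are hypothetical.

```python
def double_pretest_excess(o1, o2, o3):
    """Compare the post-intervention change with the pre-existing trend.

    Assumes O1, O2, O3 are equally spaced in time.
    """
    trend = o2 - o1        # change with no intervention
    change = o3 - o2       # change across the intervention
    return trend, change, change - trend


# Hypothetical means: 40, 44, then 55 after the intervention.
trend, change, excess = double_pretest_excess(o1=40, o2=44, o3=55)
print(f"trend = {trend}, change = {change}, excess over trend = {excess}")
```

If the excess is close to zero, the apparent effect is probably just the pre-existing trend continuing.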

Bias​

Think of the contingency table. Bias is a systematic/structural distortion of the observed relationship between exposure and outcome.

|    | O+ | O− |
|----|----|----|
| E+ | a  | b  |
| E− | c  | d  |

Two things can go wrong: (a) who gets into the table (selection bias) and (b) how/where they are placed in the table (information bias).

| Bias | What it is | Most prone design |
|---|---|---|
| Selection | Who gets into the study. Loss-to-follow-up is a subtype. | Retrospective; prospective (attrition) |
| Information | How subjects are classified in cells. | Retrospective |
| Recall | Differential accuracy of remembered exposures. | Retrospective, case-control |
| Survivor | Only survivors available for sampling. | Cumulative/survivor sampling in case-control |
| Regression to the mean | Extreme measurements tend toward average on re-test. | Pre-post designs |

Misclassification​

When subjects move between cells (e.g. $a \to b$, $c \to d$, or vertically).

  • Non-differential: Misclassification rate is the same across groups. Usually biases toward the null (slightly better: you underestimate the effect).
  • Differential: Misclassification rate differs between groups. Can distort in any direction.
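
Both bullets can be checked with expected counts. The sketch below misclassifies exposure (vertical movement in the table) using a sensitivity and specificity that are either the same for cases and non-cases (non-differential) or different (differential, e.g. cases recalling exposure better). The table, the error rates, and the function names are all made up for illustration.

```python
def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


def misclassify_exposure(a, b, c, d, se_cases, sp_cases, se_noncases, sp_noncases):
    """Expected observed 2x2 table when exposure is measured with error.

    se = P(classified exposed | truly exposed)
    sp = P(classified unexposed | truly unexposed)
    Cases are the O+ column (a, c); non-cases are the O- column (b, d).
    """
    a_obs = se_cases * a + (1 - sp_cases) * c
    c_obs = (1 - se_cases) * a + sp_cases * c
    b_obs = se_noncases * b + (1 - sp_noncases) * d
    d_obs = (1 - se_noncases) * b + sp_noncases * d
    return a_obs, b_obs, c_obs, d_obs


true_table = (200, 800, 100, 900)  # hypothetical a, b, c, d
nondiff = misclassify_exposure(*true_table, 0.8, 0.9, 0.8, 0.9)
diff = misclassify_exposure(*true_table, 0.95, 0.9, 0.8, 0.9)

print("true OR             =", round(odds_ratio(*true_table), 2))
print("non-differential OR =", round(odds_ratio(*nondiff), 2))
print("differential OR     =", round(odds_ratio(*diff), 2))
```

In this example the non-differential error pulls the OR toward 1, while better recall among cases pushes it above the true value. Other differential patterns could push it the other way.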

Key Measures​

| Measure | Computable in… | Notes |
|---|---|---|
| Prevalence | Cross-sectional, case-control | Snapshot of who has what. In case-control, only exposure prevalence is meaningful, since the case/control ratio is fixed by design. |
| Incidence | Cohort, RCT | Requires temporality |
| Risk Ratio (RR) | Cohort, RCT | $\frac{a/(a+b)}{c/(c+d)}$ |
| Odds Ratio (OR) | Case-control (primary), all others | $\frac{a \cdot d}{b \cdot c}$; for rare diseases $RR \approx OR$ |
| Incidence Rate | Cohort | Cases / person-time |

Validity​

  • Internal Validity: How sound your design and statistical treatment are. Is the effect you observed really due to the exposure, rather than bias or confounding?
  • External Validity: Generalizability. Do findings apply beyond your study population?

Both are distinct from test validity, which is about instrumentation/measurement tools (did the instrument measure what it set out to measure?).


Simpler Diagrams​

Using the amazing MonoDraw.

OBSERVATIONAL
-------------

Cross-Sectional
  Population ──► Measure exposure + outcome once
                 (single time point)

Cohort
  Population ──► Exposed ───────► Follow over time ───────► Outcome?
             └─► Unexposed ─────► Follow over time ───────► Outcome?

Case-Control
  Cases    = outcome present ─────┐
                                  ├──► Look backward for exposure
  Controls = outcome absent  ─────┘

EXPERIMENTAL
------------

Randomized Controlled Trial (RCT)
  Eligible participants ──► Randomize
                            ├──► Intervention ──► Outcome
                            └──► Control ───────► Outcome

Factorial Design
  Eligible participants ──► Randomize
                            ├──► A only ────────► Outcome
                            ├──► B only ────────► Outcome
                            ├──► A + B ─────────► Outcome
                            └──► Neither ───────► Outcome

Crossover Design
  Participants ──► Randomize
                   ├──► Treatment A ─► Washout ─► Treatment B ─► Outcome
                   └──► Treatment B ─► Washout ─► Treatment A ─► Outcome

Cluster Randomized Trial
  Groups/clusters ──► Randomize
                      ├──► Cluster intervention ─► Outcome
                      └──► Cluster control ──────► Outcome

Microrandomized Trial (MRT)
  Participant timeline:
    T1 ── R ──► Intervention? ──► Response
    T2 ── R ──► Intervention? ──► Response
    T3 ── R ──► Intervention? ──► Response
    T4 ── R ──► Intervention? ──► Response

SMART
  Participants ──► Randomize
                   ├──► Treatment A ─► Respond? ─┬──► Continue A
                   │                             └──► Switch/intensify
                   └──► Treatment B ─► Respond? ─┬──► Continue B
                                                 └──► Switch/intensify

QUASI-EXPERIMENTAL
------------------

Pre-Post Design
  O1 ──► X ──► O2

Posttest Only
  X ──► O

Posttest Only with Controls
  X ────► O1
  No X ─► O2

Pre-Post with Concurrent Controls
  O1 ──► X ────► O2
  O1 ──► No X ──► O2

Removed Treatment Design
  O1 ──► X on ──► O2 ──► X removed ──► O3

Double Pretest Pre-Post Design
  O0 ──► O1 ──► X ──► O2