Hypothesis Testing
We really want to know if our interventions actually caused things in the world. Associations/correlations are nice but establishing causality is nicer.
With hypotheses, you are examining this association (and/or causality). You’re also making statements about the population (i.e. your study is not about the 26 people enrolled in it). A good research question is FINER:
- Feasible
- Interesting
- Novel
- Ethical
- Relevant
Now, if you suspect some association or causality but there’s no evidence out there (or it’s all over the place), you can perform a hypothesis-generating study.
“Main outcome” is an emphasis in RCTs (the ‘gold standard’). Keep it simple: focus on one primary outcome. You typically have one predictor and one outcome, though you certainly can have multiple predictors and outcomes.
In defining a good hypothesis, you are engaged in the operationalization of all theoretical constructs into specific things (variables) you’ll be studying.
A good hypothesis precedes the experiment.
Aside on the crisis in statistics. TODO: Read Andrew Gelman’s blog post.
Falsifiability
A good hypothesis is falsifiable. “Science advances by rejecting inadequate theories” - Kuhn. 👉 This is why you start with the null hypothesis, which is what you want to disprove: there is no relationship. The alternative hypothesis is the actual hypothesized association. A two-sided alternative says there is an association but leaves the direction unspecified (your intervention made something better or worse); it’s more conservative and preferred. A one-sided alternative says there is an association in a specific direction. If you do statistical hypothesis testing, you are trying to disprove the null hypothesis, the fantastic scientist that you are.
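This is where the p-value comes in. It is the probability of seeing a result at least as extreme as the one you observed if the null hypothesis were true. If there is a very low probability (conventionally below 0.05), you reject the null.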
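To make the p-value concrete, here is a minimal sketch of an exact permutation test for a difference in means. It is only one of many ways to get a p-value, and the function name `permutation_p_value` and the numbers are made up for illustration. The idea: if the null is true, the group labels are arbitrary, so you can count how many relabelings give a difference at least as extreme as the one you saw.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in means.

    Hypothetical helper: enumerates every way of splitting the pooled
    data into groups of the original sizes (only feasible for small n).
    """
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)

    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        # Small tolerance so floating-point noise does not hide ties.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Made-up outcome values for two small groups.
treated = [12.1, 13.4, 11.8, 14.0, 13.1]
control = [10.2, 11.0, 10.8, 11.5, 10.9]
print(permutation_p_value(treated, control))
```

A small value here says "a split this lopsided would be rare if the labels did not matter", which is exactly the low-probability situation that leads you to reject the null.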
Validity
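- Internal Validity refers to how sound your study design and statistical treatment are, i.e. whether the observed effect is really due to the intervention rather than bias or confounding.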
- External Validity refers to generalizability.
Effect Size
TODO: What do you do here? Sure there’s an association but how ‘big’ is it?
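One common (but not the only) way to put a number on "how big" is Cohen's d: the difference in group means divided by the pooled standard deviation. The sketch below is an illustration under that assumption, not something these notes settle on; `cohens_d` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up scores: same spread, means one unit apart.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Unlike a p-value, this number does not shrink just because the sample gets bigger, which is why it is reported alongside the test result.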
Type I and II Errors
- Type I error is rejecting the Null when it is true (NO association!). Its probability is α.
- Type II error is failing to reject the Null when it is false (YES association!). Its probability is β.
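A quick simulation can show both error rates. This is a rough sketch under simplifying assumptions that are mine, not the notes': a two-sided z-test with known standard deviation, null mean of 0, and arbitrary choices for sample size, effect, and trial count. `rejection_rate` is a hypothetical helper.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated studies that reject H0: mean == 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = mean(sample) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# Null is true (no association): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# Null is false (assumed true mean of 0.8): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.8)

print(type_1_rate, type_2_rate)
```

The first number should hover around the chosen alpha; the second depends on the effect size and sample size, and one minus it is the power of the test.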
Sampling
There’s a lot of work here (given the importance/preponderance of the CLT, the Central Limit Theorem). Think of this:
Subject
-> Sample
-> Accessible Population
-> Target Population
-> Population
So how do you pick? You establish Eligibility Criteria (inclusion and exclusion). Doing this limits the effect of extraneous variables (“clean signal”). RCTs specify stringent inclusion criteria btw.
Now you need a sample that is representative of your population, or else you end up with Systematic Sampling Error, where the sample mean and population mean don’t agree with each other. If the means agree but the sample distribution is flatter (more spread out) than the population distribution, you have a Random Sampling Error.
You can perform Random Sampling by (a) sampling randomly lol (simple random sampling), (b) stratifying and then sampling randomly lol (stratified sampling), or (c) sampling in clusters of people with X and people without X, like where natural clusters emerge (cluster sampling).
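Options (a), (b) and (c) above can be sketched with nothing but the standard library. Everything here is illustrative: the function names, the tuple layout `(id, sex, clinic)`, and the toy population are assumptions, and a real study would need to think about stratum sizes and weighting.

```python
import random


def simple_random_sample(population, k, rng):
    """(a) Every unit has the same chance of being picked."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """(b) Split into strata first, then sample randomly inside each one."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], k_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """(c) Pick whole clusters at random and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)

# Toy population: (id, sex, clinic), all hypothetical.
population = [
    (i, "F" if i % 2 == 0 else "M", f"clinic_{i % 4}") for i in range(40)
]

# Natural clusters: group people by the clinic they attend.
clinics = {}
for person in population:
    clinics.setdefault(person[2], []).append(person)

print(simple_random_sample(population, 8, rng))
print(stratified_sample(population, lambda p: p[1], 4, rng))
print(cluster_sample(clinics, 2, rng))
```

Stratifying guarantees each stratum shows up in the sample; cluster sampling trades some of that control for convenience, since you only have to reach the chosen clusters.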
If you have an ordered list of the population (like a census), you can do Systematic Sampling: pick a random starting point and take every k-th entry.
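Systematic sampling is short enough to sketch in a few lines. The helper name `systematic_sample` and the stand-in census are hypothetical; the random start within the first interval is what keeps it from always picking the same people.

```python
import random


def systematic_sample(ordered_population, k, rng):
    """Take every k-th unit, starting from a random offset in [0, k)."""
    start = rng.randrange(k)
    return ordered_population[start::k]


# Stand-in for an ordered census of 100 people.
census = list(range(100))
print(systematic_sample(census, 10, random.Random(0)))
```

One thing to watch: if the list has a periodic pattern that lines up with k, the sample can end up unrepresentative.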