Final Review I
FINER.
Doing Research
Why do research?
What are the Goals of Research?
- Exploration, Hypothesis Testing/Evaluation, Theory Testing
- Measurement, Demonstration — What instruments are you using?
What Data is collected? Qualitative and Quantitative
How is Data collected? Observation and Experiment
Measurement Study
These tell you how accurate your Demonstration study will be. This helps you characterize the errors in what you’re going to collect.
Validity
Internal and External Validity. Bias is the degree to which the estimate differs from the target. The direction of bias is easier to estimate than its magnitude, and knowing the direction alone already lets you qualify your claims.
Error
Reliability and Validity
TODO: Precision and Accuracy
How do you improve both? Increase sample size? Instrument calibration? Double measurements? Blinding? (The last several address bias.)
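A minimal sketch of the difference between bias (accuracy) and random scatter (precision). The true value, bias, and noise level are hypothetical numbers chosen for illustration, and the instrument is assumed to add a constant offset plus Gaussian noise.

```python
import random
import statistics

def simulate_readings(true_value, bias, noise_sd, n, seed=0):
    """Simulate n readings from an instrument with a constant offset (bias)
    and Gaussian random error (noise_sd). All values are hypothetical."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

true_value = 100.0
readings = simulate_readings(true_value, bias=2.0, noise_sd=5.0, n=10_000)

# Accuracy: how far the average reading sits from the target (systematic error).
estimated_bias = statistics.fmean(readings) - true_value
# Precision: how much individual readings scatter around their own mean.
spread = statistics.stdev(readings)

print(f"estimated bias: {estimated_bias:.2f}, spread (SD): {spread:.2f}")
```

A larger sample tightens the estimate of the mean but leaves the offset in place, which is why calibration and blinding target bias while sample size targets random error.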
Sources of Authority
Experiment > Observation > Knowledge > Expertise
Ethical Issues
Translational Research
TODO: What is it?
T1 — Bench to Bedside, Lab to Clinical Practice, Efficacy
T2 — Research to Practice
T3 — General Populations to General Practice (dissemination research)
T4 — TODO (policy research)
Efficacy is evaluation under idealized conditions, optimal environments.
Clinical Trials
Phase 0 — Exploratory
Phase 1 — TODO
Phase 2 — Efficacy
Phase 3 — Real world; multi-center, effectiveness
Phase 4 — Post-marketing surveillance
Literature Review — Synthesizing evidence
Quantitative
Systematic Review, Metaanalysis, Integrative Review.
Quantitative studies: Systematic Review. Usually efficacy studies. Search criteria are published as part of the study.
Metaanalysis pools data and reanalyzes to arrive at its own conclusions. You report “pooled effect sizes”. Inclusion criteria are of supreme importance.
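A rough sketch of how a pooled effect size can be computed. It assumes a fixed-effect, inverse-variance model, which is only one of several pooling approaches, and the three study effects and standard errors are made up for illustration.

```python
import math

def pooled_effect_fixed(effects, std_errors):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error.
    Each study is weighted by 1 / SE^2, so precise studies count for more."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect sizes and standard errors from three included studies.
effects = [0.30, 0.50, 0.10]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pooled_effect_fixed(effects, std_errors)
# Approximate 95% confidence interval using the normal distribution.
ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect {pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

Because the weights depend on which studies are included, changing the inclusion criteria changes the pooled estimate directly.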
Integrative Review
Qualitative
Metasummary (a summarization, like Systematic Review; describes the findings) and Metasynthesis (what it sounds like; the qualitative counterpart of Metaanalysis).
Frameworks
A framework biases your research by definition; it will obscure phenomena it doesn’t have a name for.
Concepts, Constructs, Variables
Concepts (abstract definition; or a “chair”; or “truth” “happiness”)
Constructs (concepts that cannot be directly observed; emotional response, satisfaction)
Variables (operationalize constructs; measurable “palmar sweating” → stress)
Links between these three must be questioned during research. Conventions change over time.
Concept Synthesis → Derivation → Analysis
What it sounds like. In Analysis you identify the characteristics of a concept.
Developing Frameworks
Select and define concepts, determine hierarchies, draw a conceptual map of relationships.
Qualitative Methods
Interviews (open-ended or semi-structured), Observations, Ethnographic Methods (ends with a written account from the participants’ perspective), Others (time and motion, work sampling, IT usage patterns, clinical notes, surveys).
These are most helpful very early on in your research.
These end up with Qualitative data.
Rigor
What is your ontological stance? E.g. insist on inter-observer Reliability with a logical positivist view.
Triangulation
Member checks — present findings to target audience
Mixed methods.
Grounded Theory
A set of techniques for analysing (TODO?) Qualitative data.
Open/Axial/Selective coding.
Every theory is reductionist! Reduces to well-defined and articulated phenomena that the theory claims are important.
If you want to explain how people engage with LLMs… this is what you might want to do.
Thematic Analysis
“Grounded Theory Lite”. Not to produce a theory but to identify themes (not “emerge” them!)
Searching for themes → Axial coding.
Recursive, iterative process.
This is typically done as a precursor to intervention research (Grounded Theory may be overkill). There are experts who think that this is for lazy people who do not want to commit and get to some ‘core’.
Hierarchical process: You take a “loose” bag of codes → Glob somehow for bigger code → Bigger code → … → Theme
Hypothesis Testing
Typically don’t test hypotheses with qualitative research. You need to explicitly state an alternative to the null. Remember p-values.
Some journals won’t accept tests of statistical significance for baseline characteristics.
Sometimes we need a hypothesis generating study: maybe use an observational study to find and explore trends. Qualitative studies are good for this too.
Hypotheses must be
- Clear, specific, non-vague
- Grounded in theory or knowledge about the world
- Testable and Falsifiable
Discussion on hypothesis-testing steps.
- State null and alternative hypotheses
- Eval data
- Review assumptions
- Select test statistic
- Determine distribution of statistic
- State decision rule
- Calculate test statistic
- Make decision to reject/not-reject null
- Conclusions
Type I and Type II errors.
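The steps above can be sketched with a worked example. This is a minimal illustration, assuming a two-sided one-sample z-test with a known population standard deviation; the sample values, `mu0`, and `sigma` are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test. Assumes the population SD (sigma) is known.
    H0: population mean == mu0.  H1: population mean != mu0."""
    n = len(sample)
    # Test statistic: standardized distance of the sample mean from mu0.
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    # Under H0 the statistic follows a standard normal distribution.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Decision rule: reject H0 when |z| exceeds the critical value.
    return z, critical, p_value, abs(z) > critical

# Hypothetical measurements; mu0 and sigma are assumed for illustration.
sample = [52, 55, 49, 58, 54, 51, 57, 53, 56]
z, critical, p_value, reject = one_sample_z_test(sample, mu0=50, sigma=5)
print(f"z = {z:.3f}, critical = {critical:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Here `alpha` is the accepted Type I error rate (rejecting a true null). A Type II error is failing to reject a false null, and its rate depends on the true effect and the sample size.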
Study Designs (Quantitative)
Observational
Observational (don’t change the world) and Experimental (be the change you want to see).
Observational Cross-sectional studies are best for measuring associations. Main outcome is prevalence. Fast and inexpensive, but they cannot establish causal relationships.
Observational Cohort studies. Select based on exposure/independent variable. Follow-up is extremely important. You typically measure Risks (N with outcome / N Exposed), Odds (N with outcome / N without outcome), and Rates (N with outcome / Person-time). Problem is you may have to wait a long time or will have problems with rare outcomes TODO
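A small sketch of the three cohort measures using the definitions above. The counts and person-time are hypothetical.

```python
def cohort_measures(n_with_outcome, n_exposed, person_time):
    """Risk, odds, and rate for an exposed cohort, using the note's definitions."""
    n_without_outcome = n_exposed - n_with_outcome
    risk = n_with_outcome / n_exposed          # proportion who develop the outcome
    odds = n_with_outcome / n_without_outcome  # outcome vs. no outcome
    rate = n_with_outcome / person_time        # events per unit of person-time
    return risk, odds, rate

# Hypothetical cohort: 100 exposed people, 20 outcomes, 400 person-years of follow-up.
risk, odds, rate = cohort_measures(n_with_outcome=20, n_exposed=100, person_time=400.0)
print(f"risk = {risk:.2f}, odds = {odds:.2f}, rate = {rate:.3f} per person-year")
```

Risk and odds diverge as the outcome becomes more common; for rare outcomes they are nearly equal.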
Observational Case-Control studies. Best for rare outcomes! No prospective CC studies (think about it.) Only one outcome can be studied. But you need to find controls and sampling is a problem because of confounders. Many strategies here: matching for example, or from same clinic, same demographic, general disease profile.
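Because a case-control study samples on the outcome, it cannot estimate risk directly; the usual summary is the odds ratio. A minimal sketch, with a hypothetical 2x2 table and an approximate 95% confidence interval computed on the log scale:

```python
import math

def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio from a 2x2 table, with an approximate 95% CI (log-scale method)."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - 1.96 * se_log)
    high = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, low, high

# Hypothetical counts: 30 exposed cases, 20 exposed controls,
# 10 unexposed cases, 40 unexposed controls.
or_estimate, or_low, or_high = odds_ratio(30, 20, 10, 40)
print(f"OR = {or_estimate:.2f} (95% CI {or_low:.2f} to {or_high:.2f})")
```

This unadjusted estimate ignores confounders; matched designs need a matched analysis instead.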
Experimental and Quasi-Experimental
- RCT
- Factorial Designs
- Cluster Randomized Designs - By group and not person (all patients in a given clinic, all clinicians in different wards; entire cluster goes into different groups.)
Then Quasi-Experimental. The difference here is Randomization. Remember that randomization helps with confounders by distributing them across groups. You are more susceptible to these in QE studies. Best example is a pre-post study design with concurrent controls in a single clinic with an AI scribe on and off.
- Time Series with Multiple Observations
- Crossover Studies
- Pre-post Design with Concurrent Controls
- Post-test only with Controls
- Removed Treatment Design
- Double-Pretest Pre-post
- Pre-post
- Post-test only
Pros and cons of each. TODO: diagrams for each from lectures.
There’s then continuous and dichotomous measurements.
Cluster Randomized Trial: people will talk to each other. Easiest to imagine a lot of clusters that you randomize across the treatment and control groups.
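A minimal sketch of randomizing at the cluster level, so that everyone in a clinic lands in the same arm. The clinic names and the `randomize_clusters` helper are hypothetical.

```python
import random

def randomize_clusters(clusters, seed=None):
    """Shuffle whole clusters and split them evenly into two arms.
    Individuals are never split: the cluster is the unit of randomization."""
    rng = random.Random(seed)
    shuffled = list(clusters)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}

# Hypothetical clinics acting as clusters.
clinics = [f"clinic_{i}" for i in range(1, 9)]
arms = randomize_clusters(clinics, seed=42)
print(arms)
```

Since outcomes within a cluster tend to be correlated, the sample size and the analysis should account for clustering rather than treat patients as independent.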
Factorial Design: This is not the same as multiple arms. You calculate sample size, ANOVA differently.
Crossover Design: Non-randomized. “Washout period” for effects of treatment/exposure to dissipate.
Quasi-Experimental: Cheap but less effective at controlling confounders.
Pre-post Design with Concurrent Controls: Most commonly used Experimental Design! Try and find out more about their practices, characteristics, etc.
| Intervention | Pretest | → Intervention → | Posttest |
| Control | Pretest | → (no intervention) → | Posttest |
Time-Series Trial: Think about fresh residents in July. Identifies trends.
Posttest only with Controls: TODO
Removed Treatment Design: Intervention M1 → T → M2 → M3 → Remove T → M4 (T is the intervention, M are measurements)
Double Pretest Pre-post Design: Intervention M1 → M2 → T → M3
Pre-post Design: One of the weakest, but cheap and fast, and useful for feasibility and estimating effect sizes.
Posttest Only: pre-Experimental only: T → M1
Outcome Measurements
- Level 1: Clinical outcomes (best but these take a long while. e.g. mortality from Diabetes)
- Level 2: Intermediate markers (look at literature; think of HbA1C: lots of research shows that the higher the number, the more problems.)
- Level 3: Processes (e.g. adherence to guidelines, time to diagnosis, adherence to HbA1C measurements… can be captured much faster! Do this when you have a short time period to conduct study and are looking for things like weight loss or HbA1C. Look at literature and make a connection/proxy.)
- Adverse and unanticipated effects
You need to connect Level 3 → Level 2 → Level 1. You need to use literature to perform this connection for your claim.
Adherence to self-management → Lower HbA1C → Lower mortality
Statistical Methods
Correlational: Pearson’s for association. Spearman’s for non-parametric. Kendall’s Tau is for variables at ordinal level (e.g. Likert Scale; do not report means and SDs!)
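A small sketch contrasting Pearson's and Spearman's coefficients, written in plain Python for illustration. The helper names and the data are hypothetical; Spearman's is computed as Pearson's on average ranks.

```python
import math

def pearson(x, y):
    """Pearson's r: strength of the linear association between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks (monotonic association)."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r, rho = pearson(x, y), spearman(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

The rank-based coefficient reaches 1 for any strictly increasing relationship, while Pearson's falls below 1 when the relationship is not linear.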
Causal Statistics: 2x2 tables, t-tests, ANOVA. If you’re taking measurements from the same group, dependent (paired) t-tests, repeated-measures ANOVA. If independent, independent t-tests, one-way ANOVA.
Difference between groups: TODO
Analysis of Variance, between groups and within groups. Generalized version of t-test for more than 2 groups: uses F statistic instead of t. One-way (one categorical predictor), Factorial ANOVA (several categorical), Repeated Measure (what it says…)
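A minimal sketch of the one-way ANOVA F statistic as the ratio of between-group to within-group variance. The three groups are hypothetical, and the p-value lookup against the F distribution is left out.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical outcome values for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with df = ({len(groups) - 1}, {sum(map(len, groups)) - len(groups)})")
```

With only two groups, this F equals the square of the independent-samples t statistic.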
Sample Size
- Significance Level (lower value requires more samples)
- Tailedness (two-tailed requires larger sample size)
- Effect Size (smaller effect sizes require larger sizes)
- Power (larger power requires larger sizes)
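The four factors above can be tied together in one formula. This is a sketch assuming a comparison of two group means with equal group sizes and a normal approximation, where the per-group size is n = 2 (z_alpha + z_beta)^2 / d^2 and d is the standardized effect size. The `n_per_group` helper is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate per-group sample size for comparing two means
    (normal approximation, equal group sizes, standardized effect size)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

baseline = n_per_group(0.5)                       # medium effect, defaults
stricter_alpha = n_per_group(0.5, alpha=0.01)     # lower significance level
one_tailed = n_per_group(0.5, two_tailed=False)   # one-tailed test
small_effect = n_per_group(0.2)                   # smaller effect size
higher_power = n_per_group(0.5, power=0.90)       # larger power
print(baseline, stricter_alpha, one_tailed, small_effect, higher_power)
```

Each variation moves the required size in the direction the list describes; exact t-based calculations give slightly larger numbers than this approximation.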
If you have a large N, you can make all manner of statistically significant claims. Your p-value will keep going down, even for effects too small to matter.
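A quick illustration of this point, assuming a two-group comparison where the observed standardized difference stays fixed at a tiny 0.05 while the per-group sample size grows. The numbers and the `two_sided_p` helper are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p(effect_size, n_per_group):
    """Two-sided p-value for a fixed observed standardized difference
    between two equal-sized groups (normal approximation)."""
    z = effect_size * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny effect, increasingly large samples.
p_values = {n: two_sided_p(0.05, n) for n in (100, 10_000, 1_000_000)}
for n, p in p_values.items():
    print(f"n per group = {n:>9,}: p = {p:.6f}")
```

The effect is equally small in every row, so statistical significance here says nothing about practical importance; report the effect size alongside the p-value.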
Sampling strategies: simple random, stratified random (by age, gender), cluster sampling, etc.
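A minimal sketch of stratified random sampling, drawing the same fraction from each stratum so that small strata are not missed by chance. The records, the `age_group` field, and the `stratified_sample` helper are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum (at least one each)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for key in sorted(strata):  # fixed order keeps the draw reproducible
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 younger and 40 older participants.
population = [{"id": i, "age_group": "under_50" if i < 60 else "50_plus"} for i in range(100)]
sample = stratified_sample(population, lambda r: r["age_group"], fraction=0.1, seed=1)
counts = {g: sum(r["age_group"] == g for r in sample) for g in ("under_50", "50_plus")}
print(counts)
```

Simple random sampling would use `rng.sample` on the whole population, and cluster sampling would sample whole groups instead of individuals.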
Validity
Implementation effects.
Qualitative study safeguards.
Ethics
Belmont Report:
- Respect for Persons — treat people as autonomous agents capable of forming their own opinion about participation, hence Informed Consent. People with diminished autonomy (trainees, juniors, anyone subject to a power dynamic) need additional protections.
- Beneficence — Do no harm. Weigh tradeoffs between benefits and risks. Research should be scientifically sound.
- Justice — Distribute the benefits and burdens of research fairly and equitably. Justify why you are excluding people from a study.
HIPAA and IRBs
Exemptions, etc.