Common misconceptions

Common mistake
Wrong: The p-value is the probability that the null hypothesis is true.
Right: The p-value is the probability of obtaining results at least as extreme as observed, assuming the null hypothesis is true.
The p-value is not a probability about the null hypothesis itself — it's a probability about your data given the null. Specifically, it answers: 'If the null were true, how likely is it I'd see results this extreme or more extreme?' A small p-value means your data are unlikely under the null, which is evidence against it — but it never tells you the probability that the null is actually true. Confusing these two framings leads to wildly wrong interpretations on vignettes that ask what the p-value means.
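To make the "data given the null" framing concrete, here is a minimal sketch that computes an exact one-sided p-value for a coin-flip experiment. The scenario (9 heads in 10 flips, null hypothesis of a fair coin) and the function name are hypothetical, chosen only for illustration.

```python
from math import comb


def binomial_p_value_upper(k_observed: int, n: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= k_observed) for X ~ Binomial(n, p_null).

    The calculation assumes the null hypothesis is true (success
    probability is p_null). It says nothing about how probable the
    null itself is.
    """
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(k_observed, n + 1)
    )


# Hypothetical data: 9 heads in 10 flips, null hypothesis "the coin is fair".
p_value = binomial_p_value_upper(9, 10)
print(round(p_value, 4))  # 0.0107
```

The result reads as "if the coin were fair, about 1% of 10-flip experiments would give 9 or more heads", not as "there is a 1% chance the coin is fair".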
Common mistake
Wrong: Type I error is a false negative and Type II error is a false positive.
Right: Type I error (alpha) is a false positive — rejecting a true null; Type II error (beta) is a false negative — failing to reject a false null.
Type I error is a false positive — you cried wolf when there was none. You rejected the null hypothesis, but it was actually true. Type II error is a false negative — you missed a real effect. The null was false, but you failed to reject it. A memory anchor: Type I = you made a positive claim that was wrong (false positive); Type II = you failed to make a claim when you should have (false negative). Alpha is the acceptable rate of Type I errors you set before the study.
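The two error types can be laid out as a small decision table. The sketch below is only an illustration; the function name and return strings are hypothetical.

```python
def classify_outcome(null_is_true: bool, rejected_null: bool) -> str:
    """Label a study conclusion given the (normally unknown) truth."""
    if null_is_true and rejected_null:
        # Claimed an effect that does not exist.
        return "Type I error (false positive)"
    if not null_is_true and not rejected_null:
        # Missed an effect that does exist.
        return "Type II error (false negative)"
    return "correct decision"


print(classify_outcome(null_is_true=True, rejected_null=True))
print(classify_outcome(null_is_true=False, rejected_null=False))
```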
Common mistake
Wrong: A CI indicates statistical significance whenever it excludes zero, whether the measure is a ratio or a difference.
Right: For ratio measures (RR, OR, HR), the null value is 1; for difference measures (ARR, mean difference), the null value is 0.
The null value is the value that represents no effect, and what "no effect" looks like depends on whether you're measuring a ratio or a difference. For ratios like RR, OR, and HR, a value of 1 means the risk, odds, or hazard is the same in both groups, so dividing one by the other gives 1. For differences like ARR or mean difference, a value of 0 means the groups do not differ. So when you check whether a CI suggests statistical significance, ask: does this CI include the null value? For ratios, that's 1. For differences, that's 0. Applying the zero rule to a ratio measure is a classic Step 1 trap.
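The rule can be written as a short check. This is a sketch under the assumption that the measure is identified by one of the labels used above; the function names and the example intervals are hypothetical.

```python
RATIO_MEASURES = {"RR", "OR", "HR"}                # null value is 1
DIFFERENCE_MEASURES = {"ARR", "mean difference"}   # null value is 0


def null_value(measure: str) -> float:
    """Return the 'no effect' value for a ratio or difference measure."""
    if measure in RATIO_MEASURES:
        return 1.0
    if measure in DIFFERENCE_MEASURES:
        return 0.0
    raise ValueError(f"Unknown measure: {measure!r}")


def ci_excludes_null(measure: str, lower: float, upper: float) -> bool:
    """True when the CI does not contain the null value (significant)."""
    null = null_value(measure)
    return not (lower <= null <= upper)


# Hypothetical intervals, for illustration only.
print(ci_excludes_null("OR", 1.3, 2.4))                 # True: excludes 1
print(ci_excludes_null("OR", 0.85, 1.20))               # False: includes 1
print(ci_excludes_null("mean difference", 0.85, 1.20))  # True: excludes 0
```

The last two calls use the same interval and reach opposite conclusions, which is exactly why the measure type has to be identified first.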
Common mistake
Wrong: Increasing alpha (Type I error rate) decreases power.
Right: Increasing alpha actually increases power because it lowers the threshold for rejecting the null; power is increased by larger n, larger effect size, larger alpha, or smaller variance.
Raising alpha means you're willing to accept more false positives: you've lowered the bar for declaring a result significant. Because it's easier to reject the null, you're also less likely to miss a true effect, so beta falls and power (1 − beta) rises. This is counterintuitive because alpha and beta feel like they should move together, but with sample size, effect size, and variance held fixed they trade off against each other. The full list of power-increasing factors: larger n, larger true effect size, smaller variance in measurements, and larger alpha. Know all four.
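A minimal sketch of how the four factors move power, assuming the simplest textbook case: a one-sided one-sample z-test with known standard deviation. The function name and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist


def one_sided_z_test_power(effect: float, sigma: float, n: int, alpha: float) -> float:
    """Power of a one-sided z-test with known sigma.

    effect: true difference from the null value (assumed positive)
    sigma:  known standard deviation of individual measurements
    n:      sample size
    alpha:  Type I error rate chosen before the study
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)     # rejection threshold in z units
    shift = effect * n**0.5 / sigma     # how far the truth sits from the null
    return 1 - z.cdf(critical - shift)  # P(reject | effect is real)


baseline = one_sided_z_test_power(effect=0.5, sigma=1.0, n=25, alpha=0.05)
more_alpha = one_sided_z_test_power(effect=0.5, sigma=1.0, n=25, alpha=0.10)
more_n = one_sided_z_test_power(effect=0.5, sigma=1.0, n=50, alpha=0.05)
bigger_effect = one_sided_z_test_power(effect=0.8, sigma=1.0, n=25, alpha=0.05)
less_variance = one_sided_z_test_power(effect=0.5, sigma=0.7, n=25, alpha=0.05)

for label, power in [
    ("baseline", baseline),
    ("alpha 0.05 -> 0.10", more_alpha),
    ("n 25 -> 50", more_n),
    ("effect 0.5 -> 0.8", bigger_effect),
    ("sigma 1.0 -> 0.7", less_variance),
]:
    print(f"{label}: power = {power:.3f}")
```

Each changed input raises power above the baseline; raising alpha is the only one of the four that does so at the cost of more false positives.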
Common mistake
Wrong: A 95% CI means there is a 95% probability that the true parameter lies within that specific interval.
Right: A 95% CI means that if the study were repeated many times, 95% of the constructed intervals would contain the true population parameter.
Once a confidence interval is calculated, the true parameter is either in it or it isn't — there's no probability involved for that specific interval. The 95% refers to the long-run performance of the method: if you ran this study 100 times and built a CI each time, about 95 of those intervals would capture the true value. This is a frequentist coverage statement, not a Bayesian probability about a fixed interval. On the exam, answers that say 'there is a 95% chance the true mean lies between X and Y' are technically wrong — the correct framing is about the procedure, not the specific interval.
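The long-run coverage idea can be checked by simulation. The sketch below assumes normally distributed data with a known standard deviation (so a z-interval applies); the population values, the seed, and the function name are hypothetical.

```python
import random
from statistics import NormalDist, mean


def simulate_coverage(
    true_mean: float = 10.0,
    sigma: float = 2.0,
    n: int = 30,
    repeats: int = 2000,
    seed: int = 0,
) -> float:
    """Fraction of repeated 95% z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval
    half_width = z * sigma / n**0.5   # known-sigma interval half-width
    hits = 0
    for _ in range(repeats):
        sample_mean = mean(rng.gauss(true_mean, sigma) for _ in range(n))
        # Any single interval either contains the true mean or it does not.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repeats


coverage = simulate_coverage()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, which describes the procedure across many repetitions rather than any one interval.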

What the exam tests

  1. Given a study scenario, correctly identify what the null hypothesis states (no difference/no association) and what the alternative hypothesis states (a difference or association exists).
  2. Distinguish between Type I error (false positive: rejecting a true null hypothesis, probability = alpha) and Type II error (false negative: failing to reject a false null hypothesis, probability = beta) — and not swap them under pressure.
  3. State the correct definition of a p-value: the probability of obtaining results at least as extreme as observed, assuming the null hypothesis is true — not the probability that the null hypothesis is true.
  4. Define power (1 − beta) and identify which study design changes increase it: larger sample size, larger effect size, lower variability, or higher alpha.
  5. Explain the directional tradeoffs between alpha, beta, sample size, effect size, and power — for example, what happens to power and Type II error when you increase sample size.
  6. Correctly interpret a 95% confidence interval as a frequentist coverage statement: if the study were repeated many times, 95% of such intervals would contain the true population parameter.
  7. Apply the correct null-crossing rule to determine statistical significance from a CI: for ratio measures (RR, OR, HR), check whether the CI crosses 1; for difference measures (ARR, mean difference), check whether it crosses 0.

Can you avoid these mistakes?

A study reports a 95% CI for relative risk of 0.72–0.98. Is this result statistically significant, and how do you know? What if the outcome measure had been absolute risk reduction instead — what null value would you check?
A researcher wants to increase the power of her study without changing the sample size. She decides to raise alpha from 0.05 to 0.10. A classmate says this will increase the risk of a false positive but not affect power. Who is right and why?
A clinical trial finds p = 0.03. A student writes in their notes: 'This means there is a 3% chance the null hypothesis is true.' What is wrong with this statement, and what does p = 0.03 actually mean?
A study is designed with alpha = 0.05 and power = 0.80. The investigators find a non-significant result (p = 0.12). They conclude there is no real effect. What type of error might they be making, and what is its probability in this study design?
