Common misconceptions

Common mistake
Wrong: Relative risk can be directly calculated from a case-control study.
Right: Because the proportion of diseased people in the sample is set by the investigator, true disease incidence is unknown, so only the odds ratio can be calculated from case-control data.
Relative risk compares disease incidence in exposed vs. unexposed people. In a case-control study, the investigator decides how many cases and controls to enroll, so the proportion of diseased people in the sample is artificial and does not reflect true incidence or prevalence in the population. Because the risk in each exposure group can't be recovered from this setup, relative risk can't be calculated directly. The odds ratio is the correct measure here: it compares the odds of exposure among cases with the odds of exposure among controls, which doesn't depend on how many of each were enrolled. When the disease is rare, the odds ratio also approximates the relative risk.
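The point about enrollment can be checked numerically. The sketch below uses made-up counts (the function names and numbers are illustrative assumptions, not data from a real study). It keeps the exposure mix among controls fixed, enrolls 1, 4, or 10 times as many controls, and compares the odds ratio with a "risk ratio" computed as though the sample were a cohort.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases,    b = unexposed cases
    c = exposed controls, d = unexposed controls
    """
    return (a * d) / (b * c)


def naive_relative_risk(a, b, c, d):
    """'Risk ratio' computed as if the sample were a cohort (not valid here)."""
    risk_exposed = a / (a + c)
    risk_unexposed = b / (b + d)
    return risk_exposed / risk_unexposed


# Hypothetical counts: 100 cases, and 100 controls per "unit" of enrollment.
cases_exposed, cases_unexposed = 60, 40
controls_exposed, controls_unexposed = 30, 70

for k in (1, 4, 10):  # enroll k times as many controls, same exposure mix
    c, d = controls_exposed * k, controls_unexposed * k
    print(
        f"controls x{k}: "
        f"OR = {odds_ratio(cases_exposed, cases_unexposed, c, d):.2f}, "
        f"naive RR = {naive_relative_risk(cases_exposed, cases_unexposed, c, d):.2f}"
    )
```

Under these assumed counts, the odds ratio stays at 3.5 however many controls are enrolled, while the naive risk ratio shifts with the enrollment choice. That dependence on an arbitrary design decision is why it can't be read as a relative risk.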
Common mistake
Wrong: Case-control studies start with exposed and unexposed groups and follow them forward.
Right: Case-control studies start with diseased (cases) and non-diseased (controls) and look backward to compare past exposures.
Cohort studies go exposure → outcome (forward in time): groups are defined by exposure and compared on who develops disease. Case-control studies go outcome → exposure (backward in time): groups are defined by disease status and compared on past exposure. This isn't just a semantic difference; it determines the study's feasibility, cost, and which statistics are valid. If a vignette says researchers identified people with a disease and then asked about their past exposures, that's case-control, regardless of what else the question implies.
Common mistake
Wrong: Recall bias affects cases and controls equally in case-control studies.
Right: Cases are more likely than controls to recall past exposures because their disease motivates them to search for causes, systematically inflating the apparent association.
Cases have a psychological incentive that controls lack: they're sick, so they've often spent time thinking about what caused their illness. This motivates more thorough, and sometimes exaggerated, recall of past exposures. Controls have no such motivation, so they may underreport the same exposures. The result is a systematic difference in reporting between the two groups, not random noise, and it pushes the odds ratio above 1 even when no true association exists.
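A small arithmetic sketch makes the mechanism concrete. It assumes a hypothetical study in which cases and controls have the same true exposure prevalence (so the true odds ratio is 1) and differ only in how completely they report it. The recall fractions and the helper name `observed_odds_ratio` are assumptions chosen for illustration.

```python
def observed_odds_ratio(n_cases, n_controls, true_exposure_prev,
                        recall_cases, recall_controls):
    """Odds ratio computed from *reported* exposure.

    recall_* is the fraction of truly exposed people who report the exposure.
    Simplifying assumption: nobody who is unexposed reports an exposure.
    """
    a = n_cases * true_exposure_prev * recall_cases        # cases reporting exposure
    b = n_cases - a                                        # cases reporting none
    c = n_controls * true_exposure_prev * recall_controls  # controls reporting exposure
    d = n_controls - c                                     # controls reporting none
    return (a * d) / (b * c)


# Same true exposure (30%) in both groups, so any OR above 1 is pure bias.
unbiased = observed_odds_ratio(200, 200, 0.30, recall_cases=0.90, recall_controls=0.90)
biased = observed_odds_ratio(200, 200, 0.30, recall_cases=0.90, recall_controls=0.60)

print(f"equal recall:        OR = {unbiased:.2f}")
print(f"differential recall: OR = {biased:.2f}")
```

With equal recall the observed odds ratio stays at 1. When controls report only 60% of their true exposures, the observed odds ratio rises to roughly 1.7 under these assumed inputs, even though exposure has no effect on disease.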
Common mistake
Gap: Fails to recognize case-control as the design of choice for rare diseases or diseases with long latency.
Case-control studies are the preferred design for rare diseases and diseases with long latency because they start by identifying existing cases rather than waiting for rare events to occur in a prospective cohort.
In a prospective cohort study, you'd need to enroll thousands of people and wait years just to accumulate a handful of cases of a rare disease, which is expensive and often impractical. Case-control studies sidestep this by starting with existing cases, so you can efficiently study rare outcomes or diseases with long latency without decades of follow-up. When a Step 1 vignette describes a rare condition or asks which design requires the fewest participants to study an uncommon outcome, case-control is almost always the answer.
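Back-of-the-envelope arithmetic shows the scale of the problem. The sketch below assumes a constant incidence, no loss to follow-up, and hypothetical numbers (one new case per 100,000 person-years, a target of 50 cases). `cohort_size_needed` is an illustrative helper, not a formal sample-size calculation.

```python
import math


def cohort_size_needed(target_cases, person_years_per_case, followup_years):
    """Rough number of people to enroll to expect `target_cases` events.

    person_years_per_case: person-years of follow-up per expected case
    (e.g. 100_000 for an incidence of 1 per 100,000 person-years).
    Assumes constant incidence and no dropouts.
    """
    person_years = target_cases * person_years_per_case
    return math.ceil(person_years / followup_years)


# Hypothetical rare disease: 1 case per 100,000 person-years, 10 years of follow-up.
n = cohort_size_needed(target_cases=50, person_years_per_case=100_000, followup_years=10)
print(f"{n:,} participants followed for 10 years to expect 50 cases")
```

With these assumed inputs, a 10-year cohort would need about 500,000 participants to expect 50 cases. A case-control study could instead enroll 50 existing cases directly, plus a comparison group of controls.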

What the exam tests

  1. Identify the directionality of a case-control study: it starts with disease status (cases vs. controls) and looks backward at past exposure — not forward from exposure to outcome.
  2. Explain why relative risk cannot be calculated from case-control data and why the odds ratio is the correct effect measure instead.
  3. Recognize which research scenarios favor a case-control design, specifically rare diseases and conditions with long latency periods where prospective follow-up would be impractical.
  4. Identify the biases that disproportionately threaten case-control studies, especially recall bias and selection bias, and understand the mechanism by which they distort results.

Can you avoid these mistakes?

  1. A researcher wants to study the relationship between a rare form of childhood leukemia and prenatal pesticide exposure. She identifies 50 children with the disease and 200 healthy children, then interviews their mothers about pesticide use during pregnancy. What study design is this, what effect measure should she report, and why can't she report the other common effect measure?
  2. You're given a 2x2 table from a case-control study: 40 cases exposed, 10 cases unexposed, 20 controls exposed, 30 controls unexposed. Calculate the odds ratio. Then explain why calculating the relative risk from this table would be incorrect.
  3. A case-control study finds a strong association between a dietary supplement and liver disease. Critics argue the results are inflated by recall bias. Explain the mechanism: why would recall bias affect cases and controls differently, and in which direction would it push the odds ratio?
  4. For each scenario below, decide whether a case-control or cohort study is more appropriate and justify your answer: (a) studying whether smoking causes lung cancer in a population of 10,000 adults over 20 years; (b) studying risk factors for a disease that affects 1 in 50,000 people per year.
