Common misconceptions

Common mistake
Wrong: A retrospective cohort study is the same as a case-control study because both look backward.
Right: A retrospective cohort still defines groups by exposure status (not outcome) and uses historical records to follow them forward to outcomes, preserving the cohort structure.
The confusion comes from conflating the direction of data collection with the direction of study logic. A case-control study defines groups by who got the disease (cases) vs. who didn't (controls), then asks about past exposures. A retrospective cohort study still defines groups by exposure status. It just uses historical records to do so, then traces outcomes forward from that exposure point. The word 'retrospective' describes when the data were collected relative to the events (after they had already occurred), not how the groups were defined. Keep this straight by asking yourself: were groups defined by exposure or by outcome? If exposure, it's a cohort study.
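The "exposure or outcome?" check can be written down as a tiny decision rule. This is only a study aid: the function name `classify_design` and its string arguments are hypothetical labels invented for this sketch, not standard terminology from any library.

```python
def classify_design(groups_defined_by: str, data_source: str) -> str:
    """Hypothetical helper: name the design from how groups were formed."""
    # Groups formed by disease status -> case-control, whatever the data source.
    if groups_defined_by == "outcome":
        return "case-control"
    # Groups formed by exposure -> cohort; the data source only sets the timing label.
    if groups_defined_by == "exposure":
        timing = "retrospective" if data_source == "historical records" else "prospective"
        return f"{timing} cohort"
    raise ValueError("groups_defined_by must be 'exposure' or 'outcome'")


print(classify_design("exposure", "historical records"))  # retrospective cohort
print(classify_design("outcome", "historical records"))   # case-control
```

Note that the data source never changes the cohort vs. case-control answer; only the way the groups were defined does.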
Common mistake
Wrong: Relative risk and attributable risk convey the same clinical information.
Right: Relative risk measures the strength of association, while attributable risk measures the absolute excess risk due to exposure, which is more relevant for public health impact.
Relative risk (RR) tells you how many times more likely exposed people are to develop the outcome than unexposed people. It is a ratio (risk in exposed ÷ risk in unexposed) that captures association strength. Attributable risk (AR) is a difference (risk in exposed − risk in unexposed): it tells you how much additional absolute risk is due to the exposure. A drug could double the risk of a rare event (RR = 2, sounds scary) while the attributable risk is tiny (e.g., 2 extra cases per 100,000). For public health decisions, like whether to launch a screening campaign, AR matters more than RR. On Step 1, know when the question is asking about association strength vs. population impact.
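A minimal sketch of the arithmetic behind the rare-event example. The text gives only RR = 2 and 2 extra cases per 100,000; the underlying counts used here (4 cases per 100,000 exposed, 2 per 100,000 unexposed) are an assumption chosen to reproduce those figures, and `risk_measures` is a hypothetical helper name.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return relative risk (a ratio) and attributable risk (a difference)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "attributable_risk": risk_exposed - risk_unexposed,
    }


# Assumed counts: 4 per 100,000 in the exposed vs. 2 per 100,000 in the unexposed.
m = risk_measures(4, 100_000, 2, 100_000)
print(f"RR = {m['relative_risk']:.1f}")                                            # RR = 2.0
print(f"AR = {m['attributable_risk'] * 100_000:.1f} extra cases per 100,000")      # AR = 2.0 ...
```

The same exposure looks alarming as a ratio and negligible as a difference, which is why the two measures answer different questions.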
Common mistake
Wrong: Loss to follow-up in a cohort study is only a problem if many participants are lost.
Right: Loss to follow-up biases results whenever those lost differ systematically from those retained, regardless of the absolute number lost.
The problem with loss to follow-up isn't just the number of people lost; it's whether the people who drop out differ from those who stay. If participants leave the study because they got sicker, moved away for health reasons, or couldn't tolerate side effects, the remaining cohort is no longer representative. This is differential loss to follow-up, a form of selection bias (attrition bias), and it distorts results whether 5% or 50% were lost. A large but random loss is less concerning than a small but systematic one. Step 1 often tests this by giving you a scenario where a specific subgroup drops out and asking whether bias was introduced.
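A worked illustration of why the pattern of loss matters more than its size. Every count below is invented for this sketch (a hypothetical cohort of 1,000 exposed and 1,000 unexposed with a true RR of 2), and the second scenario deliberately takes the extreme case where everyone lost is an exposed participant who developed the outcome.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical full cohort: 200/1000 exposed and 100/1000 unexposed develop disease.
full = relative_risk(200, 1000, 100, 1000)

# Large but random loss: half of every subgroup drops out (50% of the cohort).
random_loss = relative_risk(100, 500, 50, 500)

# Small but systematic loss: 100 exposed participants who developed the
# outcome drop out (only 5% of the 2,000 enrolled).
differential_loss = relative_risk(100, 900, 100, 1000)

print(f"Full cohort RR:           {full:.2f}")               # 2.00
print(f"50% random loss RR:       {random_loss:.2f}")        # 2.00
print(f"5% differential loss RR:  {differential_loss:.2f}")  # 1.11
```

Losing half the cohort at random leaves the estimate unchanged (at the cost of precision), while losing 5% selectively nearly erases the association.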
Common mistake
Gap: Cannot identify Framingham as the canonical example of a prospective cohort study design
The Framingham Heart Study is the prototype prospective cohort study, enrolling disease-free participants and following them over decades to identify cardiovascular risk factors.
The Framingham Heart Study started in 1948 in Framingham, Massachusetts, enrolling roughly 5,200 adults who had no cardiovascular disease at baseline. They were followed prospectively, with repeat exams every two years, to see who developed heart disease and what factors predicted it. This design is why we know that hypertension, smoking, high LDL, low HDL, diabetes, and age are cardiovascular risk factors. It's the textbook example of a prospective cohort study because it has all the defining features: participants were disease-free at enrollment, classified by baseline characteristics (the exposures), and followed longitudinally for incident outcomes.

What the exam tests

  1. Understand cohort study directionality: groups are defined by exposure status first, then followed to outcomes — and know how prospective and retrospective variants differ structurally (not just by time).
  2. Calculate and interpret relative risk (RR) and attributable risk (AR) from cohort data, and know what each one tells you clinically — RR for strength of association, AR for public health impact.
  3. Identify the key strengths of cohort studies (can calculate incidence and RR, good for rare exposures, can examine multiple outcomes) and their main limitations (expensive, slow, prone to loss to follow-up, inefficient for rare diseases).
  4. Recognize the Framingham Heart Study as the prototype prospective cohort study and know what it demonstrated: following disease-free participants longitudinally over decades identified the major cardiovascular risk factors (hypertension, smoking, hyperlipidemia).

Can you avoid these mistakes?

A researcher uses employee health records from 1990 to identify workers who were exposed to a chemical vs. those who were not, then reviews records through 2005 to see who developed lung disease. What study design is this, and can you calculate a relative risk from its data?
In a cohort study of smokers vs. non-smokers, the incidence of lung cancer is 150/100,000 in smokers and 15/100,000 in non-smokers. What is the relative risk? What is the attributable risk? Which number would you use to argue for a public health smoking cessation campaign, and why?
A cohort study on exercise and heart disease loses 20% of its participants over 10 years. A classmate says this is fine because 80% were retained. What is the key question you need to ask to determine whether this loss to follow-up actually biases the results?
A USMLE Step 1 vignette describes a study that enrolled 5,000 healthy middle-aged adults in 1948, measured their blood pressure and cholesterol, then followed them with biennial exams to track cardiovascular events. What study design is this, what famous study does it describe, and what class of effect measure can be directly computed from it?
