Distributions — Normal, Skewed, Mean vs Median

Distributions are a foundational concept in biostatistics that shows up on USMLE Step 1 in both direct recall questions and passage-based vignettes. You need to know the properties of the normal (Gaussian) distribution, what happens to mean/median/mode when data are skewed, and when to choose one summary statistic over another. These ideas sound simple but students lose points here because they memorize definitions without building a working model of what skew actually does to summary statistics.

The USMLE Step 1 tests this in a few specific ways: identifying whether a distribution is normal or skewed from a graph or description, applying the 68-95-99.7 rule to calculate probabilities or reference ranges, and interpreting which summary statistic a researcher should use or report. Vignette questions might describe a dataset of income or lab values and ask you to identify the most appropriate measure of central tendency — that's a direct test of your skew intuition.

The biggest trap here is the skew-mean relationship. Students instinctively feel like the mean should sit near the 'center' of the data or near the peak — but that's the mode. The mean gets dragged toward the tail, not the peak. Get that direction right and the rest of the concept clicks into place.

From real student decks

44% have cards covering this topic

29% have mature cards

Common misconceptions

Common mistake

Wrong: The mean is pulled toward the peak (mode) in a skewed distribution.

Right: In a skewed distribution, the mean is pulled toward the tail — above the median in right skew and below it in left skew.

The mean is the mathematical average and is sensitive to every value in the dataset — including extreme values in the tail. In a right-skewed distribution, the long tail pulls the mean up above the median; in a left-skewed distribution, the tail pulls the mean down below the median. The mode sits at the peak regardless, so the correct order for right skew is mode < median < mean, and for left skew it's mean < median < mode.
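A quick numerical check can make the ordering concrete. The sketch below uses Python's standard `statistics` module on two small made-up samples (the values and the helper name `central_tendency` are illustrative assumptions, not data from any study): one with a long right tail, and its mirror image with a long left tail.

```python
import statistics

def central_tendency(values):
    """Return (mode, median, mean) for a list of numbers."""
    return (
        statistics.mode(values),
        statistics.median(values),
        statistics.mean(values),
    )

# Hypothetical right-skewed sample: values cluster low, one sits far out in the upper tail.
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 21]

# Mirror image of the same sample (24 - x), giving a long lower tail instead.
left_skewed = [24 - x for x in right_skewed]

print(central_tendency(right_skewed))  # mode 3 < median 4 < mean 6
print(central_tendency(left_skewed))   # mode 21 > median 20 > mean 18
```

In both samples the mode stays at the peak while the mean moves furthest toward the tail, with the median in between.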

Common mistake

Gap: Does not recall the 68-95-99.7 rule for the normal distribution

In a normal distribution, 68% of values fall within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean.

The 68-95-99.7 rule is a fixed property of any normal distribution: about 68% of observations fall within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD. USMLE Step 1 uses this rule to set reference ranges and to calculate the probability that a value is abnormal. Memorize the three numbers in order, and note that each additional SD captures progressively less new data: about 27% between 1 and 2 SD, and under 5% between 2 and 3 SD.
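The three percentages can be reproduced from the normal distribution itself. The sketch below is one way to do that with Python's standard library, using the identity that the fraction within ±k SD equals erf(k/√2). The function names `coverage_within` and `reference_range` and the lab values (mean 100, SD 10) are hypothetical, chosen only for illustration.

```python
import math

def coverage_within(k_sd):
    """Fraction of a normal distribution lying within ±k_sd standard deviations of the mean."""
    return math.erf(k_sd / math.sqrt(2))

def reference_range(mean, sd, k_sd=2):
    """Interval mean ± k_sd * sd; with k_sd=2 it covers roughly 95% of a normal population."""
    return (mean - k_sd * sd, mean + k_sd * sd)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {coverage_within(k) * 100:.1f}%")
# within ±1 SD: 68.3%
# within ±2 SD: 95.4%
# within ±3 SD: 99.7%

# Hypothetical lab value: mean 100, SD 10 (made-up numbers for illustration).
low, high = reference_range(100, 10)
print(low, high)  # 80 120
```

The exact figures are slightly off the rounded rule (95.4% rather than 95% at 2 SD), which is why the rule is treated as an approximation. Because the curve is symmetric, the roughly 5% outside a ±2 SD reference range splits evenly, about 2.5% above and 2.5% below.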

Common mistake

Wrong: Students use the mean to summarize skewed data because it is more familiar.

Right: The median is preferred for skewed distributions because it is resistant to outliers; the mean is appropriate for normally distributed data.

The mean incorporates every data point, which makes it powerful for symmetric data but misleading when outliers or skew are present — a few extremely high values can drag the mean far from where most of the data sits. The median, by contrast, is defined by rank order rather than magnitude, so outliers don't move it much. Whenever you see skewed data (income, hospital length of stay, lab values with outliers), the median gives a more honest picture of the 'typical' value.
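To see how little the median moves compared with the mean, consider a small made-up set of hospital lengths of stay in days (hypothetical numbers, not from any study), before and after adding a single very long stay. This is only a sketch using Python's standard `statistics` module.

```python
import statistics

# Hypothetical lengths of stay in days for seven patients.
typical_stays = [2, 3, 3, 4, 4, 5, 6]

# Same patients plus one extreme 60-day stay (the outlier).
with_outlier = typical_stays + [60]

for label, stays in (("without outlier", typical_stays), ("with outlier", with_outlier)):
    mean = statistics.mean(stays)
    median = statistics.median(stays)
    print(f"{label}: mean = {mean:.2f} days, median = {median:.1f} days")
# without outlier: mean = 3.86 days, median = 4.0 days
# with outlier: mean = 10.88 days, median = 4.0 days
```

One extreme value nearly triples the mean, while the median stays at 4 days, which is still where most patients sit.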

What the exam tests

Know the properties of the normal distribution: it is symmetric, bell-shaped, and described entirely by its mean and standard deviation — and know that 68% of values fall within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean.
Given a skewed distribution (or a graph), correctly order mean, median, and mode — specifically, that the mean is pulled toward the tail: above the median in right (positive) skew and below the median in left (negative) skew.
Determine whether mean or median is the more appropriate summary statistic for a given dataset — median is preferred when data are skewed or contain outliers; mean is appropriate when data are normally distributed.

Can you avoid these mistakes?

A dataset of household incomes in a city is strongly right-skewed due to a small number of very high earners. Which measure of central tendency — mean or median — should a public health researcher report to best represent the typical household income, and why?

In a normally distributed population, the mean serum sodium is 140 mEq/L with a standard deviation of 3 mEq/L. What percentage of the population has a serum sodium between 134 and 146 mEq/L?

A frequency histogram shows a long left tail with most values clustered to the right. How would you rank the mean, median, and mode from lowest to highest, and what type of skew is this?

A clinical study reports that a new drug reduces hospital length of stay. The data are heavily skewed by a few very long stays. If the investigators report only the mean, how might this mislead readers — and what should they report instead?
