Common misconceptions

Common mistake
Wrong: Applying a chi-square test to a continuous outcome when comparing two groups.
Right: Chi-square tests are for categorical (nominal) data; a continuous outcome compared between two groups requires a t-test, and three or more groups require ANOVA.
Chi-square tests assess whether two categorical variables are associated. They work on counts in a contingency table, not on means. If the outcome is a measured number (blood pressure, weight, enzyme level), you are comparing means, which calls for a t-test with two groups or ANOVA with three or more. Applying chi-square to continuous data is a category error, not a minor slip: the statistic is built from observed and expected cell counts, and a continuous measurement has no cell counts unless you first bin it into categories, which throws information away.
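To see why chi-square belongs to count data, it can help to compute the statistic by hand. The sketch below is illustrative only: the function name `chi_square_statistic` and the counts are made up for this example, and it returns the plain Pearson statistic without a continuity correction or a p-value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = two treatment groups,
# columns = improved / not improved
counts = [[30, 10],
          [20, 20]]

chi2 = chi_square_statistic(counts)
print(round(chi2, 3))  # 5.333
```

Every input is a count of patients in a cell. A list of blood pressure readings has no such cells, which is why the test has nothing to operate on there.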
Common mistake
Wrong: Parametric tests are always preferred because they are more powerful.
Right: Parametric tests assume normally distributed data; when that assumption cannot be relied on (small samples, heavily skewed data), the non-parametric equivalent should be used despite its lower power.
Parametric tests are generally more powerful, but that advantage only holds when the data actually fit the test's assumptions. With a small sample or visibly skewed data, the normality assumption cannot be relied on, and a parametric test can produce unreliable p-values. The right move is to switch to the non-parametric equivalent, such as Mann-Whitney U in place of the t-test or Kruskal-Wallis in place of ANOVA. You sacrifice some power, but you get a result that is trustworthy for the data you have.
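A rough way to see why rank-based tests cope with skewed data: they only ask which value in each pair is larger, so one extreme value cannot dominate the result. The sketch below computes the Mann-Whitney U statistics by direct pairwise comparison. The function name `mann_whitney_u` and the length-of-stay values are invented for illustration, and no p-value is calculated.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) by comparing every value in a with every value in b.

    Each pair scores 1 for the group holding the larger value; ties score 0.5 each.
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical lengths of stay in days; group_a has one long outlier
group_a = [2, 3, 4, 5, 40]
group_b = [6, 7, 8, 9, 10]
extreme_a = [2, 3, 4, 5, 4000]  # same ranks, far more extreme outlier

print(mann_whitney_u(group_a, group_b))    # (5.0, 20.0)
print(mann_whitney_u(extreme_a, group_b))  # (5.0, 20.0)

# The mean, which a t-test relies on, moves dramatically
print(sum(group_a) / len(group_a))      # 10.8
print(sum(extreme_a) / len(extreme_a))  # 802.8
```

The U statistics are identical in both cases because the ordering of the values did not change, while the group mean is pulled far away by the single outlier.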
Common mistake
Wrong: Logistic regression produces a mean difference as its output, like linear regression.
Right: Linear regression models a continuous outcome and reports a coefficient (slope); logistic regression models a binary outcome and reports an odds ratio.
Linear regression models a continuous outcome and reports a coefficient that tells you how much the outcome changes per unit change in the predictor. It is a slope on a number line. Logistic regression models a binary outcome (disease yes/no, death yes/no) and reports an odds ratio: the odds of the outcome in one group divided by the odds in the other, where odds are p / (1 - p). That is a ratio of odds, not a ratio of probabilities and not a mean difference. Confusing the two means you would misinterpret the entire result of a regression study on the exam.
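A small worked example can make the odds ratio concrete. The sketch below uses made-up counts and hypothetical helper names (`odds_ratio`, `risk_ratio`). It relies on the fact that, for a model with a single binary predictor, the logistic regression coefficient is the natural log of the odds ratio in the 2x2 table, so exponentiating the coefficient recovers the odds ratio.

```python
import math


def odds_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    return (cases_exp / noncases_exp) / (cases_unexp / noncases_unexp)


def risk_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Probability of the outcome in the exposed group divided by that in the unexposed group."""
    risk_exp = cases_exp / (cases_exp + noncases_exp)
    risk_unexp = cases_unexp / (cases_unexp + noncases_unexp)
    return risk_exp / risk_unexp


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease
table = (30, 70, 10, 90)

or_value = odds_ratio(*table)
rr_value = risk_ratio(*table)
beta = math.log(or_value)  # coefficient a one-predictor logistic model would report

print(round(or_value, 2))        # 3.86
print(round(rr_value, 2))        # 3.0
print(round(beta, 2))            # 1.35
print(round(math.exp(beta), 2))  # 3.86
```

The odds ratio (3.86) and the ratio of probabilities (3.0) differ for the same data, and neither is a difference in means.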

What the exam tests

  1. Given a study outcome type (categorical vs. continuous) and number of comparison groups, select the correct statistical test from: t-test, paired t-test, ANOVA, chi-square, or Fisher's exact test.
  2. Recognize when the assumptions for a parametric test (normality) are violated and identify the appropriate non-parametric substitute (e.g., Mann-Whitney U, Kruskal-Wallis, Spearman correlation).
  3. Distinguish between linear and logistic regression by their input requirements and outputs — linear regression predicts a continuous value and reports a coefficient/slope; logistic regression handles a binary outcome and reports an odds ratio.
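The selection logic in the first two points can be sketched as a small decision function. This is a simplified study aid under stated assumptions, not a complete rule set: the name `choose_test` and its parameters are hypothetical, the "small expected count" flag stands in for the common rule of thumb of an expected cell count below 5, and the Wilcoxon signed-rank test is included as the standard non-parametric counterpart of the paired t-test even though the list above does not name it. Paired designs with three or more groups are not covered.

```python
def choose_test(outcome, groups, paired=False, normal=True, small_expected_count=False):
    """Suggest a test from outcome type, number of groups, and assumption checks.

    outcome: "categorical" or "continuous"
    groups: number of comparison groups (2 or more)
    """
    if outcome == "categorical":
        # Fisher's exact test replaces chi-square when expected cell counts are small
        return "Fisher's exact test" if small_expected_count else "chi-square test"

    if outcome != "continuous":
        raise ValueError("outcome must be 'categorical' or 'continuous'")

    if groups == 2:
        if paired:
            return "paired t-test" if normal else "Wilcoxon signed-rank test"
        return "t-test" if normal else "Mann-Whitney U test"

    if groups >= 3:
        return "ANOVA" if normal else "Kruskal-Wallis test"

    raise ValueError("groups must be 2 or more")


print(choose_test("continuous", 2, paired=True))   # paired t-test
print(choose_test("continuous", 4, normal=False))  # Kruskal-Wallis test
print(choose_test("categorical", 2))               # chi-square test
```

Working through the function forces the two questions the exam expects you to ask in order: what type of outcome is it, and how many groups are being compared?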

Can you avoid these mistakes?

A researcher measures systolic blood pressure in three diet groups (low-carb, low-fat, Mediterranean). Which statistical test should be used to compare the means across all three groups?
A study compares rates of surgical site infection (infected vs. not infected) between two hospitals. The expected count in one cell of the table is only 3. Which test is more appropriate, chi-square or Fisher's exact, and why?
A dataset of 15 patients shows a heavily right-skewed distribution of hospital length of stay. A colleague wants to use a t-test to compare two treatment groups. What is wrong with this approach, and what test should be used instead?
A logistic regression study reports an output of 2.4 for the association between smoking and developing lung cancer. What does this number represent, and how does it differ from what a linear regression would have reported?
