Reliability vs Validity
MCAT trap: assuming that reliability guarantees validity. A measure can be highly reliable (consistent) yet invalid if it consistently measures the wrong construct; reliability is necessary but not sufficient for validity.
Reliability and validity are two of the most commonly tested research methods concepts on the MCAT, and they're also two of the most commonly confused. Reliability means consistency — a ruler that gives the same reading every time you measure the same object is reliable. Validity means accuracy — the measurement actually captures what you claim it captures. The classic analogy is a dartboard: throwing all your darts in a tight cluster in the wrong corner is reliable but not valid. Throwing them all over the board is neither. The key insight the MCAT hammers on: reliability is necessary but not sufficient for validity. You can't have a valid measure that's inconsistent, but you can absolutely have a consistent measure that's wrong.
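The dartboard picture can be put into numbers. The sketch below is only an illustration: the three scales, their readings, and the 70 kg true weight are made-up values, and `summarize` is a hypothetical helper. Spread across repeated readings stands in for (un)reliability, and average distance from the true value stands in for (in)validity.

```python
import statistics

TRUE_WEIGHT = 70.0  # hypothetical true value in kg (an assumption for illustration)

def summarize(readings, true_value):
    """Return (spread, bias) for repeated readings of the same object.

    spread: population standard deviation; small spread means consistent (reliable).
    bias:   mean reading minus true value; near zero means centered on the truth.
    """
    spread = statistics.pstdev(readings)
    bias = statistics.mean(readings) - true_value
    return spread, bias

# Made-up readings from three hypothetical scales weighing the same 70 kg object.
scale_a = [75.1, 74.9, 75.0, 75.2, 74.8]  # tight cluster, wrong corner
scale_b = [62.0, 78.0, 70.0, 81.0, 59.0]  # darts all over the board
scale_c = [70.1, 69.9, 70.0, 70.2, 69.8]  # tight cluster on the bullseye

spread_a, bias_a = summarize(scale_a, TRUE_WEIGHT)
spread_b, bias_b = summarize(scale_b, TRUE_WEIGHT)
spread_c, bias_c = summarize(scale_c, TRUE_WEIGHT)

for name, spread, bias in [("A", spread_a, bias_a),
                           ("B", spread_b, bias_b),
                           ("C", spread_c, bias_c)]:
    print(f"scale {name}: spread={spread:.2f} kg, bias={bias:+.2f} kg")
```

Scale A is reliable but not valid (consistent, but about 5 kg off). Scale C is both. Scale B happens to average out near the true weight, yet any single reading can miss by several kilograms, which is why an inconsistent instrument cannot be trusted as a valid one.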
The exam tests this in three main ways. First, straight recall — know the definitions cold and know the subtypes. Second, mechanism questions that ask you to distinguish between types of reliability (test-retest vs. inter-rater vs. internal consistency) or types of validity (content, construct, criterion, face, internal, external). Third — and this is where most students lose points — passage-based critique questions where you read about a study's measurement instrument and have to identify what type of reliability or validity problem it has. These require you to apply the conceptual distinctions, not just recall them.
The tricky part is that the subtypes all sound superficially similar. Students routinely swap construct validity for criterion validity because both sound like they're about 'what the test measures.' And the test-retest vs. inter-rater distinction trips people up because both sound like they're about 'consistency.' Get these distinctions sharp before test day — the MCAT will present a scenario and expect you to categorize it correctly without the labels being handed to you.
What the exam tests
- Know the core definitions: reliability means a measurement produces consistent, repeatable results; validity means it actually measures what it's supposed to measure — and understand that a reliable instrument is not automatically a valid one.
- Distinguish between the three types of reliability: test-retest (same measure, different time points), inter-rater (same measure, different observers), and internal consistency (different items on the same instrument measuring the same construct).
- Distinguish between the main types of validity: content validity (covers all aspects of the construct), construct validity (measures the intended theoretical construct), criterion validity (correlates with an external criterion — either concurrently or predictively), and face validity (appears to measure what it claims on the surface).
- Read a passage describing a study's measurement tool and identify the specific reliability or validity issue at stake: for example, recognizing that a questionnaire with inconsistent responses across administrations has a test-retest reliability problem, or that a test correlating with an external gold standard is demonstrating criterion validity.
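The three reliability types in the list above are usually summarized with different statistics. The MCAT does not ask you to compute them, but seeing what data each one needs can make the distinctions stick. This is a minimal sketch using statistics commonly paired with each type (Pearson's r for test-retest, Cohen's kappa for inter-rater, Cronbach's alpha for internal consistency); all scores and ratings are made-up illustration data.

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cohens_kappa(rater1, rater2):
    """Agreement between two raters' category labels, corrected for chance."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    labels = set(rater1) | set(rater2)
    expected = sum((rater1.count(l) / n) * (rater2.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, same respondents in each."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Test-retest: same people, same questionnaire, two time points.
week1 = [10, 14, 18, 22, 26]
week2 = [11, 13, 19, 21, 27]
test_retest = pearson_r(week1, week2)

# Inter-rater: two observers coding the same eight behaviors.
rater1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater2 = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
inter_rater = cohens_kappa(rater1, rater2)

# Internal consistency: three items meant to tap the same construct.
items = [[3, 4, 5, 2, 1], [3, 5, 4, 2, 1], [4, 4, 5, 1, 2]]
internal = cronbach_alpha(items)

print(f"test-retest r = {test_retest:.2f}")
print(f"inter-rater kappa = {inter_rater:.2f}")
print(f"Cronbach's alpha = {internal:.2f}")
```

All three numbers come out high here, and all three speak only to reliability. None of them shows that the questionnaire measures the intended construct. Evidence for criterion validity would require comparing the scores against an external standard, for instance by correlating them with a gold-standard measure.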