Statistics
SAT statistics questions are mostly about *interpretation* — knowing what mean, median, range, and standard deviation tell you, and which one a particular question is really asking for.
The four core summary statistics:
Mean (average): sum the values, divide by how many. Sensitive to outliers.
Median: the middle value when sorted. With an even count, average the middle two. Resistant to outliers.
Mode: the most frequent value. Useful for categorical data.
Range: max minus min. A simple measure of spread.
| Statistic | How to compute | Sensitive to outliers? | Use when |
|---|---|---|---|
| Mean | Sum / count | Yes (heavily) | Data is roughly symmetric |
| Median | Middle value when sorted | No | Data has outliers or is skewed |
| Mode | Most frequent value | No | Data is categorical |
| Range | Max − min | Yes (uses extremes) | Quick rough spread |
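As a rough illustration of the table above, the sketch below computes all four statistics with Python's standard `statistics` module, then replaces the largest value with an outlier to show which statistics move. The data and the helper name `summarize` are made up for illustration.

```python
from statistics import mean, median, mode

def summarize(values):
    """Return the four summary statistics for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "range": max(values) - min(values),
    }

typical = summarize([3, 5, 5, 8, 14])        # hypothetical data
with_outlier = summarize([3, 5, 5, 8, 140])  # same data, largest value replaced

print(typical)       # mean 7, median 5, mode 5, range 11
print(with_outlier)  # mean 32.2, median 5, mode 5, range 137
```

Only the mean and the range change when the outlier appears; the median and the mode stay at 5.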
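Standard deviation (SD) measures how far the values typically fall from the mean: the more spread out the data, the larger the SD. SAT problems rarely ask you to compute SD; instead they ask you to compare two datasets or predict how a change to the data affects it: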
- "Which dataset has the larger SD?" → the one whose values are spread further from the mean.
- "What happens to SD if you add the same number to every value?" → SD doesn't change (every value and the mean shift together, so the distances from the mean stay the same).
- "What happens to SD if you multiply every value by 2?" → SD doubles (spread doubles).
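The shift and scale rules above can be checked numerically. This is a minimal sketch on made-up data. It uses `statistics.pstdev` (population SD) as an assumption; the same pattern holds for the sample SD, `statistics.stdev`.

```python
from statistics import pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data; mean is 5
shifted = [v + 10 for v in values]     # add the same number to every value
doubled = [v * 2 for v in values]      # multiply every value by 2

sd_original = pstdev(values)
sd_shifted = pstdev(shifted)
sd_doubled = pstdev(doubled)

print(sd_original)  # 2.0
print(sd_shifted)   # 2.0 (unchanged by the shift)
print(sd_doubled)   # 4.0 (doubled along with the data)
```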
Mean vs median trick. If a dataset is skewed by outliers, the mean shifts toward them while the median barely moves.
"Which is greater, the mean or the median, for the salaries: $30k, $32k, $33k, $35k, $200k?" → the mean is $330k / 5 = $66k, dragged up by the $200k outlier; the median is $33k. Mean > median.
Generally:
- Right-skewed (long tail on the right) → mean > median
- Left-skewed (long tail on the left) → mean < median
- Symmetric → mean ≈ median
| Shape | Visual | Relationship |
|---|---|---|
| Symmetric | Left and right halves mirror each other (e.g. bell-shaped) | Mean ≈ median (≈ mode when there is a single peak) |
| Right-skewed | Long tail to the right (high outliers) | Mean > median (mean pulled toward tail) |
| Left-skewed | Long tail to the left (low outliers) | Mean < median (mean pulled toward tail) |
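To see the pull of a single outlier, the sketch below reruns the salary question from this section with and without the $200k value. The variable names are illustrative only.

```python
from statistics import mean, median

salaries = [30_000, 32_000, 33_000, 35_000, 200_000]

salary_mean = mean(salaries)      # 330,000 / 5
salary_median = median(salaries)  # middle of the five sorted values

without_outlier = salaries[:-1]   # drop the $200k salary
trimmed_mean = mean(without_outlier)
trimmed_median = median(without_outlier)

print(salary_mean, salary_median)    # 66000 33000
print(trimmed_mean, trimmed_median)  # 32500 32500.0
```

Removing one value halves the mean, while the median moves by only $500.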
Sampling and inference. When the SAT shows a study and asks if conclusions can be generalized:
- Random sample from the target population → results can be generalized to that population.
- Random assignment to treatment/control → differences between the groups can be attributed to the treatment (causation).
- Both → can claim cause-and-effect about the population.
- Neither → can only describe an association within the sample; no generalizing and no causal claim.
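The four cases above amount to a two-question checklist. One way to sketch it is as a small function; the name `supported_conclusion` and the wording of the returned strings are made up for illustration, not SAT terminology.

```python
def supported_conclusion(random_sample: bool, random_assignment: bool) -> str:
    """Return the strongest conclusion a study design supports."""
    if random_sample and random_assignment:
        return "cause-and-effect about the population"
    if random_sample:
        return "generalize to the population, no causal claim"
    if random_assignment:
        return "cause-and-effect for the participants only"
    return "describe the sample only"

# A survey of randomly chosen residents, with no treatment assigned:
print(supported_conclusion(random_sample=True, random_assignment=False))
```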
Read the question carefully — is it asking about mean, median, SD, or generalizability? Each demands a different approach.
What is the mean of: 70, 80, 90, 80?
Worked examples
A teacher records test scores: 72, 75, 78, 80, 82, 85, 92.
Which is greater: the mean or the median?
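One way to check this by hand is to follow the definitions directly: add the scores and divide by the count for the mean, then sort and pick the middle value for the median. The sketch below mirrors those steps; the middle-index shortcut assumes an odd number of scores, as here.

```python
scores = [72, 75, 78, 80, 82, 85, 92]

ordered = sorted(scores)
n = len(ordered)

score_mean = sum(ordered) / n   # 564 / 7
score_median = ordered[n // 2]  # odd count, so the single middle value (4th of 7)

print(round(score_mean, 2), score_median)  # 80.57 80
```

The mean comes out slightly above the median because the top score of 92 sits farther from the middle than the bottom score of 72.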
A researcher randomly selects 200 students from a single high school's biology class to test a new study technique. Half use the technique; half don't. The technique group scores significantly higher.
Which conclusion is most appropriate?
Common pitfalls
Mean is the average (sum / count). Median is the middle value. They're often different — especially when outliers exist. Read the question carefully.
Adding a constant to every value DOESN'T change SD (it shifts the dataset, not its spread). Multiplying every value by a constant DOES scale SD by the absolute value of that constant, so multiplying by −3 triples the SD. Mixing these up costs easy points.
If a study sampled from one school, conclusions only apply to that school. Don't extend findings to populations not represented in the sample. The SAT often includes trap answers that generalize too broadly.
If group A averages 80 and group B averages 70, the combined mean is NOT 75 unless the groups are the same size. Total = (sum of A) + (sum of B); combined mean = total / total count.
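A minimal sketch of the combined-mean calculation follows. The group sizes (30 and 10, then 20 and 20) are assumptions chosen for illustration, since the text gives only the two averages.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Combine two group means by rebuilding each group's total first."""
    total = mean_a * n_a + mean_b * n_b  # (sum of A) + (sum of B)
    return total / (n_a + n_b)           # total / total count

unequal = combined_mean(80, 30, 70, 10)  # group A is three times larger
equal = combined_mean(80, 20, 70, 20)    # same-size groups

print(unequal)  # 77.5
print(equal)    # 75.0
```

The combined mean lands closer to the larger group's average, and only matches the simple midpoint of 75 when the groups are the same size.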
Key takeaways
Mean = sum / count. Median = middle value when sorted. Mode = most frequent. Range = max − min.
Outliers pull the mean toward them but barely move the median. Skewed-right → mean > median; skewed-left → mean < median.
Adding a constant to every value doesn't change SD. Multiplying every value by a constant scales SD by the absolute value of that constant.
Random sampling → can generalize. Random assignment → can claim causation. Need both for cause-and-effect across a population.
When two datasets are compared, larger SD = more spread out, not necessarily a larger mean.