Scatterplot questions show up on nearly every test, and the good news is they almost always reward the same handful of moves: plug into a line, read a slope, or judge a correlation. Master those and you bank easy points.
A line of best fit y = 8.5x + 42: plug in x = 6 hours to predict a score of 93.
A negative-slope trend line: y falls 2.3 for each +1 in x (a strong negative correlation).
Quick check
Check your understanding with a question from this topic:
A researcher collected data on the number of hours students studied and their exam scores. The line of best fit for the scatterplot is y = 8.5x + 42, where x is hours studied and y is the predicted exam score. What is the predicted exam score for a student who studied for 6 hours?
Worked examples
Example 1
Example 2
Example 3
Common pitfalls
Mixing up slope and intercept
Forgetting the units (thousands, percent)
Claiming correlation proves causation
Swapping x and y in interpretations
Key takeaways
Watch & learn
Curated Khan Academy walkthroughs on Two-Variable Data: Models and Scatterplots. They're complementary to this lesson — watch one if a written explanation isn't clicking, or after to reinforce.
Try it yourself
5 practice questions on Two-Variable Data: Models and Scatterplots, drawn from the question bank. The tutor is one click away if you get stuck.
Two-variable data means each thing you measure gives you two numbers — like a student's study hours AND their exam score. We plot each person as a single dot at position (x, y). The whole picture of dots is a scatterplot.
When the dots roughly trend in a straight direction, we can draw a line of best fit (also called a trend line or regression line) — the single straight line that comes closest to all the dots. The test gives you this line as an equation, usually in the form:
y = mx + b
m is the slope — how much y changes when x goes up by 1.
b is the y-intercept — the predicted y when x = 0.
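For readers curious where m and b come from, here is a minimal Python sketch. It assumes "comes closest" means ordinary least squares, which is the usual convention but is not stated in this lesson. The five data points and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the least-squares line passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data: hours studied (x) and exam score (y) for five students.
hours = [0, 1, 2, 3, 4]
scores = [42, 51, 59, 68, 75]

m, b = fit_line(hours, scores)
print(f"y = {m:.1f}x + {b:.1f}")  # y = 8.3x + 42.4
```

On the test you will not need to do this fit yourself; the equation is given. The sketch only shows that m and b are computed from the dots, not chosen by eye.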
Three question types cover almost everything:
1. Predict a value. They give you the line and an x. Just substitute x and compute y. (Or they give you a y and ask for x; then solve the equation for x.)
2. Interpret slope or intercept in context. The slope tells you the rate of change with units. If y is in thousands of dollars and x is years, a slope of -2.3 means the value drops 2.3 units of y (that is, $2,300) for each extra year. The intercept is the predicted value when x = 0, the starting point.
3. Describe the correlation. The correlation coefficient r is a number between -1 and 1 that measures how tightly the dots hug a straight line.
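Direction: a positive r means y tends to rise as x rises; a negative r means y tends to fall as x rises.
Strength: the closer |r| is to 1, the stronger (tighter) the relationship. Near 0 = weak/no linear relationship.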
r = 0.91 is strong positive; r = -0.4 is weak-to-moderate negative.
Key warning: correlation does not prove causation. A strong r means the variables move together, not that one causes the other.
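To make r concrete, the sketch below computes the Pearson correlation coefficient by hand for two small data sets. The helper name `pearson_r` and all of the data are illustrative assumptions, not part of the lesson.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets sharing the same x values.
x = [0, 1, 2, 3, 4]
tight_rising = [42, 51, 59, 68, 75]  # dots hug a rising line
loose_falling = [7, 3, 6, 2, 5]      # dots drift downward with lots of scatter

r_tight = pearson_r(x, tight_rising)
r_loose = pearson_r(x, loose_falling)
print(f"{r_tight:.3f}")  # 0.999  -> strong positive
print(f"{r_loose:.3f}")  # -0.381 -> weak-to-moderate negative

# Note: neither value says anything about whether x causes y.
```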
One more habit: always attach units and read whether y is in thousands, percent, etc. The test loves to hide a factor of 1,000 in the answer choices.
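As a small illustration of the units habit, the sketch below converts a slope reported in thousands of dollars into plain dollars. The variable names are hypothetical, and the scenario (y in thousands of dollars, x in years, slope of -2.3) is assumed for the sake of the example.

```python
# Slope read off the equation, where y is in thousands of dollars
# and x is in years (an assumed scenario for illustration).
slope_in_thousands = -2.3
DOLLARS_PER_UNIT_OF_Y = 1_000  # because the y-axis is labeled "thousands"

slope_in_dollars = slope_in_thousands * DOLLARS_PER_UNIT_OF_Y
print(f"Change per extra year: {slope_in_dollars:,.0f} dollars")  # -2,300 dollars
```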
A line of best fit models the relationship between the number of practice tests a student takes (x) and their score (y), given by y = 12x + 480. What is the predicted score for a student who takes 5 practice tests?
A scatterplot relates the temperature (in °C) and the number of cups of hot cocoa sold per hour at a cafe. The line of best fit is y = -1.4x + 35. Which statement best interprets the slope in context?
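A quick numeric check can help with slope-interpretation questions: raise x by exactly 1 and see how the prediction moves. The sketch below does that for the cocoa line. The function name `cups_sold` and the sample temperatures 10 and 11 are arbitrary choices for illustration.

```python
def cups_sold(temp_c):
    """Predicted cups of hot cocoa per hour from y = -1.4x + 35."""
    return -1.4 * temp_c + 35


# Compare predictions one degree apart; the difference is the slope.
change = cups_sold(11) - cups_sold(10)
print(round(change, 1))  # -1.4: about 1.4 fewer cups per hour for each +1 °C
```

Any pair of temperatures one degree apart gives the same difference, which is why the slope can be read as a per-degree rate.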
Researchers measured daily screen time (hours) and reported sleep quality (1–10 scale) for 200 people. The correlation coefficient was r = -0.78. Which statement is best supported?
The slope is the per-unit rate of change; the intercept is the value when x = 0. If a choice describes 'when x is 0...', it's about b, not the slope. Match the question to the right number.
If y is measured in thousands of dollars, a slope of -2.3 means $2,300, not $2.30. Always read what the axis units are before picking an answer.
Even a strong r (like 0.9) only shows the variables move together. Any answer saying one variable causes the other from observational data is wrong on these questions.
Slope describes 'change in y per +1 in x.' Answers that flip the roles ('per additional cup, temperature drops...') are traps — confirm which variable is x.
Line of best fit is y = mx + b: plug in x to predict y, or solve for x given y.
Slope = change in y for each +1 in x (watch the sign AND the units); intercept = predicted y when x = 0.
Correlation coefficient r ranges from -1 to 1: sign = direction, |r| near 1 = strong, near 0 = weak.
Correlation never proves causation, especially in observational data.
Always check axis units — a hidden 'thousands' or 'percent' changes the answer.
Two-Variable Data: Models and Scatterplots — Learn | UnlimitedTests