Math

Two-Variable Data: Models and Scatterplots


Scatterplot questions show up on nearly every test, and the good news is they almost always reward the same handful of moves: plug into a line, read a slope, or judge a correlation. Master those and you bank easy points.


Two-variable data means each thing you measure gives you two numbers — like a student's study hours AND their exam score. We plot each person as a single dot at position (x, y). The whole picture of dots is a scatterplot.

A line of best fit y = 8.5x + 42: plug in x = 6 hours to predict a score of 93.

When the dots roughly trend in a straight direction, we can draw a line of best fit (also called a trend line or regression line) — the single straight line that comes closest to all the dots. The test gives you this line as an equation, usually in the form:

y = mx + b

m is the slope — how much y changes when x goes up by 1.
b is the y-intercept — the predicted y when x = 0.
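
As a minimal sketch of how the equation gets used, the snippet below plugs an x into the line y = 8.5x + 42 from this lesson, then runs the same line backwards to recover x from a y. The function names predict_y and solve_for_x are made up for illustration and are not part of any library.

```python
def predict_y(m, b, x):
    """Predicted y on the line y = m*x + b."""
    return m * x + b


def solve_for_x(m, b, y):
    """x value at which the line y = m*x + b reaches the given y (needs m != 0)."""
    if m == 0:
        raise ValueError("a horizontal line cannot be solved for x")
    return (y - b) / m


# Line from this lesson: x = hours studied, y = predicted exam score.
m, b = 8.5, 42

score = predict_y(m, b, 6)         # 8.5 * 6 + 42
hours = solve_for_x(m, b, score)   # run the same line backwards
print(score, hours)
```

The second function is just the algebra step y - b = mx, then divide by m, which is what a "given y, find x" question asks for.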

Three question types cover almost everything:

1. Predict a value. They give you the line and an x. Just substitute x and compute y. (Or give you a y and ask for x — then solve the equation.)

2. Interpret slope or intercept in context. The slope tells you the rate of change, with units: the change in y for each 1-unit increase in x. If y is in thousands of dollars and x is in years, a slope of -2.3 means the predicted value drops by 2.3 thousand dollars ($2,300) for each additional year. The intercept is the predicted value when x = 0, the starting point.

3. Describe the correlation. The correlation coefficient r is a number between -1 and 1 that measures how tightly the dots hug a straight line.

A negative-slope trend line: y falls 2.3 for each +1 in x (a strong negative correlation).

Sign: positive r = dots rise left-to-right; negative r = dots fall.
Strength: the closer |r| is to 1, the stronger (tighter) the relationship. Near 0 = weak/no linear relationship.
r = 0.91 is strong positive; r = -0.4 is weak-to-moderate negative.
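
The test never asks you to compute r by hand, but seeing where the number comes from can make the sign and strength rules feel less arbitrary. Here is a rough sketch of the standard Pearson formula. The data points are invented for illustration, and the helper names pearson_r and direction are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / math.sqrt(sxx * syy)


def direction(r):
    """Sign of r: which way the dots trend from left to right."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Made-up data for illustration only: hours studied and exam scores.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 70, 74, 86, 92]

r = pearson_r(hours, scores)
print(round(r, 2), direction(r))
```

Because these invented dots sit very close to a rising straight line, r comes out close to 1. Scatter them more loosely and |r| shrinks toward 0.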

Key warning: correlation does not prove causation. A strong r means the variables move together, not that one causes the other.

One more habit: always attach units and read whether y is in thousands, percent, etc. The test loves to hide a factor of 1,000 in the answer choices.
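
To make the hidden factor concrete, here is a tiny sketch that converts a slope stated in scaled units back to plain units. It assumes the lesson's example of a slope of -2.3 with y in thousands of dollars. The function name slope_in_base_units is hypothetical.

```python
def slope_in_base_units(slope, unit_scale):
    """Convert a slope stated in scaled units (e.g. thousands) to base units."""
    return slope * unit_scale


# Slope from this lesson: -2.3, with y in thousands of dollars and x in years.
change_per_year = slope_in_base_units(-2.3, 1_000)
print(f"{change_per_year:,.0f} dollars per year")
```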

Quick check

Check your understanding with a question from this topic:

A researcher collected data on the number of hours students studied and their exam scores. The line of best fit for the scatterplot is y = 8.5x + 42, where x is hours studied and y is the predicted exam score. What is the predicted exam score for a student who studied for 6 hours?

Worked examples

Example 1

A line of best fit models the relationship between the number of practice tests a student takes (x) and their score (y), given by y = 12x + 480. What is the predicted score for a student who takes 5 practice tests?

Example 2

A scatterplot relates the temperature (in °C) and the number of cups of hot cocoa sold per hour at a cafe. The line of best fit is y = -1.4x + 35. Which statement best interprets the slope in context?

Example 3

Researchers measured daily screen time (hours) and reported sleep quality (1–10 scale) for 200 people. The correlation coefficient was r = -0.78. Which statement is best supported?

Common pitfalls

Mixing up slope and intercept

The slope is the per-unit rate of change; the intercept is the value when x = 0. If a choice describes 'when x is 0...', it's about b, not the slope. Match the question to the right number.
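
One way to keep the two numbers apart is to read each one off the line directly. The sketch below uses the cocoa line from Example 2 (y = -1.4x + 35). The helper name line is made up for illustration.

```python
def line(m, b):
    """Return the function x -> m*x + b."""
    return lambda x: m * x + b


# Cocoa line from Example 2: x = temperature (°C), y = cups sold per hour.
cups = line(-1.4, 35)

intercept = cups(0)            # predicted y when x = 0
slope = cups(11) - cups(10)    # change in y for a +1 step in x
print(intercept, round(slope, 2))
```

The intercept comes from a single input (x = 0), while the slope comes from comparing two inputs that are 1 apart. Any two temperatures 1 degree apart would give the same difference.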

Forgetting the units (thousands, percent)

If y is measured in thousands of dollars, a slope of -2.3 means a drop of $2,300 for each 1-unit increase in x, not $2.30. Always read the axis units before picking an answer.

Claiming correlation proves causation

Even a strong r (like 0.9) only shows that the variables move together. Any answer choice that claims one variable causes the other based only on observational data is wrong on these questions.

Swapping x and y in interpretations

Slope describes 'change in y per +1 in x.' Answers that flip the roles ('per additional cup, temperature drops...') are traps — confirm which variable is x.

Key takeaways

Line of best fit is y = mx + b: plug in x to predict y, or solve for x given y.
Slope = change in y for each +1 in x (watch the sign AND the units); intercept = predicted y when x = 0.
Correlation coefficient r ranges from -1 to 1: sign = direction, |r| near 1 = strong, near 0 = weak.
Correlation does not prove causation; observational data show that variables move together, not that one causes the other.
Always check axis units — a hidden 'thousands' or 'percent' changes the answer.
