2022 Introduction to Statistics in Research Mitchell 2nd ed

INTRO TO RESEARCH: DATA VISUALIZATION & COMMON STAT TESTS

Scatterplot & An Introduction to Anscombe’s Four Datasets

Use a scatterplot for bivariate numerical data. A scatterplot helps us see whether there is a relationship between x and y. One of the best ways to show the power of scatterplots is Anscombe's Quartet. In 1973, the statistician Francis Anscombe constructed four datasets whose summary statistics are nearly identical: the means, variances, correlation, and even the fitted linear regression line all agree to about two decimal places.

Table 102: Anscombe's Quartet of Data

Calculating the summary statistics for each of the four datasets gives the same results every time:

• The average “x” value is 9; the average “y” value is 7.50
• The variance for “x” is 11 and the variance for “y” is 4.12
• The correlation between “x” and “y” is .816
• The linear regression line is y = 0.5x + 3
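These figures can be checked directly. The following is a minimal sketch, assuming the standard published values of the quartet (Anscombe, 1973); the names `ANSCOMBE` and `summarize` are illustrative and not from the text. It uses the sample variance (dividing by n - 1), which is the convention that gives 11 for “x”.

```python
from statistics import mean, variance

# Assumed data: the standard published values of Anscombe's quartet.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I-III
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, correlation, and least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)                   # sum of squares of x
    syy = sum((b - my) ** 2 for b in y)                   # sum of squares of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # cross products
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(x),  # sample variance (n - 1 in the denominator)
        "var_y": variance(y),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (x, y) in ANSCOMBE.items():
    s = summarize(x, y)
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
          f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f} "
          f"line: y = {s['slope']:.2f}x + {s['intercept']:.2f}")
```

The four printed rows should agree with one another to about two decimal places, even though the underlying points differ.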

But if we graph them, they are quite different (these are graphed in JMP). Look at the following graphs.
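For readers without JMP, the shapes can be roughed out in plain text. This is only a sketch: it assumes the standard published values of the quartet (Anscombe, 1973), and the helper `text_scatter`, its grid size, and its axis ranges are hypothetical choices made for illustration.

```python
# Assumed data: the standard published values of Anscombe's quartet.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # x for datasets I-III
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def text_scatter(x, y, width=40, height=12, x_range=(2, 20), y_range=(2, 14)):
    """Draw the points as '*' on a fixed character grid; return the lines."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    grid = [[" "] * width for _ in range(height)]
    for xi, yi in zip(x, y):
        col = round((xi - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((yi - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    # Left border for the y-axis, bottom border for the x-axis.
    return ["|" + "".join(line) for line in grid] + ["+" + "-" * width]


for name, (x, y) in QUARTET.items():
    print(f"Dataset {name}")
    print("\n".join(text_scatter(x, y)))
    print()
```

All four plots share the same axis ranges, so differences in shape come from the data rather than from rescaling. Points that land in the same character cell overlap, so a plot may show fewer than eleven marks.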
