2022 Introduction to Statistics in Research Mitchell 2nd ed
INTRO TO RESEARCH: DATA VISUALIZATION & COMMON STAT TESTS
Scatter plot & An Introduction to Anscombe’s Four Datasets
Use a scatterplot for bivariate numerical data. Scatterplots help us see whether there is a relationship between x and y. One of the best ways to show the power of scatterplots is Anscombe's Quartet. In 1973, Anscombe constructed four datasets whose summary statistics are nearly identical: the averages, variances, correlation, and even the linear regression equation match.
Table 102: Anscombe's Quartet of Data
If you calculate the summary statistics, you will find that the following holds (to the precision shown) for each of the four datasets.
• The average “x” value is 9; the average “y” value is 7.50
• The variance for “x” is 11 and the variance for “y” is 4.12
• The correlation between “x” and “y” is 0.816
• The linear regression is y = 0.5x + 3
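These summary statistics can be checked with a short script. The sketch below is not from the text: it assumes the published values of Anscombe's 1973 quartet (typed in here because the table is not reproduced in this excerpt), uses the sample variance (dividing by n - 1), and fits the line by ordinary least squares. The names `QUARTET`, `summarize`, and `SUMMARIES` are illustrative only.

```python
from math import sqrt

# Published values of Anscombe's quartet (assumed here; the table itself
# is not reproduced in this excerpt). Datasets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson r, and the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),   # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

for name, s in SUMMARIES.items():
    print(f"Dataset {name}: mean x = {s['mean_x']:.2f}, mean y = {s['mean_y']:.2f}, "
          f"var x = {s['var_x']:.2f}, var y = {s['var_y']:.2f}, "
          f"r = {s['r']:.3f}, line: y = {s['slope']:.2f}x + {s['intercept']:.2f}")
```

The four printed rows should agree with one another to roughly two decimal places. That agreement is why the summary statistics alone cannot tell the datasets apart.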
But if we graph them, they are quite different (these are graphed in JMP). Look at the following graphs.