Graphing your data can provide valuable insights into the information it holds. While visual representations are essential, it’s also helpful to have a numerical measure of how strongly two variables are related. This is where Pearson’s correlation coefficient, denoted by the symbol r, comes into play.
What does Pearson’s r tell us?
Pearson’s correlation coefficient offers two key pieces of information about the association between two variables: direction and strength.
A positive correlation indicates that as one variable increases, the other tends to increase as well. A negative correlation indicates that as one variable increases, the other tends to decrease. For example, as a student’s level of economic deprivation increases, their academic performance tends to decrease.
The strength of a linear relationship is indicated by the magnitude (absolute value) of the correlation coefficient. A coefficient of 0 indicates no linear relationship at all, while the strongest possible linear correlation is indicated by a coefficient of -1 or 1. The sign of the correlation (+ or -) determines only the direction, not the strength, so a correlation of -0.6 is just as strong as a correlation of +0.6 and differs only in direction.
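To make the distinction between direction and strength concrete, here is a minimal Python sketch that computes r directly from its textbook definition. The function name `pearson_r` and the `hours` and `score` values are hypothetical and chosen only for illustration; they do not come from the dataset used in this section.

```python
import math

def pearson_r(x, y):
    """Pearson's r: the sum of cross-products of deviations from the means,
    divided by the square root of the product of the two sums of squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical illustration data (not from the text).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 60, 68]

r_pos = pearson_r(hours, score)                # positive direction
r_neg = pearson_r(hours, [-s for s in score])  # same data with one variable flipped

print(round(r_pos, 3), round(r_neg, 3))
```

Flipping the sign of one variable reverses the direction of the relationship but leaves the absolute value of r unchanged, which is why magnitude alone describes strength.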
Calculating Pearson’s r using SPSS
Let’s use an example to understand how to calculate Pearson’s correlation coefficient using SPSS. Suppose we’re interested in exploring the relationship between exam scores at age 11 and age 14. We can follow these steps in SPSS:
- Go to Analyze > Correlate > Bivariate.
- In the pop-up menu, select the two variables you want to examine (e.g., ks2stand and ks3stand) by moving them from the left-hand list to the variables box. Note that both variables must be continuous.
The output you’ll obtain (see Figure 2.4.2) will contain a correlation table for the selected variables. Pearson’s correlation coefficient is the value circled in red, and the level of statistical significance is displayed below it.
In our example, we observe a positive correlation between exam performances at age 11 and age 14, indicating that a high score in one is associated with a high score in the other. The correlation coefficient of 0.886 means that one variable accounts for roughly 78.5% of the variance in the other (r² = 0.886² ≈ 0.785). SPSS reports significance to three decimal places, so a displayed value of 0.000 means p is less than 0.0005: if there were no correlation in the population, a sample correlation this strong would be very unlikely to arise by chance. It is therefore highly likely that the relationship exists in the population as a whole.
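The arithmetic behind the shared-variance figure is just the square of r. The short sketch below reproduces it for the reported coefficient of 0.886 and also shows why a small p-value appears as 0.000 when rounded to three decimal places; the value 0.0004 is a hypothetical p-value used only to demonstrate the rounding.

```python
r = 0.886  # correlation coefficient reported in the SPSS output

# Coefficient of determination: proportion of variance shared by the two variables.
r_squared = r ** 2
print(f"r^2 = {r_squared:.3f} ({r_squared:.1%} of variance shared)")

# Hypothetical p-value below 0.0005, formatted to three decimals as in the output table.
p_example = 0.0004
displayed = f"{p_example:.3f}"
print("displayed significance:", displayed)
```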
Correlation of ordinal variables – Spearman’s rho
Pearson’s correlation coefficient is suitable for continuous variables. When dealing with ordinal data, however, Spearman’s rho rank-order correlation is the preferred approach. Fortunately, the procedure in SPSS is similar to that for Pearson’s correlation. After selecting Analyze > Correlate > Bivariate, remember to check the Spearman box in the correlation coefficients section of the pop-up box and deselect the Pearson option.
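Spearman’s rho can be understood as Pearson’s r applied to the ranks of the data rather than to the raw values. The following Python sketch illustrates that idea under the usual convention that tied values receive the average of their ranks. The function names and the example values are hypothetical, and this is not a description of how SPSS computes the statistic internally.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical ordinal ratings (1 = lowest, 5 = highest) for eight pupils.
effort = [1, 2, 2, 3, 4, 4, 5, 5]
grade_band = [1, 1, 2, 3, 3, 4, 4, 5]
print(round(spearman_rho(effort, grade_band), 3))
```

Because only the ordering of the values matters, any monotonically increasing relationship yields a rho of 1, even when it is far from linear.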
Correlation and causation
It’s essential to differentiate between correlation and causation. A correlation merely shows a relationship between two variables and does not imply causation. Let’s consider an example to illustrate this point.
Looking at National test results, we may observe that pupils who join a school later in a key stage tend to perform worse than those who attend for the entire key stage. However, we cannot automatically conclude a causal relationship from this observation. Several possibilities exist, such as reverse causality or the influence of a third variable like socio-economic disadvantage.
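The third-variable explanation can be illustrated with a small simulation. In the sketch below, the data are entirely synthetic: a made-up `disadvantage` score influences both a made-up `mobility` score and a made-up `attainment` score, while mobility has no effect on attainment at all. The variable names, effect sizes, and random seed are assumptions chosen for illustration and say nothing about the real national test data.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000

# Synthetic third variable.
disadvantage = [rng.gauss(0, 1) for _ in range(n)]

# Each outcome depends on disadvantage plus its own independent noise.
# Neither outcome appears in the formula for the other.
mobility = [d + rng.gauss(0, 1) for d in disadvantage]
attainment = [-d + rng.gauss(0, 1) for d in disadvantage]

r = pearson_r(mobility, attainment)
print(f"correlation between mobility and attainment: {r:.2f}")
```

The two outcomes end up clearly negatively correlated even though, by construction, neither causes the other. The association arises only because both depend on the same underlying variable.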
The diagram in Figure 2.4.4 represents a few variables, but countless others could be involved in this relationship. In future chapters, we will explore such topics more deeply. For now, let’s shift our focus to simple linear regression.
Remember, understanding correlation is valuable, but establishing causation requires a more rigorous analysis.