Product Moment Correlation Coefficient
Neil Trivedi
Teacher
Correlation measures the strength and direction of a linear relationship between two variables.
• Positive correlation: as one variable increases, the other tends to increase as well.
• Negative correlation: as one variable increases, the other tends to decrease.
• No correlation: no linear relationship between the variables.
The Product Moment Correlation Coefficient (PMCC), denoted by $r$, measures the strength of the positive or negative linear correlation between two variables.
Important note: correlation does NOT imply causation.
Here are some examples of scatter graphs showing different values of the PMCC.


The PMCC will always lie in the range $-1 \le r \le 1$.
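For reference, one standard way to write the PMCC for $n$ data pairs $(x_i, y_i)$ is sketched below. This formula is not stated in the text above; it is included only to show roughly what the calculator works out, and the summary-statistic names $S_{xy}$, $S_{xx}$ and $S_{yy}$ are notation assumed here.

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad
S_{xy} = \sum x_i y_i - \frac{\sum x_i \sum y_i}{n}, \quad
S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}, \quad
S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}
```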
Here are some general descriptions to use for different values of $r$:
• $r = 0$: there is no linear relationship.
• $r = \pm 1$: there is a perfect positive/negative linear relationship.
• and there is a weak positive/negative linear relationship.
• and there is a moderate positive/negative linear relationship.
• and there is a strong positive/negative linear relationship.
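The descriptions above could be turned into a small lookup, sketched here in Python. The function name `describe_correlation` and, importantly, the cut-off values `0.3` and `0.7` are illustrative assumptions rather than the boundaries used on this page; different textbooks draw the weak/moderate/strong lines in different places.

```python
def describe_correlation(r, weak_below=0.3, strong_from=0.7):
    """Describe a PMCC value in words.

    weak_below and strong_from are ASSUMED cut-offs for illustration only.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"

    direction = "positive" if r > 0 else "negative"
    size = abs(r)

    if size == 1:
        return f"perfect {direction} linear relationship"
    if size < weak_below:
        strength = "weak"
    elif size < strong_from:
        strength = "moderate"
    else:
        strength = "strong"
    return f"{strength} {direction} linear relationship"


print(describe_correlation(-0.9))
print(describe_correlation(0.5))
```

Passing different values for `weak_below` and `strong_from` lets the same sketch follow whichever boundaries your course uses.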

Here are some general steps for finding the PMCC using a calculator:
1) Go to STATISTICS mode on your calculator.
2) Select the linear regression option, which is often shown as $y = a + bx$.
3) Input the data values, with the data for one variable going in the $x$ column and the data for the other variable going in the $y$ column.
4) Use the regression or statistics calculation option to obtain the values of $a$, $b$ and $r$.
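If you want to check a calculator result by hand, the steps above can be mirrored in a few lines of Python. This is a minimal sketch, assuming the regression line is written as $y = a + bx$; the function name `linear_regression_stats` and the data values are made up for illustration and are not taken from this page.

```python
from math import sqrt


def linear_regression_stats(xs, ys):
    """Return (a, b, r): the least-squares line y = a + b*x and the PMCC r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    sum_x, sum_y = sum(xs), sum(ys)
    # Summary statistics S_xx, S_yy and S_xy.
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")

    b = s_xy / s_xx                    # gradient of the regression line
    a = sum_y / n - b * sum_x / n      # intercept: the line passes through the mean point
    r = s_xy / sqrt(s_xx * s_yy)       # product moment correlation coefficient
    return a, b, r


# Made-up illustrative data (NOT the values from any table on this page).
xs = [1, 2, 3, 4, 5]
ys = [9, 7, 6, 4, 1]
a, b, r = linear_regression_stats(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.3f}")
```

Here `xs` plays the role of the $x$ column and `ys` the $y$ column on the calculator.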
Example:
A group of students were asked how much time, in hours, they spend on social media each week. The same students also sat a maths test, marked out of 100. The table shows the results.
Student | | | | | | | | | | |
Time Spent on Social Media (hours) | | | | | | | | | | |
Test Score (out of 100) | | | | | | | | | |
a) Calculate the PMCC for all students, giving your answer to decimal places.
Single Step: Use the calculator to find the PMCC.
In this case, we input the times spent on social media in the $x$ column and the corresponding test scores in the $y$ column. Once we've entered all the values from the table, we select the regression calculation option to get the PMCC. We find that the PMCC will be approximately to decimal places.
b) With reference to your answer in part a), comment on the suitability of a linear regression model for these data.
This value of $r$ indicates a very strong (almost perfect) negative correlation, so a linear regression model is suitable for these data.
Note: We are choosing a linear regression model here since our independent variable (hours) and dependent variable (score) have not been coded in any way. As seen in other areas of A-level Maths, when we model with exponential or power relationships we can take logs of one or both variables to obtain a straight-line relationship. If there is no linear relationship between the hours spent on social media and the test score, this doesn't necessarily mean that no relationship exists. There may be a non-linear relationship, which could be revealed by taking logs of one or both of the variables.
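The point about logs can be illustrated with a short Python sketch. The data below are made up (an exact exponential curve $y = 2e^{0.8x}$, unrelated to the social media example), and the helper name `pmcc` is an assumption for illustration. The raw data give a PMCC noticeably below 1, while replacing $y$ with $\ln y$ gives a PMCC of 1 up to rounding, because $\ln y = \ln 2 + 0.8x$ is a straight line in $x$.

```python
from math import exp, log, sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Made-up data following y = 2 * e^(0.8x) exactly (illustration only).
xs = list(range(1, 11))
ys = [2 * exp(0.8 * x) for x in xs]

r_raw = pmcc(xs, ys)                     # x against y: curved relationship
r_log = pmcc(xs, [log(y) for y in ys])   # x against ln(y): straight line

print(f"PMCC of x and y:     {r_raw:.3f}")
print(f"PMCC of x and ln(y): {r_log:.3f}")
```

For a power relationship of the form $y = ax^n$, the same idea would use logs of both variables instead of just $y$.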