Testing for Correlation
Neil Trivedi
Teacher
Testing for Zero Correlation
Generally speaking, we say two variables have no correlation if they are independent.
We use hypothesis testing to determine whether a sample of a given size provides enough evidence to suggest correlation between two variables. Remember, the sample is supposed to represent the general population. Therefore, instead of using the sample correlation coefficient r, we will use the population correlation coefficient ρ in our null and alternative hypotheses.
• The null hypothesis has the form H₀: ρ = 0. So, we are assuming that the population has no correlation unless proven otherwise.
• The alternative hypothesis can have one of three forms:
H₁: ρ > 0 (for a one-tailed test testing for positive correlation)
H₁: ρ < 0 (for a one-tailed test testing for negative correlation)
H₁: ρ ≠ 0 (for a two-tailed test testing for any correlation)

The critical value can be found using the table. Look for the column with the required significance level, and the row for the matching sample size.
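As a rough illustration of the lookup, the sketch below stores a single row of a one-tailed critical values table as a Python dictionary. The names CRITICAL_VALUES and lookup_critical_value are assumptions made for this example, and only the row for a sample size of 10 is included; in an exam, always read from the table you are given.

```python
# Illustrative excerpt of a one-tailed PMCC critical values table.
# Outer key: sample size (the row). Inner key: one-tailed significance level (the column).
# Only the n = 10 row is included here; a full table has many more rows.
CRITICAL_VALUES = {
    10: {0.10: 0.4428, 0.05: 0.5494, 0.025: 0.6319, 0.01: 0.7155, 0.005: 0.7646},
}


def lookup_critical_value(sample_size, one_tailed_level):
    """Return the tabulated critical value for the given row and column."""
    return CRITICAL_VALUES[sample_size][one_tailed_level]


# Row n = 10, column 5% (one-tailed).
print(lookup_critical_value(10, 0.05))
```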
We reject H₀ if the Product Moment Correlation Coefficient (PMCC) we calculated lies inside the critical region.
When we are testing for positive correlation, we reject H₀ if our PMCC is bigger than the critical value. When we are testing for negative correlation, we reject H₀ if the PMCC is less than the negative of the value in the table.
When we are conducting a two-tailed test, the significance level is split equally between the two tails, so we look in the column for half of the significance level to find the critical value. We reject H₀ if the PMCC is bigger than the critical value or less than the negative of the critical value.
We do not reject H₀ if the PMCC we calculated is outside the critical region.
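The rejection rules can be summarised in a short sketch. The function name reject_null and the labels "positive", "negative" and "two-tailed" are assumptions chosen for this illustration, and the numbers in the usage lines are made up. For a two-tailed test, the critical value passed in should already be the one read from the column for half of the significance level.

```python
def reject_null(pmcc, critical_value, alternative):
    """Return True if the PMCC lies in the critical region, so H0 is rejected.

    critical_value is the positive number read from the table.
    alternative is "positive", "negative" or "two-tailed".
    """
    if alternative == "positive":    # H1: rho > 0
        return pmcc > critical_value
    if alternative == "negative":    # H1: rho < 0
        return pmcc < -critical_value
    if alternative == "two-tailed":  # H1: rho != 0
        return pmcc > critical_value or pmcc < -critical_value
    raise ValueError("alternative must be 'positive', 'negative' or 'two-tailed'")


# Made-up values, purely for illustration.
print(reject_null(0.60, 0.5494, "positive"))     # inside the critical region
print(reject_null(-0.60, 0.5494, "positive"))    # outside the critical region
print(reject_null(-0.70, 0.6319, "two-tailed"))  # inside the lower tail
```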
Example 1:
Test at the significance level, whether there is positive correlation between the temperature and number of ice creams sold. The PMCC of the sample with size is .
Step 1: Set up the hypotheses.
This is a one-tailed test for positive correlation, so the hypotheses are:
H₀: ρ = 0 (remember we are assuming no correlation unless proven otherwise)
H₁: ρ > 0
Step 2: Find the critical value using the table
Sample size

From the PMCC critical values table above, the critical value at the significance level is .
Critical region at the significance: . This means that any calculated PMCC above is strong enough to suggest that there is positive correlation with at least certainty (because of the significance level).
Step 3: Draw a conclusion based on the result.
Since the PMCC is less than the critical value, it does not lie in the critical region, so we do not reject H₀.
There is insufficient evidence to suggest a positive correlation between temperature and the number of ice creams sold.
Example 2:
From the Edexcel large data set, the daily mean windspeed, knots, and the daily maximum gust, knots, were recorded for the first days in July in Hurn, in 1987.
Table: Day, daily mean windspeed (kn) and daily maximum gust (kn) for each recorded day. The gust entry for Day 12 is ‘n/a’.
a) State the meaning of ‘n/a’ in the table.
Single Step: Apply knowledge of the Large Data Set.
‘n/a’ means that the data is not available for that entry.
b) Calculate the PMCC for the remaining days.
Single Step: Calculate PMCC using a calculator.
As the data for the gust on Day 12 are missing, we exclude that day from the calculation.
The PMCC is
dp
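If you want to check a calculator result, the PMCC can also be computed directly from its definition, r = Sxy / √(Sxx × Syy). The sketch below is one possible way to do this while skipping any day with a missing reading. The function name pmcc_excluding_missing, the use of None to stand for ‘n/a’, and the data in the usage lines are all assumptions made for illustration; they are not the values from the large data set.

```python
from math import sqrt


def pmcc_excluding_missing(xs, ys):
    """Return (r, n): the PMCC and the number of complete pairs used.

    Any pair in which either value is None (standing in for 'n/a') is dropped.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    s_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    return s_xy / sqrt(s_xx * s_yy), n


# Invented readings, not the real data: the third gust value is missing.
windspeed = [4, 7, 5, 9, 6]
gust = [13, 19, None, 25, 16]
r, n = pmcc_excluding_missing(windspeed, gust)
print(round(r, 4), n)
```

The number of complete pairs returned alongside r is the sample size to use when reading the critical value from the table.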
c) Test, at the level of significance, whether there is evidence of any correlation between the daily mean windspeed, and the daily maximum gust.
Step 1: Set up the hypotheses.
This is a two-tailed test, so the hypotheses are:
H₀: ρ = 0
H₁: ρ ≠ 0 (here we are testing for any correlation as stated in the question, not specifically positive or negative)
Step 2: Find the critical value using the table.
Sample size = (excluding day 12)

For a two-tailed test at the significance level, each tail has . From the PMCC critical values table: Critical value .
Hence the critical region is or .
Step 3: Draw a conclusion based on the result.
Since the PMCC lies in the critical region, we reject H₀ in favour of H₁. There is sufficient evidence to suggest a correlation between the daily mean windspeed and the daily maximum gust. (Note: Because this is a two-tailed test, we can only conclude that a correlation exists, not that it is specifically positive.)
Challenging Question