Spearman’s rank correlation: is there a relationship?
Some experiments do not compare two groups; they ask whether two things change together. Does shell height increase further up the shore? Spearman’s rank correlation is the test that answers questions like this. This page builds it up from scratch, with full worked examples.
When do you use a correlation test?
Very often in biology you want to know whether two things are associated — whether they change together. Do taller plants have wider leaves? Does a woodlouse’s activity change with temperature? Does shell height increase as you go further up a rocky shore? A correlation test answers this.
Use a correlation test when…
- you have pairs of measurements — two values recorded for each individual or site (e.g. for each limpet: its distance up the shore and its shell height);
- you want to know if the two vary together (as one goes up, does the other go up or down?);
- you are not comparing two separate groups (that is a t-test) and not counting categories (that is chi-squared).
At A-level the correlation test you use is Spearman’s rank correlation. (Some textbooks describe Pearson’s correlation, but the boards ask for Spearman’s rank. It works on the rank order of the data rather than the raw values, so it is less affected by extreme values, does not need the points to lie on a straight line, and is easier to calculate by hand.)
The two variables: which is which?
One variable is usually the independent variable (the one that could affect the other, plotted on the x-axis) and the other the dependent variable (the one that might be affected, on the y-axis). For limpets: distance up the shore is independent (x); shell height is dependent (y). Swapping the two variables does not change the value of the correlation coefficient, but plotting them the conventional way round makes the graph easier to read.
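If you want to convince yourself that the order of the two variables does not matter, a short Python sketch can check it. The data below are hypothetical (not taken from this page), the helper names `ranks` and `sum_d_squared` are illustrative, and the ranking assumes no two values in a list are equal.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def sum_d_squared(x, y):
    """Sum of the squared rank differences for paired lists x and y."""
    return sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))


# Hypothetical paired measurements, made up for illustration.
distance = [1.0, 2.5, 4.0, 5.5, 7.0]
height = [9, 8, 12, 15, 13]

print(sum_d_squared(distance, height))  # distance treated as x
print(sum_d_squared(height, distance))  # variables swapped
```

Both calls print the same total, because each rank difference only changes sign when the variables are swapped, and squaring removes the sign.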
First, always draw a scatter graph
Before any calculation, plot the pairs of data as a scatter graph — each pair becomes one point (its x-value across, its y-value up). This lets you see at a glance whether the two variables are related, and how. There are three patterns to recognise:
- Positive correlation — the two variables increase together (e.g. leaf length and leaf width).
- Negative correlation — as one increases, the other decreases (e.g. a fox’s oxygen consumption falls as air temperature rises).
- No correlation — no relationship; the points are scattered all over.
The scatter graph gives you the picture. The correlation coefficient then turns that picture into a single number, so you can say how strong the correlation is and whether it is significant.
The correlation coefficient: one number from −1 to +1
Spearman’s rank correlation gives a single number, called the correlation coefficient, with the symbol rs. It is always somewhere between −1 and +1, and its value tells you two things at once — the direction and the strength of the correlation:
Reading rs
- rs = +1 — a perfect positive correlation (the two rank orders agree exactly, so y always rises as x rises).
- rs close to +1 (e.g. +0.9) — a strong positive correlation.
- rs = 0 — no correlation.
- rs close to −1 (e.g. −0.9) — a strong negative correlation.
- rs = −1 — a perfect negative correlation (the two rank orders are exactly reversed, so y always falls as x rises).
So the sign (+ or −) tells you the direction, and how far the number is from 0 tells you the strength. A value of −0.8 is just as strong a correlation as +0.8 — it is simply negative instead of positive.
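As a small illustration, the two pieces of information in rs can be pulled apart in Python. This is only a sketch: the function name `describe` is made up for this example, and it deliberately returns the strength as a number rather than a label, because this page does not give cut-off values for words such as "strong" or "weak".

```python
def describe(rs):
    """Split a coefficient into its direction (sign) and strength (distance from 0)."""
    if not -1 <= rs <= 1:
        raise ValueError("rs must lie between -1 and +1")
    if rs > 0:
        direction = "positive"
    elif rs < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(rs)


print(describe(0.8))
print(describe(-0.8))
```

The two results differ only in direction; the strength is the same for both.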
The formula (given to you in the exam)
Spearman’s rank works on the rank order of the data, not the raw values. You rank each variable, find the difference in ranks for each pair, and put those into the formula. You are always given the formula, so you just need to know what each symbol means:
rₛ = 1 − (6 × Σd²) ÷ [ n(n² − 1) ]
Reading the symbols:
- rₛ: the Spearman’s rank correlation coefficient (the answer, between −1 and +1).
- d: for one pair, the difference between its two ranks (rank of x − rank of y).
- d²: that difference squared (so negative differences do not cancel out positive ones).
- Σd²: the sum of all the squared differences (add up the d² column).
- n: the number of pairs of data.
In plain steps: (1) rank each variable separately (1 = smallest); (2) for each pair, subtract the ranks to get d; (3) square each d; (4) add up the d² column to get Σd²; (5) put Σd² and n into the formula.
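The five steps above can be sketched in Python as follows. Treat this as a minimal illustration rather than a finished tool: the function names are made up for this example, the test data are hypothetical, and the simple ranking assumes no tied values (tied values need extra handling that this page does not cover).

```python
def ranks(values):
    """Step 1: rank from 1 (smallest). Assumes no two values are equal."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for paired lists x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    if n < 2:
        raise ValueError("at least two pairs of data are needed")
    d = [rx - ry for rx, ry in zip(ranks(x), ranks(y))]  # step 2: rank differences
    d_squared = [diff ** 2 for diff in d]                 # step 3: square each d
    sum_d_squared = sum(d_squared)                        # step 4: add up the d² column
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))   # step 5: the formula


# Hypothetical data: y rises with x, then y falls as x rises.
print(spearman_rs([1, 2, 3, 4], [10, 20, 30, 40]))
print(spearman_rs([1, 2, 3, 4], [40, 30, 20, 10]))
```

The first call gives a perfect positive correlation (+1) and the second a perfect negative correlation (−1), matching the two ends of the scale described earlier.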
Worked example: limpets on the shore
A full worked example with real-style fieldwork data.
Does limpet shell height increase up the shore?
At 10 points up a rocky shore, a student recorded the distance up the shore (m) and the height of a limpet shell (mm) at that point. Is there a significant correlation between distance up the shore and shell height?
Step 1 — State the null hypothesis
Null hypothesis (H₀): there is no correlation between distance up the shore and limpet shell height. (Any correlation we see is assumed to be due to chance until the test shows otherwise.)
Step 2 — The data, ranked
Rank each column separately (1 = smallest). Then find the difference in ranks (d) and square it (d²).
| Distance x (m) | Shell height y (mm) | Rank of x | Rank of y | d = rankX − rankY | d² |
|---|---|---|---|---|---|
| 0.5 | 8 | 1 | 1 | 0 | 0 |
| 1.2 | 10 | 2 | 3 | −1 | 1 |
| 2.0 | 9 | 3 | 2 | 1 | 1 |
| 3.1 | 12 | 4 | 4 | 0 | 0 |
| 4.0 | 14 | 5 | 6 | −1 | 1 |
| 5.2 | 13 | 6 | 5 | 1 | 1 |
| 6.0 | 16 | 7 | 8 | −1 | 1 |
| 7.1 | 15 | 8 | 7 | 1 | 1 |
| 8.3 | 18 | 9 | 9 | 0 | 0 |
| 9.0 | 19 | 10 | 10 | 0 | 0 |
| | | | | Σd² = | 6 |
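The ranking table can be reproduced with a short Python sketch, which is a handy way to check a hand calculation. The data are the ten pairs from the table; the variable and function names are illustrative, and the ranking assumes no tied values (true for this data set).

```python
# The ten pairs from the table: distance up the shore (m), shell height (mm).
distance = [0.5, 1.2, 2.0, 3.1, 4.0, 5.2, 6.0, 7.1, 8.3, 9.0]
height = [8, 10, 9, 12, 14, 13, 16, 15, 18, 19]


def ranks(values):
    """Rank from 1 (smallest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


rank_x = ranks(distance)
rank_y = ranks(height)
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]
d_squared = [diff ** 2 for diff in d]

# One printed row per pair: x, y, rank of x, rank of y, d, d squared.
for row in zip(distance, height, rank_x, rank_y, d, d_squared):
    print(row)
print("sum of d squared =", sum(d_squared))
```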
Plotting the pairs as a scatter graph first shows the pattern clearly: the points rise from bottom-left to top-right, so we expect a strong positive correlation.
Step 3 — Put the numbers into the formula
n = 10, so n² − 1 = 100 − 1 = 99, and n(n²−1) = 10 × 99 = 990. Σd² = 6.
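The arithmetic for this step can be checked with a few lines of Python. This is just a sketch of the substitution, using the values of n and Σd² stated above; the variable names are illustrative.

```python
n = 10             # number of pairs of data
sum_d_squared = 6  # total of the d² column

denominator = n * (n ** 2 - 1)  # 10 x 99
rs = 1 - (6 * sum_d_squared) / denominator

print(denominator)
print(round(rs, 2))  # rounded to 2 decimal places
```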
rₛ = 1 − (6 × Σd²) ÷ [ n(n² − 1) ] = 1 − (6 × 6) ÷ 990 = 1 − 36 ÷ 990 = 1 − 0.036 = 0.96
Step 4 — Compare with the critical value
From the Spearman’s rank critical-values table, for n = 10 at p = 0.05 the critical value is 0.648. Compare your rs (using how far it is from 0):
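The comparison rule can be written as a short Python sketch. It uses the size of rs (ignoring the sign) and assumes the common convention that a value equal to the critical value also counts as significant; check your own board's wording on that point. The function name is made up for this example.

```python
def is_significant(rs, critical_value):
    """Compare the size of rs (its distance from 0) with the critical value."""
    return abs(rs) >= critical_value


rs = 0.96               # calculated in Step 3
critical_value = 0.648  # n = 10, p = 0.05, as quoted above

if is_significant(rs, critical_value):
    print("Reject the null hypothesis: the correlation is significant at p = 0.05")
else:
    print("Do not reject the null hypothesis: the correlation is not significant")
```

Using the size of rs means a strong negative result such as −0.96 would be treated in exactly the same way.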
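rₛ = 0.96 is GREATER than the critical value 0.648, so we reject the null hypothesis: there is a significant positive correlation between distance up the shore and limpet shell height at p = 0.05.
Medical example: is resting heart rate linked to blood pressure?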
A nurse recorded, for 10 adult patients at a clinic, each patient’s resting heart rate (beats per minute) and their systolic blood pressure (mmHg). Is there a significant correlation between the two?
Step 1 — State the null hypothesis
Null hypothesis (H₀): there is no correlation between resting heart rate and systolic blood pressure.
Step 2 — The data, ranked
Rank each column separately (1 = lowest), find the difference in ranks (d) and square it.
| Heart rate x (bpm) | Blood pressure y (mmHg) | Rank of x | Rank of y | d = rankX − rankY | d² |
|---|---|---|---|---|---|
| 58 | 112 | 1 | 1 | 0 | 0 |
| 61 | 118 | 2 | 3 | −1 | 1 |
| 64 | 116 | 3 | 2 | 1 | 1 |
| 66 | 121 | 4 | 4 | 0 | 0 |
| 70 | 124 | 5 | 6 | −1 | 1 |
| 72 | 122 | 6 | 5 | 1 | 1 |
| 75 | 129 | 7 | 7 | 0 | 0 |
| 78 | 131 | 8 | 8 | 0 | 0 |
| 81 | 134 | 9 | 9 | 0 | 0 |
| 85 | 140 | 10 | 10 | 0 | 0 |
| | | | | Σd² = | 4 |
Plotted as a scatter graph, the points rise together: higher heart rates tend to go with higher blood pressures.
Step 3 — Put the numbers into the formula
n = 10, so n(n² − 1) = 10 × 99 = 990. Σd² = 4.
rₛ = 1 − (6 × Σd²) ÷ [ n(n² − 1) ] = 1 − (6 × 4) ÷ 990 = 1 − 24 ÷ 990 = 1 − 0.024 = 0.98
Step 4 — Compare with the critical value
For n = 10 at p = 0.05 the critical value is 0.648.
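The whole medical example can be run end to end as a check. The sketch below uses the ten pairs from the table and the critical value quoted above; the function names are illustrative, the ranking assumes no tied values, and a value equal to the critical value is assumed to count as significant.

```python
heart_rate = [58, 61, 64, 66, 70, 72, 75, 78, 81, 85]
blood_pressure = [112, 118, 116, 121, 124, 122, 129, 131, 134, 140]
CRITICAL_VALUE = 0.648  # n = 10, p = 0.05, as quoted in the text


def ranks(values):
    """Rank from 1 (lowest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Return (sum of d squared, rs) for paired lists x and y."""
    n = len(x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    rs = 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))
    return sum_d_squared, rs


sum_d_squared, rs = spearman(heart_rate, blood_pressure)
significant = abs(rs) >= CRITICAL_VALUE

print("sum of d squared =", sum_d_squared)
print("rs =", round(rs, 2))
print("significant at p = 0.05:", significant)
```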
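rₛ = 0.98 is GREATER than the critical value 0.648, so we reject the null hypothesis: there is a significant positive correlation between resting heart rate and systolic blood pressure at p = 0.05.
Correlation is not causation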
This is the single most important warning about correlation, and a very common exam point. A significant correlation shows two variables are associated — it does NOT prove that one causes the other.
Why not?
- The relationship might be caused by a third factor affecting both. For example, people’s hand size and foot size are strongly correlated — but big hands do not cause big feet; both are controlled by genes for overall body size.
- It could be a coincidence, especially with a small sample.
- Even if there is a real cause, the correlation alone does not tell you which way round it works.
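The "third factor" idea can be illustrated with a small Python sketch. Everything here is made up for illustration: the body heights and the two scaling factors are hypothetical, not real measurements. Hand length and foot length are each calculated from body height only, so neither one is used to work out the other, yet they still come out perfectly rank-correlated.

```python
def ranks(values):
    """Rank from 1 (smallest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for paired lists x and y."""
    n = len(x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Made-up body heights (cm): the hidden third factor.
body_height = [150, 158, 163, 170, 176, 185]

# Hypothetical scaling: each length depends on body height, not on the other length.
hand_length = [0.11 * h for h in body_height]
foot_length = [0.15 * h for h in body_height]

print(spearman_rs(hand_length, foot_length))
```

The coefficient is +1 even though, by construction, hand length has no effect on foot length. The correlation comes entirely from the shared third factor.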
So when you write a conclusion: say there is a significant correlation / association between the variables — do not claim one causes the other unless you have other evidence. To investigate a cause, you would design a controlled experiment.
What each exam board expects
All the main A-level Biology specifications name a correlation coefficient; Spearman’s rank is the one to know.
| Board | What is required |
|---|---|
| AQA (7402) | Select and use a correlation coefficient; AQA recommends Spearman’s rank. Interpret the probability value and reject/accept the null hypothesis at 0.05. |
| OCR A / B | Spearman’s rank correlation named explicitly; rank the data, calculate rs, compare with the critical value. |
| Edexcel A / B | Use a correlation coefficient (Spearman’s rank) to test for an association; understand significance at 5% and that correlation is not causation. |
| WJEC / Eduqas | Spearman’s rank correlation coefficient; rank data, use the formula, interpret against tabulated critical values. |
