
Why biologists use statistics

Living things vary. Measure 20 leaves from one tree and no two are the same length; count the offspring of a genetic cross and you never get the exact ratio you predicted. So whenever biological data differ — between two groups, or from what was expected — there are always two possible explanations:

  • the difference is real (something is genuinely going on), or
  • the difference is just chance — the natural variation you would get anyway.

A statistical test decides between these two. It starts from the null hypothesis — the assumption that there is no real effect (no difference, no correlation, no difference from the expected) — and works out the probability that the data could look this way by chance alone. In biology the agreed cut-off is p = 0.05 (5%):

The 5% rule

  • if the probability of getting a result like this by chance alone is less than 5% (p < 0.05) → the result is significant → reject the null hypothesis;
  • if that probability is more than 5% (p > 0.05) → the result is not significant → you cannot reject the null hypothesis.
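The 5% rule can be written as a tiny decision function. This is only a sketch: the name `interpret_p_value` is made up for illustration, and treating a p-value of exactly 0.05 as not significant is an assumption, since the rule above only covers "less than" and "more than".

```python
def interpret_p_value(p, alpha=0.05):
    """Apply the 5% rule to a probability p produced by a statistical test."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p < alpha:
        return "significant: reject the null hypothesis"
    # Assumption: p exactly equal to the cut-off is treated as not significant.
    return "not significant: cannot reject the null hypothesis"


print(interpret_p_value(0.03))
print(interpret_p_value(0.20))
```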

Every test on this page works that same way — they differ only in what kind of data they handle and what question they answer. New to all this? Start with descriptive statistics, which explains variation, the normal distribution and significance from scratch.

Which test do I use?

Start at the top and answer each question. This is the single most useful thing to get right — picking the correct test is often a mark in itself.

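What kind of data do you have?

  • Counts in categories: are you comparing observed counts with expected counts? → Chi-squared test (e.g. do offspring fit a 3:1 ratio? is recovery linked to treatment?)
  • Measurements (continuous): what do you want to know?
      • Is the difference between two means significant? → t-test
      • Is there a relationship between two variables? → Spearman’s rank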

Just describing one set of measurements (average and spread)? That is descriptive statistics — mean, standard deviation, standard error and error bars.

The very first decision — counts vs measurements — is the one students most often get wrong. Counts of whole individuals in categories → chi-squared. Things you measured on a scale (length, mass, rate, time) → t-test or correlation.
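The same decision path can be sketched as a small lookup. The function name `choose_test` and the keyword strings ("counts", "measurements", "describe", "difference", "relationship") are invented labels for this example, not standard terms.

```python
def choose_test(data_type, question=None):
    """Suggest a test from the kind of data and the question being asked.

    data_type: "counts" or "measurements"
    question:  "describe", "difference" or "relationship" (measurements only)
    """
    if data_type == "counts":
        # Counts of whole individuals in categories always lead to chi-squared.
        return "chi-squared test"
    if data_type == "measurements":
        choices = {
            "describe": "descriptive statistics",
            "difference": "t-test",
            "relationship": "Spearman's rank correlation",
        }
        if question in choices:
            return choices[question]
        raise ValueError("question must be 'describe', 'difference' or 'relationship'")
    raise ValueError("data_type must be 'counts' or 'measurements'")


print(choose_test("counts"))
print(choose_test("measurements", "difference"))
```

Note that the first branch depends only on the data type, which mirrors the point above: decide counts vs measurements before anything else.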

Compare the tests at a glance

Test | Data type | Question it answers | Biology example
--- | --- | --- | ---
Descriptive statistics | Measurements (one set) | What is the average and how spread out is it? | Mean ± SD of 20 leaf lengths
t-test | Measurements (two groups) | Is the difference between two means significant? | Mean shell length: high vs low shore zone
Spearman’s rank | Measurements (paired variables) | Is there a relationship between two variables? | Shell height vs distance up the shore
Chi-squared | Counts in categories | Do observed counts differ from expected counts? | Offspring ratio vs an expected 3:1

The conditions for using each test

A test only gives a valid answer if its conditions are met. Check these before you choose — examiners award marks for justifying the test as well as doing it.

t-test — conditions

  • data are continuous measurements (length, mass, rate…), not counts;
  • you are comparing exactly two groups (two means);
  • the data are drawn from a population that is roughly normally distributed (bell-shaped);
  • use the paired t-test only when each value in one group is genuinely linked to one in the other (same individual, before/after); otherwise use the unpaired t-test.
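To show how the paired and unpaired versions differ, here is a minimal sketch using only Python's standard library. It assumes the form of the unpaired statistic commonly taught at school level (difference in means divided by the square root of s²/n summed over the two groups, with n₁ + n₂ − 2 degrees of freedom); your specification may state it differently. The function names and the shell lengths are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    # variance() is the sample variance (divides by n - 1).
    standard_error = sqrt(variance(group_a) / n_a + variance(group_b) / n_b)
    t = abs(mean(group_a) - mean(group_b)) / standard_error
    return t, n_a + n_b - 2


def paired_t(first, second):
    """Return (t, degrees_of_freedom) when each value is linked to a partner."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of values in each set")
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    t = abs(mean(differences)) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Invented shell lengths (mm) from two shore zones, for illustration only.
high_shore = [12.1, 13.4, 11.8, 12.9, 13.0]
low_shore = [15.2, 14.8, 16.1, 15.5, 14.9]

t, df = unpaired_t(high_shore, low_shore)
critical_value = 2.306  # two-tailed t-table value for 8 degrees of freedom at p = 0.05
print(f"t = {t:.2f}, df = {df}, significant: {t > critical_value}")
```

The calculated t is compared with the critical value for the right number of degrees of freedom; a t larger than the critical value means p < 0.05.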

Spearman’s rank correlation — conditions

  • you have pairs of measurements — two values recorded for each individual or site;
  • you are testing for a relationship (association) between the two variables, not a difference between groups;
  • the data can be ranked (put in order) — Spearman’s works on rank order, so it does not need the data to be normally distributed;
  • ideally you have at least 7 or 8 pairs, so that a critical value is meaningful.
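Because Spearman’s works on rank order, the calculation is mostly ranking followed by one formula, rₛ = 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks in each pair. The sketch below assumes that textbook formula and gives tied values the average of the ranks they share (with many ties the formula is only approximate). The function names and data are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for paired measurements."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    sum_d_squared = sum(
        (rank_x - rank_y) ** 2
        for rank_x, rank_y in zip(average_ranks(x), average_ranks(y))
    )
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Invented paired data for eight sites, for illustration only.
distance_up_shore_m = [1, 3, 5, 7, 9, 11, 13, 15]
shell_height_mm = [22, 25, 24, 28, 27, 31, 33, 32]
print(round(spearman_rs(distance_up_shore_m, shell_height_mm), 3))
```

The result lies between −1 and +1 and is then compared with the critical value for the number of pairs.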

Chi-squared test — conditions

  • data are counts / frequencies of whole individuals in categories — never percentages or measurements;
  • you have an expected set of counts (from a ratio, theory, or row & column totals);
  • the categories are independent (each individual falls in exactly one);
  • every expected value is 5 or more — the test is unreliable below that (collect more data).
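These conditions map directly onto the calculation χ² = Σ (O − E)² / E. The sketch below works out expected counts from a ratio, refuses to run if any expected value is below 5, and then sums the terms. The function names and the offspring counts are invented for illustration.

```python
def expected_from_ratio(total, ratio):
    """Split a total count according to a ratio such as (3, 1)."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per category")
    if any(e < 5 for e in expected):
        raise ValueError("every expected value must be 5 or more")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented offspring counts from a cross expected to give 3:1.
observed = [290, 110]
expected = expected_from_ratio(sum(observed), (3, 1))
chi2 = chi_squared(observed, expected)
critical_value = 3.84  # chi-squared table value for 1 degree of freedom at p = 0.05
print(f"chi-squared = {chi2:.2f}, significant: {chi2 > critical_value}")
```

Degrees of freedom here are the number of categories minus one, so two categories give 1 degree of freedom.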

Descriptive statistics — when

  • you simply want to summarise one set of measurements — its average (mean/median/mode) and spread (range/standard deviation);
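  • standard error and error bars show how reliable a mean is, and checking whether the ±2 SE bars of two groups overlap is a quick visual check before a t-test (bars that do not overlap suggest the difference is likely to be significant).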
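As a rough illustration of these summaries, the sketch below uses Python's built-in `statistics` module to get the mean, sample standard deviation, standard error (SD divided by √n) and ±2 SE limits, then checks whether two sets of bars overlap. The function names and the measurements are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def summarise(values):
    """Mean, sample SD, standard error and +/- 2 SE limits for one data set."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)   # sample standard deviation (divides by n - 1)
    se = sd / sqrt(n)    # standard error of the mean
    return {"n": n, "mean": m, "sd": sd, "se": se, "bar": (m - 2 * se, m + 2 * se)}


def bars_overlap(group_a, group_b):
    """True if the +/- 2 SE error bars of the two groups overlap."""
    low_a, high_a = summarise(group_a)["bar"]
    low_b, high_b = summarise(group_b)["bar"]
    return low_a <= high_b and low_b <= high_a


# Invented shell lengths (mm), for illustration only.
high_shore = [12.1, 13.4, 11.8, 12.9, 13.0]
low_shore = [15.2, 14.8, 16.1, 15.5, 14.9]
print(summarise(high_shore))
print("bars overlap:", bars_overlap(high_shore, low_shore))
```

The overlap check is only a visual guide; the t-test is still what decides significance.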