Chi-Square Test Calculator
This calculator runs a Pearson chi-square goodness-of-fit test on a single set of category counts. Paste observed counts and it returns χ², degrees of freedom, the right-tailed p-value, and every cell’s contribution. Expected counts are optional: leave the box blank for a uniform distribution, or enter counts or ratios to test a specific hypothesis.
Chi-square statistic
χ²(3) = 20, p = 1.70e-4
- Cell 1 (O = 30, E = 25): (O − E)² / E = 1
- Cell 2 (O = 10, E = 25): (O − E)² / E = 9
- Cell 3 (O = 20, E = 25): (O − E)² / E = 1
- Cell 4 (O = 40, E = 25): (O − E)² / E = 9
- Categories (k): 4
- Degrees of freedom (k − 1): 3
- Observed total (N): 100
- Expected total: 100
- p-value (right tail): 1.70e-4
Reject the null hypothesis at α = 0.01 — strong evidence the observed distribution differs from expected. Expected counts assumed uniform across the k categories.
How to use this calculator
Type or paste your observed counts into the first box — separate values with commas, spaces, tabs, or new lines. The expected box is optional: leave it blank for a uniform distribution (every category gets the same share of the total), or enter a vector of expected counts, or enter a vector of ratios such as "9 3 3 1" and the calculator rescales them to match the observed total. The headline shows the chi-square statistic with its degrees of freedom and the right-tailed p-value; the breakdown lists each cell’s observed value, expected value, and squared-deviation contribution, plus the totals and the final p-value. Decimals are fine for expected values; observed counts should be non-negative.
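The input format described above can be sketched in a few lines of Python. This is only an illustration of one way to split such input, not the calculator's actual code; the function name `parse_counts` is hypothetical.

```python
import re

def parse_counts(text):
    """Split pasted text on commas and/or whitespace and return the values as floats."""
    # One or more commas, spaces, tabs, or newlines count as a single separator.
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    values = [float(t) for t in tokens]
    if any(v < 0 for v in values):
        raise ValueError("counts must be non-negative")
    return values

observed = parse_counts("30, 10\t20\n40")
print(observed)  # [30.0, 10.0, 20.0, 40.0]
```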
How the calculation works
Pearson’s chi-square goodness-of-fit test asks whether a single categorical distribution matches a hypothesised one. For each of k categories you have an observed count Oᵢ and an expected count Eᵢ, the count you would see in that category if the null hypothesis were exactly true. The test statistic is χ² = Σᵢ (Oᵢ − Eᵢ)² ÷ Eᵢ. Each cell contributes a squared deviation divided by its expected count, so cells where the observation is much larger or much smaller than expected push χ² up. Under the null, when the expected vector is fully specified (no parameters estimated from the data), χ² approximately follows a chi-square distribution with df = k − 1 degrees of freedom. The p-value is the right tail of that distribution: P(χ²_{k−1} ≥ observed). A small p-value means that counts this far from the expected ones would be unlikely if the hypothesised distribution were true. The approximation is generally reliable when every expected count is at least 5; with sparser cells, prefer an exact test or pool categories.
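The calculation above can be sketched with the Python standard library. This is a minimal illustration, not the calculator's own implementation. The helper names `chi2_sf` and `goodness_of_fit` are hypothetical, and `chi2_sf` assumes an integer df of at least 1, building the right tail from the recurrence Q(a + 1, y) = Q(a, y) + yᵃ e⁻ʸ / Γ(a + 1) for the regularized upper incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y) = erfc(sqrt(y))
    # Step a up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def goodness_of_fit(observed, expected=None):
    """Return (chi-square, df, p-value, per-cell contributions)."""
    k, n = len(observed), sum(observed)
    if expected is None:
        expected = [n / k] * k                 # uniform null
    else:
        scale = n / sum(expected)              # rescale counts or ratios to total N
        expected = [e * scale for e in expected]
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    stat = sum(contributions)
    df = k - 1
    return stat, df, chi2_sf(stat, df), contributions

stat, df, p, contributions = goodness_of_fit([30, 10, 20, 40])
print(contributions)                # [1.0, 9.0, 1.0, 9.0]
print(stat, df, f"{p:.2e}")         # 20.0 3 1.70e-04
```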
Worked example
Mendel’s pea data have four phenotypes with observed counts (315, 108, 101, 32), N = 556, hypothesised to follow a 9:3:3:1 ratio. Expected counts: 556 × (9/16, 3/16, 3/16, 1/16) = (312.75, 104.25, 104.25, 34.75). Per-cell contributions: (315 − 312.75)² ÷ 312.75 = 0.0162; (108 − 104.25)² ÷ 104.25 = 0.1349; (101 − 104.25)² ÷ 104.25 = 0.1013; (32 − 34.75)² ÷ 34.75 = 0.2176. χ² = 0.470 on df = 3, giving p ≈ 0.925. A p-value this large gives no evidence against the 9:3:3:1 model, although it does not prove the model true. By contrast, observed (30, 10, 20, 40) against a uniform expected of (25, 25, 25, 25) gives contributions 1 + 9 + 1 + 9 = 20, so χ² = 20 on df = 3 and p ≈ 0.00017, and the null is rejected at any conventional level.
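Both examples can be checked by hand or with a short script. The sketch below relies on the fact that for exactly 3 degrees of freedom the right tail has a closed form, erfc(√(x/2)) + √(2x/π) · e^(−x/2); the function name `chi2_sf_df3` is made up for this illustration and does not work for other df.

```python
import math

def chi2_sf_df3(x):
    """Right tail of a chi-square distribution with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Mendel's pea counts against a 9:3:3:1 ratio.
observed = [315, 108, 101, 32]
ratios = [9, 3, 3, 1]
n = sum(observed)
expected = [n * r / sum(ratios) for r in ratios]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
stat = sum(contributions)

print(expected)                                     # [312.75, 104.25, 104.25, 34.75]
print([round(c, 4) for c in contributions])         # [0.0162, 0.1349, 0.1013, 0.2176]
print(round(stat, 3), round(chi2_sf_df3(stat), 3))  # 0.47 0.925

# The contrasting uniform example: chi-square = 20 on df = 3.
print(f"{chi2_sf_df3(20.0):.2e}")                   # 1.70e-04
```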
Frequently asked questions
What exactly is a chi-square test testing?
A Pearson chi-square test asks whether observed category counts deviate from what some null hypothesis predicts. The classic goodness-of-fit variant — which this calculator runs — tests whether a single categorical distribution matches a fully specified expected distribution. The null hypothesis says the population proportions are exactly the ones you supplied; the alternative says they are not. There is also a chi-square test of independence on a contingency table, which tests whether two categorical variables are associated; the formula is identical but the expected counts come from row and column totals and df = (rows − 1)(cols − 1). Both share the same χ² statistic and the same chi-square sampling distribution.
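For the test of independence mentioned above (which this calculator does not run), the expected counts and df can be sketched as follows. The function name `independence_expected` and the 2×2 table are made up for illustration.

```python
def independence_expected(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, df

table = [[10, 20], [30, 40]]  # illustrative counts only
expected, df = independence_expected(table)

# Same Pearson formula as the goodness-of-fit case, summed over every cell.
stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
print(expected, df)    # [[12.0, 18.0], [28.0, 42.0]] 1
print(round(stat, 4))  # 0.7937
```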
What if I do not have expected counts — only ratios?
Enter the ratios in the expected box and the calculator rescales them automatically so the expected total matches the observed total. For example, if you want to test a 9 : 3 : 3 : 1 ratio against four observed counts that sum to 556, you can type "9 3 3 1" directly — the calculator multiplies by 556/16 to get the expected counts (312.75, 104.25, 104.25, 34.75). This is equivalent to typing the rescaled values yourself. Ratios must all be strictly positive; a hypothesised zero count is not a valid expected value because the test divides by it.
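The rescaling step can be sketched as below. The function name `rescale_expected` is hypothetical and the code is an illustration of the arithmetic, not the calculator's own source.

```python
def rescale_expected(ratios, observed_total):
    """Scale strictly positive ratios so that they sum to the observed total."""
    if any(r <= 0 for r in ratios):
        raise ValueError("every ratio must be strictly positive")
    scale = observed_total / sum(ratios)
    return [r * scale for r in ratios]

print(rescale_expected([9, 3, 3, 1], 556))  # [312.75, 104.25, 104.25, 34.75]
```

Passing the already rescaled counts (312.75, 104.25, 104.25, 34.75) returns the same values, because the scale factor is then 1.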
What does it mean if my expected counts are very small?
The chi-square distribution is the limiting null distribution of the statistic as expected counts grow large. A widely cited rule of thumb is that every expected count should be at least 5; a looser version requires every expected count to be at least 1, with no more than 20% of them below 5. With sparser data the asymptotic approximation breaks down and the reported p-value can be inaccurate. Two remedies: pool adjacent rare categories until the expected counts pass the threshold, or use an exact test (such as the multinomial exact test, or Fisher’s exact test for contingency tables). Switching to the likelihood-ratio G-test does not fix the problem, because it relies on the same large-sample chi-square approximation. The calculator does not refuse to compute χ² for sparse cells; checking the assumption is your responsibility.
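A quick way to check the rules of thumb before trusting the p-value is sketched below. The function name `check_expected_counts` and the returned keys are hypothetical, and the second example vector is made up.

```python
def check_expected_counts(expected):
    """Report how the expected counts fare against two common rules of thumb."""
    below_5 = [e for e in expected if e < 5]
    return {
        # Strict rule: every expected count is at least 5.
        "all_at_least_5": not below_5,
        # Looser rule: every count >= 1 and at most 20% of counts below 5.
        "looser_rule_ok": min(expected) >= 1 and len(below_5) <= 0.2 * len(expected),
        "cells_below_5": len(below_5),
    }

print(check_expected_counts([312.75, 104.25, 104.25, 34.75]))
print(check_expected_counts([12.0, 6.0, 3.0, 1.5, 0.5]))
```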
Why is the p-value right-tailed?
Under the null hypothesis the observed counts should sit close to the expected counts, so χ² should be close to its expected value of df. The further the observations stray from the expected distribution, in either direction, the larger χ² becomes, because each deviation (Oᵢ − Eᵢ) is squared. There is no "negative χ²" to indicate the opposite direction, so the chi-square test is one-sided by construction: only large values of the statistic are evidence against the null. The p-value is P(χ²_{df} ≥ observed). A χ² value close to zero is suspicious for the opposite reason: it suggests the observed data fit the hypothesis too closely, possibly because the data were edited or the expected distribution was tuned to the sample.
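A small simulation can make this concrete. The sketch below draws repeated samples from a uniform four-category null and looks at the resulting χ² values; the function names, the seed, and the number of replicates are arbitrary choices for illustration.

```python
import random

def chi_square_stat(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_null(k=4, n=100, replicates=2000, seed=1):
    """Draw samples from a uniform k-category null and return their chi-square values."""
    rng = random.Random(seed)
    expected = [n / k] * k
    stats = []
    for _ in range(replicates):
        counts = [0] * k
        for _ in range(n):
            counts[rng.randrange(k)] += 1
        stats.append(chi_square_stat(counts, expected))
    return stats

stats = simulate_null()
mean_stat = sum(stats) / len(stats)
share_above_20 = sum(s >= 20 for s in stats) / len(stats)
print(f"mean chi-square under the null: {mean_stat:.2f} (df = 3)")
print(f"share of simulated values >= 20: {share_above_20:.4f}")
```

The simulated values are never negative and average close to df = 3, while values as large as 20 should almost never appear, which is why only the right tail counts as evidence against the null.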
How are degrees of freedom decided?
For a goodness-of-fit test with k categories and a fully specified expected distribution (no parameters estimated from the data), df = k − 1. The minus one comes from the constraint that observed and expected counts must sum to the same N. If you estimate m parameters of the expected distribution from the data (for example, fitting a Poisson mean to the observed counts to test Poisson goodness-of-fit), df = k − 1 − m. This calculator assumes the expected vector you supply (or the implicit uniform one) is fully specified, so it reports df = k − 1. If you estimated parameters first, subtract m from the reported df and recompute the p-value on the reduced df; the p-value shown here is based on k − 1 and will be too large.
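The df rule can be written as a tiny helper. The function name `goodness_of_fit_df` is hypothetical, and the category counts in the examples are illustrative.

```python
def goodness_of_fit_df(k, estimated_params=0):
    """Degrees of freedom for a goodness-of-fit test: k - 1 - m."""
    df = k - 1 - estimated_params
    if df < 1:
        raise ValueError("need at least one degree of freedom")
    return df

print(goodness_of_fit_df(4))     # 3: fully specified expected vector
print(goodness_of_fit_df(6, 1))  # 4: e.g. a Poisson mean fitted from the data
```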
Should I use a chi-square test or Fisher’s exact test?
For a single categorical distribution (the goodness-of-fit case this calculator runs) the alternative is the multinomial exact test, not Fisher’s. For a 2×2 contingency table comparing two categorical variables, Fisher’s exact test is preferred whenever any expected count is small (< 5): it conditions on the observed marginals and computes an exact p-value rather than relying on the chi-square approximation. Chi-square is fine when all expected counts are comfortably large and is the standard choice for big samples. For tables larger than 2×2, exact tests exist but are computationally expensive, so chi-square (or the likelihood-ratio G-test) is the practical default.
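For very small samples, a multinomial exact test can be sketched by brute-force enumeration. The version below uses one common definition of the p-value, the total probability of all outcomes no more likely than the observed one; other orderings exist. The function names are hypothetical, and the approach is only practical for small N and k because the number of outcomes grows quickly.

```python
import math

def compositions(n, k):
    """Yield every way to split n items into k ordered non-negative counts."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def multinomial_pmf(counts, probs):
    """Probability of one vector of counts under the given category probabilities."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)

def exact_multinomial_p(observed, probs):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    p_obs = multinomial_pmf(observed, probs)
    total = 0.0
    for outcome in compositions(sum(observed), len(observed)):
        p = multinomial_pmf(outcome, probs)
        if p <= p_obs * (1 + 1e-9):  # small tolerance so ties are included
            total += p
    return total

# Three observations, all in the first of three equally likely categories.
print(exact_multinomial_p((3, 0, 0), [1 / 3, 1 / 3, 1 / 3]))
```

In this example only the three outcomes that put every observation in one category are as unlikely as the observed one, each with probability 1/27, so the exact p-value is 3/27 ≈ 0.111.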