How a Confidence Interval Calculator Works

A confidence interval is a range of plausible values for an unknown population mean, built from a sample. This guide walks through the x̄ ± z·(σ/√n) formula behind every confidence interval calculator, the z-critical values for the standard confidence levels, when to switch from the z-interval to the t-interval, the worked example our calculator uses by default, and the misreadings that trip up almost every first-course student.

Calc Dragon · 24 May 2026 · 11 min read

#math #statistics #confidence-interval #z-score #margin-of-error #sampling

What a confidence interval actually is

A confidence interval is a range of plausible values for an unknown population quantity, built from a sample. For a mean, the interval is centred on the sample average and stretches outward by a margin of error that reflects how much the sample could have varied if you had drawn a different one. A 95% confidence interval is the most common version, but the underlying machinery is the same at any chosen level — feed a sample mean, a standard deviation and a sample size into the confidence interval calculator and it returns the lower bound, the upper bound, and the margin of error that connects them.

The interval is not a guess at the true mean; it is a statement about the procedure that produced it. Build many intervals from many independent samples and about 95% of them will contain the true population mean. Any single interval either contains the truth or does not, and which of the two holds cannot be known from the sample alone. That distinction sounds like philosophical hair-splitting, but it is where almost every misreading of a confidence interval starts, so it is worth getting straight from the outset.
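The long-run reading can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 50, SD 10), the sample size and the seed are assumptions chosen for the demonstration, not anything the calculator uses.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=10_000, seed=1):
    """Fraction of known-sigma z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)  # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


# The printed fraction should land close to the nominal 0.95.
print(coverage())
```

Each simulated interval either covers the true mean or misses it; only the proportion across many repetitions approaches 95%.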

The formula behind the calculator

For a population mean μ with known (or well-estimated) standard deviation σ, the two-sided confidence interval is:

CI = x̄ ± z* · (σ / √n)

Three quantities go into the interval. The sample mean x̄ centres it. The standard error σ/√n is the standard deviation of the sampling distribution of the mean: it measures how much the sample average would jump around from one sample to the next. The two-sided critical value z* is the standard-normal quantile that captures the chosen confidence level in its central area: 1.6449 for 90%, 1.9600 for 95%, 2.5758 for 99% and 3.2905 for 99.9%. The formula is published in the NIST/SEMATECH e-Handbook of Statistical Methods §1.3.5.2 ("Confidence Limits for the Mean") and in OpenStax Introductory Statistics §8.1, and it is the basis of every textbook confidence-interval calculation for a mean.

Three properties are worth committing to memory. First, the interval is symmetric around x̄ — the margin is added to one side and subtracted from the other, so the calculator always returns two numbers equidistant from the centre. Second, the standard error shrinks like 1/√n, so quadrupling the sample size halves the margin of error; doubling the sample only narrows the interval by a factor of √2 ≈ 1.41. Third, the z critical value scales the standard error linearly, so moving from 95% to 99% widens the interval by 2.5758 / 1.96 ≈ 1.31 — about 31% wider for one extra "nine" of certainty.
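A minimal sketch of the formula in Python, using only the standard library. The function name `z_interval` and its signature are hypothetical, chosen for illustration; this is not the calculator's actual source code.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, level=0.95):
    """Return (lower, upper, margin) for mean ± z* · sigma / sqrt(n)."""
    if n <= 0 or sigma < 0 or not 0 < level < 1:
        raise ValueError("need n > 0, sigma >= 0 and 0 < level < 1")
    # Two-sided critical value: the (1 - alpha/2) quantile of N(0, 1).
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    return mean - margin, mean + margin, margin


lower, upper, margin = z_interval(100.0, 15.0, 25)
print(f"({lower:.4f}, {upper:.4f}), margin {margin:.4f}")
```

The symmetry and the 1/√n behaviour fall straight out of the code: the same `margin` is added and subtracted, and passing `4 * n` halves it.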

Worked example: estimating mean calcium intake

A nutritionist samples 36 patients and records the mean daily calcium intake. The sample mean is x̄ = 68 mg with a known population standard deviation of σ = 3 mg. The team wants a 90% confidence interval for the true population mean.

z*  = 1.6449                                  (two-sided 90% critical value)
SE  = σ / √n = 3 / √36 = 3 / 6 = 0.5
ME  = z* · SE = 1.6449 × 0.5 = 0.8224
CI  = x̄ ± ME = 68 ± 0.8224 = (67.18, 68.82)

Plugging the same numbers into the confidence interval calculator on this page returns (67.1776, 68.8224). The team can report that they are 90% confident the true mean daily intake in the sampled population lies between 67.18 and 68.82 mg.

Tightening the requirement to 95% confidence keeps the centre at 68 mg but widens the margin to 1.96 × 0.5 = 0.98, giving (67.02, 68.98). Pushing further to 99% confidence widens it to 2.5758 × 0.5 = 1.288, giving (66.71, 69.29). Every extra slice of certainty costs a wider interval — and the calculator makes the trade-off visible in a single click of the confidence-level selector.
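The three intervals in this example can be reproduced in a few lines. This is a sketch using Python's standard library and the example's own inputs (x̄ = 68, σ = 3, n = 36); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sigma, n = 68.0, 3.0, 36  # inputs from the worked example
se = sigma / sqrt(n)            # standard error: 3 / 6 = 0.5

rows = []
for level in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * se
    rows.append((level, round(margin, 4),
                 round(mean - margin, 2), round(mean + margin, 2)))

for level, margin, low, high in rows:
    print(f"{level:.0%}: margin {margin:.4f}, interval ({low:.2f}, {high:.2f})")
```

Because the critical values here are unrounded, the margins can differ from a hand calculation with rounded z values in the last digit or two.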

The 95% confidence trap

Easily the most common mistake is reading "95% confidence interval" as "a 95% probability that the true mean lies in this interval". It feels right in plain English but it is formally wrong under the frequentist framework the formula assumes. Once the sample is drawn, the interval is a fixed pair of numbers; the true mean is also a fixed (unknown) number; neither is random. The 95% refers to the long-run hit rate of the procedure that generated the interval, not to a probability statement about any single interval.

The correct phrasing is: "We are 95% confident that the procedure used to construct this interval captures the true mean." If you want a sentence that genuinely assigns probability to the parameter itself — "there is a 95% chance the true mean is between A and B" — you need a Bayesian credible interval, not a frequentist confidence interval. The two often look similar numerically when the prior is uninformative, but the philosophical commitments are different and the difference shows up in non-trivial cases.

z-interval vs t-interval: when to switch

The formula in this calculator uses the standard normal distribution and is exact when the population is normal and its standard deviation σ is known. In most practical work σ is not known; the analyst estimates it from the same sample that gave the mean, in which case the strictly correct distribution is Student's t with n − 1 degrees of freedom. The t-distribution has heavier tails than the normal, so it produces a slightly wider, more conservative interval: the price of using the sample to estimate two things at once.

The practical rule of thumb is simple. Use the z-interval when σ is genuinely known, or when the sample size is large enough (commonly n ≥ 30) that the sample SD is a reliable estimator of σ. Use the t-interval for small samples (n < 30) with unknown σ; the wider interval reflects the additional uncertainty from estimating σ. For samples of about n = 100 or above, the t-critical and z-critical values differ by only about 0.02, or roughly 1% (at n = 100, t ≈ 1.984 vs z = 1.960 for 95%), so the difference is rarely material.

This calculator deliberately implements the z-interval because it is the version introduced in every first-course textbook, it is the one most analysts memorise, and for the typical sample sizes encountered in business analytics and survey work it is indistinguishable from the t-interval. For published research with small samples and unknown σ, switch to a t-interval — every statistical package (R's t.test, Python's scipy.stats.t.interval, Excel's CONFIDENCE.T) has the function built in.
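To see how the gap between t and z closes as n grows, the sketch below computes t critical values with the standard library alone. It integrates the t density numerically (Simpson's rule) and solves for the quantile by bisection; that is an illustrative stand-in, and in real work a library routine such as scipy.stats.t.ppf is the better choice. The helper names are hypothetical.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(level, df):
    """Two-sided critical value by bisection on the bracket [0, 50]."""
    target = 1 - (1 - level) / 2
    low, high = 0.0, 50.0
    if t_cdf(high, df) < target:
        raise ValueError("critical value lies outside the search bracket")
    for _ in range(40):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


z_star = NormalDist().inv_cdf(0.975)
for n in (10, 30, 100):
    t_star = t_critical(0.95, n - 1)
    print(f"n={n:>3}  t*={t_star:.4f}  z*={z_star:.4f}  ratio={t_star / z_star:.3f}")
```

The ratio column is the factor by which the t-interval is wider than the z-interval for the same standard error.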

Factors that change the interval width

Sample size (n)

Sample size enters through 1/√n in the standard error, which means the relationship is non-linear and counter-intuitive. Doubling n only narrows the interval by a factor of √2 ≈ 1.41; halving the width of the interval requires quadrupling the sample. That is why going from n = 100 to n = 400 is a big deal and going from n = 1,000 to n = 1,100 is almost invisible. For sample-size planning around a target margin of error, see the sample size calculator, which inverts the same formula.
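Inverting the margin formula for n gives n = (z*·σ / E)², rounded up, where E is the target margin. The sketch below shows both directions; the function names are hypothetical and the inputs in the usage lines are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, level=0.95):
    """Margin z* · sigma / sqrt(n) for a two-sided interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z_star * sigma / sqrt(n)


def required_n(sigma, target_margin, level=0.95):
    """Smallest n whose margin does not exceed target_margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z_star * sigma / target_margin) ** 2)


# Quadrupling n halves the margin; a 10% increase barely moves it.
for n in (100, 400, 1000, 1100):
    print(n, round(margin_of_error(10.0, n), 3))

print(required_n(sigma=3.0, target_margin=0.5))
```

Rounding up rather than to the nearest integer guarantees the achieved margin is no larger than the target.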

Standard deviation (σ)

The interval scales linearly with σ. A population that is twice as variable produces an interval twice as wide, holding everything else constant. Reducing measurement noise, stratifying the sample, or restricting the population to a more homogeneous subgroup all shrink σ and tighten the interval. The standard deviation calculator computes σ from a raw list of values for any sample you plan to feed in here.

Confidence level

The cheapest lever, but only because cheap means accepting a higher risk of being wrong. Moving from 95% to 90% confidence shrinks the interval by 1 − 1.6449 / 1.96 ≈ 16%; moving from 99% to 95% shrinks it by 1 − 1.96 / 2.5758 ≈ 24%. Use 95% as the default unless there is a domain-specific convention: 90% in some business and survey work, 99% in clinical and safety-critical settings, 99.9% only when the cost of an interval that misses the true value is catastrophic.
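Since σ and n are held fixed, the change in width between two confidence levels depends only on the ratio of their critical values. A short sketch (the helper name `width_change` is hypothetical):

```python
from statistics import NormalDist


def z_critical(level):
    """Two-sided standard-normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def width_change(from_level, to_level):
    """Relative change in interval width when switching confidence level."""
    return z_critical(to_level) / z_critical(from_level) - 1


print(f"95% -> 90%: {width_change(0.95, 0.90):+.1%}")
print(f"99% -> 95%: {width_change(0.99, 0.95):+.1%}")
print(f"95% -> 99%: {width_change(0.95, 0.99):+.1%}")
```

Negative values are a narrower interval, positive values a wider one. The changes are not symmetric: going from 95% to 99% widens the interval by more than going from 99% to 95% narrows it, because the percentages are taken from different baselines.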

The distribution of the underlying data

The z-interval rests on the central limit theorem: the sampling distribution of the mean is approximately normal regardless of the shape of the underlying population, provided n is reasonably large. For symmetric, mound-shaped populations the approximation is excellent at n ≥ 15. For heavy-skewed populations (income, response times, hospital bills) the approximation needs larger n — sometimes n ≥ 100 — before the interval's coverage is close to nominal. For extremely heavy tails or count data with many zeros, the calculator's assumptions break and a bootstrap interval or a generalised linear model is the better tool.

How to use the calculator well

Default to 95% confidence. It is the universal convention for opinion polls, market research, social-science studies and most quality-control work. Only move when there is a domain reason to do so.
Use σ if you know it. Established processes — manufacturing tolerances, calibrated instruments, well-studied populations — often have a documented population SD. Plug it in directly. The z-interval is then exact.
For unknown σ, check that n is large enough. At n ≥ 30 the sample SD works as a substitute. Below n = 30, switch to a t-interval — the wider interval is the honest report.
Match units. The standard deviation and the mean must be in the same units (mg with mg, GBP with GBP, mm with mm). The calculator does no unit conversion; a mismatch silently produces a nonsense margin.
Report the centre and the margin together. "68 ± 0.82 (90% CI)" is the form most journals expect. The bare interval (67.18, 68.82) is also acceptable but loses the point estimate.
Look at the width, not just the bounds. A wide interval that includes zero (for differences) or some null reference value is the statistical signal that the data do not resolve the question. Reporting the interval is more informative than a binary "significant" or "not significant" call.

Common mistakes

Reading 95% as a probability statement about the interval. The single most common error, covered above: it is a statement about the procedure, not the interval. Reviewers and statisticians flag the loose phrasing reliably; better to write the careful version from the start.

Using the z-interval when σ is unknown and n is small. The z-interval will be too narrow; its actual coverage will be below the nominal 95%. With a sample of n = 10 and unknown σ, the t critical value is 2.262 versus 1.96 for z — a 15% difference. Use the t-interval and the wider, honest interval.

Confusing the standard deviation with the standard error. σ describes the spread of individual observations. The standard error σ/√n describes the spread of the sample mean. The interval uses the standard error; substituting σ directly produces an interval that is √n times too wide.

Forgetting that the sample must be random. The formula assumes simple random sampling. Convenience samples, clustered samples and self-selected respondents all violate the assumption, and the interval will under-estimate the true uncertainty. Survey methodologists use a design effect to inflate the variance; without that correction, treating a clustered sample as if it were random is wishful thinking.

Reporting too many decimal places. A confidence interval has roughly two significant figures of precision once n is in the tens or hundreds. Reporting (67.1776, 68.8224) implies four-figure precision the sample cannot support. Round to one or two decimals — (67.2, 68.8) at 90% CI — and the result reads honestly.
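The under-coverage described in the second mistake above (a z-interval with unknown σ and small n) can be made visible with a simulation. This sketch draws samples of n = 10 from a standard normal population, plugs the sample SD into the z formula, and counts how often the interval covers the true mean; the population, seed and trial count are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_coverage_with_sample_sd(n=10, level=0.95, trials=20_000, seed=7):
    """Coverage of mean ± z* · s / sqrt(n) when s is estimated from the sample."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    mu, sigma = 0.0, 1.0  # true population values, hidden from the interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample SD
        if abs(xbar - mu) <= z_star * s / sqrt(n):
            hits += 1
    return hits / trials


# Expect a value noticeably below the nominal 0.95.
print(z_coverage_with_sample_sd())
```

Rerunning with a larger n (for example `z_coverage_with_sample_sd(n=100)`) should bring the coverage back close to nominal, which is the large-sample rule of thumb in action.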

When to seek a statistician

For a single-sample mean from an approximately random, approximately normal sample, the confidence interval calculator on this page is enough. Bring in expert input when the design gets more complex: paired samples, two-group differences, ratio estimators, regression coefficients, clustered or stratified designs, heavily skewed outcomes, small samples in regulated contexts, or anything where the interval will appear in a published paper or a clinical report. The machinery generalises cleanly — every standard error sits inside the same point-estimate ± critical-value × SE structure — but the standard error itself can be hard to compute correctly, and bootstrap or model-based intervals are often the better choice for non-textbook data.

Frequently asked questions

Answers to the most common questions about confidence intervals, the z critical values and the difference between confidence and probability are listed in the FAQ block on this page. For related statistical work, see the sample size calculator, standard deviation calculator and average calculator.

What does a 95% confidence interval actually mean?

It is a statement about the procedure that produced the interval, not a probability statement about the interval itself. If you repeated the same sampling and the same interval construction many times, about 95% of those intervals would contain the true population mean. The particular interval you have either contains the truth or does not; the 95% refers to the long-run hit rate of the method, not to a probability about one specific pair of numbers. The frequentist-correct phrasing is 'we are 95% confident that the procedure captures the true mean'. If you want a 'there is a 95% chance the mean lies between A and B' statement, you need a Bayesian credible interval, not a frequentist confidence interval.

Should I use the z-interval or the t-interval?

Use the z-interval (the one this calculator implements) when the population standard deviation σ is genuinely known, or when the sample size is large enough (commonly n ≥ 30) that the sample SD is a reliable estimator of σ. Use the t-interval, built from Student's t-distribution with n − 1 degrees of freedom, when σ is unknown and n is small. The t-distribution has heavier tails, so it gives a wider, more conservative interval; that extra width is the honest cost of estimating σ from the same sample. For n ≥ 100 the t and z critical values differ by only about 0.02, or roughly 1% (1.984 vs 1.960 for 95% at n = 100), and the practical difference is negligible.

Why are the z critical values 1.6449, 1.96, 2.5758 and 3.2905?

They come from the inverse cumulative distribution function of the standard normal, evaluated at 1 − α/2 for the chosen confidence level 1 − α. For 95% two-sided, α = 0.05, so the critical value is the 0.975 quantile, Φ⁻¹(0.975) = 1.959964, which rounds to the textbook 1.96. The others follow the same rule: 1.6449 (90%), 2.5758 (99%) and 3.2905 (99.9%). This calculator stores the unrounded values internally so the displayed interval matches statistical software (R's qnorm, Python's scipy.stats.norm.ppf, Excel's NORM.S.INV) to four decimals.
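The same values can be reproduced without any third-party package, since Python's standard library exposes the inverse normal CDF as `statistics.NormalDist().inv_cdf`. A minimal sketch (the wrapper name `z_critical` is hypothetical):

```python
from statistics import NormalDist


def z_critical(level):
    """Two-sided critical value: the (1 - alpha/2) quantile of N(0, 1)."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}: z* = {z_critical(level):.4f}")
```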

How do I shrink the margin of error without changing the data?

Three levers. First, accept a lower confidence level — moving from 99% to 95% trims the margin by about 24%, and 95% to 90% trims another 16%. Second, cut variability in the underlying data — better instruments, stratified sampling and homogeneous subpopulations all reduce σ, and the interval scales linearly with σ. Third, increase the sample size — but because the standard error falls as 1/√n, you must quadruple n to halve the margin. Combining levers is usually more efficient than leaning on one of them alone, especially when each lever has its own cost (loss of certainty, cost of better instrumentation, cost of more respondents).

What if my sample is not a simple random sample?

The formula assumes simple random sampling from the target population. With cluster samples, the standard error in this calculator usually under-estimates the true uncertainty and the resulting interval is too narrow. Stratified samples typically err in the other direction, and convenience samples carry selection bias that no standard error accounts for. Survey methodologists handle the variance side with a design effect (deff), the ratio of the design's variance to the simple-random-sample variance; a deff above about 2 is the conventional warning that a naive z-interval will mislead. For complex sample designs use software built for them (R's survey package, Stata svy commands, SAS PROC SURVEYMEANS), which compute the correct standard error for the design rather than the simple-random-sample one.
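Because deff is a variance ratio, it scales the standard error by √deff. The sketch below applies that adjustment to the z-interval; it assumes a deff value has already been estimated elsewhere (the survey packages named above do this), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def design_adjusted_interval(mean, sd, n, deff=1.0, level=0.95):
    """z-interval with the standard error inflated by sqrt(deff)."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sd / sqrt(n) * sqrt(deff)  # deff multiplies the variance, not the SE
    margin = z_star * se
    return mean - margin, mean + margin


naive = design_adjusted_interval(68.0, 3.0, 36)              # deff = 1: plain z-interval
clustered = design_adjusted_interval(68.0, 3.0, 36, deff=2)  # illustrative deff
print(naive)
print(clustered)
```

An equivalent view is that the design leaves an effective sample size of n / deff, so a deff of 2 means the sample carries the information of one half its size.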

Can I use this calculator for a proportion or a difference between two means?

No. This calculator is built for the mean of a single continuous variable. For a single proportion the relevant formula is p̂ ± z · √(p̂(1−p̂)/n), which our sample-size calculator inverts to choose n for a chosen margin. For a difference between two means or two proportions, the standard error is built from both samples and differs from the single-sample form. The conceptual structure (point estimate ± z · SE) is the same, but the standard error itself is different in each case. For difference-in-means intervals, use a specialised two-sample tool such as R's t.test or Python's scipy.stats.ttest_ind.
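For reference, the single-proportion formula quoted above looks like this in code. It is a sketch of that textbook (Wald) interval only; the function name is hypothetical, and the formula is known to behave poorly when p̂ is near 0 or 1 or n is small.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, level=0.95):
    """Return (lower, upper, margin) for p_hat ± z · sqrt(p_hat(1 - p_hat)/n)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# Illustrative input: 50 "yes" answers out of 100 respondents.
lower, upper, margin = proportion_interval(50, 100)
print(f"({lower:.3f}, {upper:.3f}), margin {margin:.3f}")
```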

How big does my sample need to be for the calculator to work?

The z-interval is exact at any sample size when σ is known and the population itself is normal. For other populations it rests on the sampling distribution of the mean being approximately normal. The central limit theorem makes that approximation excellent at n ≥ 15 for symmetric mound-shaped data, n ≥ 30 for mildly skewed data, and n ≥ 100 or more for heavy-skewed populations (income, response times, hospital bills). For very small samples (n < 10) from non-normal populations, neither the z- nor the t-interval is reliable; use a bootstrap interval or a non-parametric method instead. The calculator does not refuse small samples, but the user is responsible for checking that the central limit theorem assumption is reasonable.
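As an illustration of the bootstrap alternative, the sketch below builds a percentile bootstrap interval for the mean using only the standard library. The data values are made up for the example, and the percentile method is just one of several bootstrap variants; it is shown here because it is the simplest to read.

```python
import random
from statistics import fmean


def bootstrap_mean_interval(data, level=0.95, resamples=5000, seed=0):
    """Percentile bootstrap interval for the mean of data."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(resamples))
    alpha = 1 - level
    low_index = round(alpha / 2 * resamples)
    high_index = round((1 - alpha / 2) * resamples) - 1
    return means[low_index], means[high_index]


# Hypothetical skewed sample, e.g. response times in seconds.
times = [1.2, 0.8, 3.5, 0.9, 1.1, 7.4, 1.0, 1.3]
low, high = bootstrap_mean_interval(times)
print(f"sample mean {fmean(times):.2f}, bootstrap 95% interval ({low:.2f}, {high:.2f})")
```

Unlike the z-interval, the result need not be symmetric around the sample mean, which is often what skewed data call for.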

Why do I keep seeing different z values in different sources — 1.96 vs 1.959964?

Both are the same number rounded to different precisions. 1.96 is the textbook value rounded to two decimal places; 1.959964 is what you get if you compute Φ⁻¹(0.975) to six decimals; 1.95996398454... carries the expansion further. The difference matters when you are checking your work against statistical software: software uses the unrounded value, so a manual calculation with 1.96 can disagree in the last displayed digits of the margin, more visibly as the standard error grows. This calculator uses the unrounded constants internally so its output matches R, Python and Excel to four decimals; the displayed z critical value rounds to four decimals for readability.

Informational only. Not personalised financial, legal, or tax advice.