Statistical Estimation
Last reviewed: May 2026
Confidence intervals are a cornerstone of statistical inference, providing a range estimate rather than a single point estimate.[1] They are used throughout research, polling, quality control, and business analytics. The width of the interval tells you how precise your estimate is: a narrower interval means a more precise estimate, but at a fixed confidence level a narrower interval requires more data or less variable data. To determine how many samples you need, use the Sample Size Calculator.
| Confidence Level | Z-Score | Alpha | Use Case |
|---|---|---|---|
| 80% | 1.282 | 0.20 | Preliminary analysis |
| 90% | 1.645 | 0.10 | Surveys, business |
| 95% | 1.960 | 0.05 | Standard in most research |
| 99% | 2.576 | 0.01 | High-stakes decisions |
| 99.9% | 3.291 | 0.001 | Very high certainty required |
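The z-scores in the table are two-sided critical values of the standard normal distribution: the point that leaves alpha/2 in each tail. A minimal sketch of how they can be reproduced with Python's standard-library `statistics.NormalDist` follows; the function name `z_critical` is an illustrative choice, not something defined by this article.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    alpha = 1 - confidence
    # Leave alpha/2 in the upper tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```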
The formula for a confidence interval for a population mean is: sample mean plus or minus the critical value times the standard error. The standard error equals the standard deviation divided by the square root of the sample size. For a 95% confidence interval with a known population standard deviation, the critical z-value is 1.96. So for a sample of 100 observations with a mean of 50 and a standard deviation of 10, the 95% CI is 50 plus or minus 1.96 times (10/sqrt(100)) = 50 plus or minus 1.96, giving the interval [48.04, 51.96]. This means we are 95% confident the true population mean falls within this range.
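The worked example above can be checked with a few lines of Python. This is a sketch that assumes the population standard deviation is known, as the paragraph does; `mean_ci_known_sd` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def mean_ci_known_sd(sample_mean, sd, n, confidence=0.95):
    """z-based confidence interval for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

low, high = mean_ci_known_sd(sample_mean=50, sd=10, n=100)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # [48.04, 51.96]
```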
When the population standard deviation is unknown (which is almost always the case in practice), we use the sample standard deviation and the t-distribution instead of the z-distribution. The t-distribution has heavier tails, producing wider intervals that account for the additional uncertainty of estimating the standard deviation. With small samples (below 30), the difference between t and z intervals is substantial. A 95% t-interval with 10 observations (9 degrees of freedom) uses a critical value of 2.262 instead of 1.96, making the interval about 15% wider. As sample size grows, the t-distribution approaches the z-distribution: at n=30 the t-interval is only about 4% wider, and the gap keeps shrinking from there.
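To see how the t critical value depends on sample size, it can be computed for n - 1 degrees of freedom and compared with the z value. The sketch below uses only the standard library: it integrates the t density numerically and finds the quantile by bisection. This is an illustrative approximation, not a production routine; in practice a library function such as SciPy's `scipy.stats.t.ppf` would normally be used. The names `t_cdf` and `t_critical` are hypothetical.

```python
import math
from statistics import NormalDist

def t_cdf(x, df, steps=2000):
    """CDF of Student's t at x >= 0, via Simpson's rule on [0, x]."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    pdf = lambda u: coef * (1 + u * u / df) ** (-(df + 1) / 2)
    h = x / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided t critical value by bisection (assumes the answer is below 100)."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for n in (10, 30, 100):
    t = t_critical(0.95, df=n - 1)
    print(f"n={n:>3}  t={t:.3f}  extra width vs z: {t / z - 1:.1%}")
```

For 95% intervals the extra width comes out at roughly 15% for n=10, 4% for n=30, and 1% for n=100.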
Proportion confidence intervals use a different formula because the data is binary (yes/no, success/failure) rather than continuous. The standard error for a proportion is sqrt(p-hat times (1 - p-hat) divided by n), where p-hat is the sample proportion. For a survey where 520 out of 1,000 respondents favor a policy (p-hat = 0.52), the 95% CI is 0.52 plus or minus 1.96 times sqrt(0.52 times 0.48 / 1,000) = 0.52 plus or minus 0.031, giving [0.489, 0.551]. This is the basis for the "margin of error" reported in political polls. Determine the sample size needed for your desired precision with our Sample Size Calculator.
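The poll example can be reproduced with the normal-approximation interval described above. This is a minimal sketch; `proportion_ci` is a hypothetical helper name, and the approximation assumes a reasonably large sample with a proportion not too close to 0 or 1.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(successes=520, n=1000)
print(f"95% CI: [{low:.3f}, {high:.3f}]")  # [0.489, 0.551]
```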
| Sample Size | Margin of Error (p=0.5, 95% CI) | Typical Application |
|---|---|---|
| 100 | +/- 9.8% | Pilot study |
| 400 | +/- 4.9% | Internal survey |
| 1,000 | +/- 3.1% | National poll |
| 2,500 | +/- 2.0% | Market research |
| 10,000 | +/- 1.0% | Census supplement |
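The margins in the table follow directly from the proportion standard error at p = 0.5. A short sketch that regenerates the column, with `margin_of_error` as an illustrative function name:

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1000, 2500, 10000):
    print(f"{n:>6,}  +/- {margin_of_error(n):.1%}")
```

Using p = 0.5 gives the largest possible margin for a given n, which is why polls quote it as the worst case.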
Confidence intervals are among the most frequently misunderstood concepts in statistics. The most common error is saying "there is a 95% probability that the true value is in this interval." The correct interpretation is: if we repeated this study many times, 95% of the resulting intervals would contain the true value. Any single interval either contains the true value or it does not; we just do not know which. This subtle distinction matters because it reminds us that confidence intervals describe the reliability of the method, not the probability of a particular result.
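The repeated-study interpretation can be illustrated with a small simulation: draw many samples from a population whose mean is known, build an interval from each, and count how often the interval contains the true mean. The sketch below assumes normally distributed data with a known standard deviation; the parameter values and the function name `coverage` are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(true_mean=50.0, sd=10.0, n=100, trials=2000, confidence=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sd) for _ in range(n))
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

The printed share should land close to 0.95. Each individual interval in the simulation either contains 50 or does not; the 95% figure describes the long-run behaviour of the procedure.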
Another misconception is that overlapping confidence intervals necessarily indicate no statistically significant difference between two groups. In fact, two group means can differ significantly even when their 95% confidence intervals overlap slightly. The correct way to compare means is either a two-sample t-test or a direct confidence interval for the difference between means. Conversely, non-overlapping 95% intervals do imply a significant difference at the 5% level, but checking overlap is a conservative shortcut, not a formal test. Always use appropriate hypothesis tests alongside confidence intervals rather than relying on visual overlap assessment. For significance testing, explore our P-Value Calculator and Statistics Calculator.
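A small numerical example makes the overlap point concrete. The sketch below uses made-up group means and standard errors and a large-sample z approximation in place of a full two-sample t-test; `compare_means` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def compare_means(mean_a, se_a, mean_b, se_b, confidence=0.95):
    """Per-group intervals, interval for the difference, and a two-sided z-test."""
    norm = NormalDist()
    z = norm.inv_cdf(1 - (1 - confidence) / 2)
    ci_a = (mean_a - z * se_a, mean_a + z * se_a)
    ci_b = (mean_b - z * se_b, mean_b + z * se_b)
    diff = mean_b - mean_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # independent groups
    ci_diff = (diff - z * se_diff, diff + z * se_diff)
    p_value = 2 * (1 - norm.cdf(abs(diff) / se_diff))
    overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
    return ci_a, ci_b, ci_diff, p_value, overlap

# Hypothetical data: means 10 and 13, each with a standard error of 1.
ci_a, ci_b, ci_diff, p_value, overlap = compare_means(10, 1, 13, 1)
print("Group intervals overlap:", overlap)
print(f"Difference CI: [{ci_diff[0]:.2f}, {ci_diff[1]:.2f}], p = {p_value:.3f}")
```

Here the two group intervals overlap, yet the interval for the difference excludes zero and the p-value is below 0.05. The standard error of the difference is smaller than the sum of the two individual standard errors, which is why eyeballing overlap is conservative.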
Clinical trials report confidence intervals for treatment effects to convey both the estimated benefit and the precision of that estimate. A drug that reduces blood pressure by 8 mmHg with a 95% CI of [5, 11] is convincingly effective because the entire interval excludes zero. A drug with the same 8 mmHg estimate but a CI of [-1, 17] might or might not be effective because the interval includes zero. The width of the interval reflects sample size and variability: larger trials produce narrower intervals and more definitive conclusions. Regulatory agencies like the FDA use confidence intervals to evaluate whether drug effects meet minimum clinically meaningful thresholds, not just whether they differ from zero. Quality control processes similarly use confidence intervals to verify that manufacturing tolerances are met, with tighter intervals required for safety-critical components. For related statistical analysis, use our Standard Deviation Calculator.
A common source of confusion is the difference between confidence intervals and prediction intervals. A confidence interval estimates where a population parameter (like the mean) falls. A prediction interval estimates where a single new observation will fall. Prediction intervals are always wider than confidence intervals because they must account for both the uncertainty in estimating the mean and the natural variability of individual observations. For example, if a factory produces bolts with a mean diameter of 10.0 mm and standard deviation of 0.1 mm, the 95% confidence interval for the mean based on 50 samples might be [9.97, 10.03], while the 95% prediction interval for a single new bolt would be approximately [9.80, 10.20].
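The bolt example can be reproduced as follows. For simplicity the sketch uses the z critical value and treats 0.1 mm as the standard deviation of normally distributed diameters; with n = 50 a t-based calculation would give slightly wider limits. The function name is illustrative.

```python
import math
from statistics import NormalDist

def mean_ci_and_prediction_interval(mean, sd, n, confidence=0.95):
    """Interval for the mean and interval for one new observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci_margin = z * sd / math.sqrt(n)          # uncertainty in the mean only
    pi_margin = z * sd * math.sqrt(1 + 1 / n)  # plus spread of a single item
    return ((mean - ci_margin, mean + ci_margin),
            (mean - pi_margin, mean + pi_margin))

ci, pi = mean_ci_and_prediction_interval(mean=10.0, sd=0.1, n=50)
print(f"Confidence interval: [{ci[0]:.2f}, {ci[1]:.2f}]")  # [9.97, 10.03]
print(f"Prediction interval: [{pi[0]:.2f}, {pi[1]:.2f}]")  # [9.80, 10.20]
```

As n grows, the confidence interval shrinks toward zero width, while the prediction interval levels off near plus or minus z times the standard deviation.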
The Bayesian alternative to confidence intervals is the credible interval, which has a more intuitive interpretation: a 95% credible interval means there is a 95% probability that the parameter falls within the interval, given the data and prior beliefs. This directly addresses the interpretation most people incorrectly apply to frequentist confidence intervals. Bayesian methods require specifying a prior distribution that represents existing knowledge before seeing the data, which some view as a strength (incorporating expert knowledge) and others as a weakness (introducing subjectivity). With large sample sizes, Bayesian credible intervals and frequentist confidence intervals often produce nearly identical results because the data overwhelms the prior. The distinction matters most with small samples or when strong prior information exists. For fundamental descriptive statistics underlying these methods, see our Mean, Median, Mode Calculator and Standard Deviation Calculator.
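One of the simplest settings for comparing the two approaches is a normal prior on the mean combined with normally distributed data of known standard deviation, where the posterior is also normal. The sketch below is an illustration under those assumptions; the prior values are invented and `normal_credible_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist

def normal_credible_interval(prior_mean, prior_sd, sample_mean, sd, n, confidence=0.95):
    """Credible interval for a mean under a normal prior and known data SD."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = n / sd ** 2
    post_var = 1 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(post_var)
    return post_mean - margin, post_mean + margin

# Same data as a sample of 100 with mean 50 and SD 10; two hypothetical priors.
informative = normal_credible_interval(40, 5, 50, 10, 100)
vague = normal_credible_interval(40, 1000, 50, 10, 100)
print(f"Informative prior: [{informative[0]:.2f}, {informative[1]:.2f}]")
print(f"Vague prior:       [{vague[0]:.2f}, {vague[1]:.2f}]")
```

With the vague prior the credible interval is numerically almost the same as the frequentist z-interval for these data, while the informative prior pulls the interval toward the prior mean and narrows it slightly.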
Confidence intervals can drive sample size planning by specifying the desired margin of error in advance. If a political pollster wants a margin of error no wider than plus or minus 2 percentage points at 95% confidence, they need a sample of approximately 2,401 respondents assuming the worst-case scenario of p=0.5. Quadrupling the sample size halves the margin of error, which explains why achieving very precise estimates becomes expensive. In medical research, sample size calculations additionally account for statistical power (typically 80% or higher) and the minimum clinically meaningful effect size, ensuring the study can detect real differences if they exist. Plan your research using our Sample Size Calculator.
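Solving the margin-of-error formula for n gives the required sample size. This is a minimal sketch for a proportion, rounding up to the next whole respondent; `sample_size_for_proportion` is an illustrative name.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, p=0.5, confidence=0.95):
    """Smallest n whose normal-approximation margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for margin in (0.04, 0.02, 0.01):
    n = sample_size_for_proportion(margin)
    print(f"+/- {margin:.0%} needs n = {n:,}")
```

Each halving of the margin roughly quadruples the required sample, from 2,401 at plus or minus 2 points to 9,604 at plus or minus 1 point.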
→ 95% is the default in most fields. Use it unless your field specifies otherwise.[1]
→ Larger samples = narrower intervals. Quadrupling n halves the margin of error.
→ Report the full interval. Saying 52% ± 3% is more informative than just 52%.[2]
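→ Check normality assumptions. For small samples, verify your data is approximately normal before using t-based intervals; z-based intervals are only appropriate when the population standard deviation is known or the sample is large.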
See also: Sample Size · Standard Deviation · Z-Score · Statistics