Statistics is the language of data. It is how scientists determine whether a drug works, how pollsters predict elections, how quality engineers detect manufacturing defects, and how financial analysts assess risk. Yet most people encounter statistics only as an intimidating college requirement. This guide explains the essential concepts in plain language, with real-world examples that show why each concept matters and how to use the tools that do the math for you.
These three numbers each describe the "middle" of a dataset, but they do it differently — and which one you choose matters.
Mean (average): Add all values and divide by the count. For the dataset {4, 7, 8, 9, 12}: mean = 40 ÷ 5 = 8. The mean is sensitive to extreme values (outliers). If one value changes from 12 to 120, the mean jumps from 8 to 29.6, even though most values are still single digits.
Median: The middle value when data is sorted. For {4, 7, 8, 9, 12}: median = 8 (the third value in a five-item set). For even-numbered datasets, the median is the average of the two middle values. The median is resistant to outliers — changing 12 to 120 does not change the median at all. This is why median household income ($74,580 in the U.S.) is preferred over average household income ($105,000+) — the average is inflated by very high earners.
Mode: The most frequently occurring value. For {4, 7, 7, 9, 12}: mode = 7. A dataset can have no mode (all unique values), one mode, or multiple modes (bimodal, multimodal). Mode is most useful for categorical data: "the most common shoe size" or "the most popular color."
Use the Mean, Median, Mode Calculator to compute all three instantly for any dataset.
When to use which: Use mean for symmetric data without outliers (test scores, temperatures). Use median for skewed data or data with outliers (income, home prices, age at a concert). Use mode for categorical data or to find the most common value.
Standard deviation measures dispersion — how far individual values typically fall from the mean. A low standard deviation means data points cluster tightly around the average. A high standard deviation means they are spread widely.
Worked example: Two classrooms both average 80% on a test, but with very different distributions:
| Classroom | Scores | Mean | Standard Deviation |
|---|---|---|---|
| Class A | 78, 79, 80, 81, 82 | 80 | 1.6 |
| Class B | 55, 70, 80, 90, 105 | 80 | 18.7 |
Both classes average 80, but Class A's scores are nearly identical (SD = 1.6), while Class B has a huge range from 55 to 105 (SD = 18.7). The standard deviation reveals what the mean hides: in Class B, several students are struggling badly while others are excelling. Use the Standard Deviation Calculator to compute standard deviation for any dataset.
Many natural phenomena produce data that follows a characteristic bell-shaped pattern: most values cluster near the center, with progressively fewer values as you move toward the extremes. This pattern is called the normal distribution.
When data is normally distributed, a powerful set of rules applies:
| Range | Percentage of Data | Example (Mean=100, SD=15, like IQ) |
|---|---|---|
| Within ±1 SD of mean | 68.2% | 85–115 |
| Within ±2 SD of mean | 95.4% | 70–130 |
| Within ±3 SD of mean | 99.7% | 55–145 |
This is called the 68-95-99.7 rule (or empirical rule). It only applies to normally distributed data.
This rule has profound practical implications. If a manufacturing process produces bolts with a mean diameter of 10.0 mm and a standard deviation of 0.02 mm, you know that 99.7% of bolts will be between 9.94 and 10.06 mm. Any bolt outside ±3 SD is almost certainly defective. This is the foundation of quality control in every industry.
A z-score converts any value to a universal scale that tells you how many standard deviations it is from the mean.
Formula: z = (value − mean) ÷ standard deviation
Example: On a test with mean = 75 and SD = 10, a score of 90 has a z-score of (90 − 75) ÷ 10 = 1.5. This means the score is 1.5 standard deviations above average. From the normal distribution table, a z-score of 1.5 corresponds to approximately the 93rd percentile — better than 93% of all scores.
Z-scores solve the comparison problem. If you scored 85 on a history test (mean 70, SD 10) and 42 on a chemistry test (mean 35, SD 5), which performance was better? The history z-score is 1.5; the chemistry z-score is 1.4. You performed slightly better on history relative to each class, despite the raw numbers being on completely different scales. Use the Z-Score Calculator to convert any value.
Probability is a number between 0 and 1 (or 0% and 100%) that expresses the likelihood of an event occurring. 0 means impossible; 1 means certain.
Basic probability: P(event) = favorable outcomes ÷ total outcomes. A fair die has P(rolling a 3) = 1/6 ≈ 16.7%.
Independent events: When one event does not affect another, multiply probabilities. P(two heads in a row) = 0.5 × 0.5 = 0.25 (25%). P(rolling 6 three times in a row) = (1/6)³ ≈ 0.46%.
Complementary events: P(not A) = 1 − P(A). If there is a 30% chance of rain, there is a 70% chance of no rain.
For more complex probability scenarios involving successes in multiple trials, the Binomial Probability Calculator computes exact probabilities. For general probability calculations, use the Probability Calculator.
A percentile tells you the percentage of values in a dataset that fall below a given value. If your test score is at the 85th percentile, you scored higher than 85% of test-takers.
Percentiles are how standardized tests (SAT, GRE, ACT), growth charts (pediatric height and weight), and many health metrics (blood pressure, bone density) report results. They provide context that raw numbers alone cannot: a blood pressure reading of 135/85 is more meaningful when you know it is at the 75th percentile for your age group.
Z-scores and percentiles are directly related through the normal distribution. A z-score of 0 = 50th percentile. A z-score of +1 ≈ 84th percentile. A z-score of +2 ≈ 98th percentile. A z-score of −1 ≈ 16th percentile.
Confusing correlation with causation. Ice cream sales and drowning deaths both increase in summer. They are correlated, but ice cream does not cause drowning — a third variable (warm weather) drives both. Correlation measures the strength of a relationship; it says nothing about which variable causes the other, or whether a hidden variable drives both.
Ignoring sample size. Flipping a coin 10 times and getting 7 heads (70%) does not mean the coin is unfair. With only 10 trials, high variability is expected. Flipping 10,000 times and getting 7,000 heads (70%) would be strong evidence of an unfair coin. Larger samples produce more reliable statistics.
Reporting mean when median is more appropriate. Average net worth in the 55–64 age group is $1.57 million; median is $364,500. The average is pulled up by billionaires and is not representative of the typical household. When data is skewed, the median is almost always the better summary statistic.
Cherry-picking time periods. "The stock market returned 25% last year" and "The stock market lost 15% in the last two years" can both be true depending on the time period selected. Always ask: what is the full time period, and is the selected window representative?
Compute statistics instantly. Use the free Statistics Calculator for full descriptive statistics, the Z-Score Calculator for standardized scores, and the Probability Calculator for likelihood calculations — no signup required.
Related tools: Standard Deviation Calculator · Mean, Median, Mode Calculator · Binomial Probability Calculator · Percentage Calculator · Percent Error Calculator