Mean, median, sample or population standard deviation.
A single average rarely tells the whole story about a dataset. Two classes can both have a mean grade of 12 out of 20, yet one might be tightly clustered around that mean while the other splits into a group of high achievers and a group struggling to keep up. Two factories can produce parts with the same average diameter, yet one ships consistent goods and the other rejects half its output for being out of tolerance. The standard deviation captures exactly the dimension the mean is blind to: how spread out the values are around the centre. It is the single most-used statistic outside of the mean itself, the building block of confidence intervals, hypothesis tests, control charts, and risk metrics, and it is the first number any data analyst computes after the average. This calculator takes a list of numbers (separated by spaces, commas, or newlines), and returns the mean, the median, the variance, the standard deviation under both conventions (sample and population), the range, and the coefficient of variation — enough to characterize a dataset's location and spread in one screen.
The standard deviation is the square root of the variance, and the variance is the average squared deviation from the mean. There are two flavors. The population standard deviation σ uses the formula σ = √(Σ(xᵢ − μ)² / N), where N is the number of values and μ is the mean. The sample standard deviation s uses s = √(Σ(xᵢ − x̄)² / (n − 1)), with the divisor reduced by one — this is Bessel's correction, and it removes a small bias that would otherwise make the sample standard deviation systematically underestimate the true population standard deviation when computed from a finite sample. When the data you have is the entire population (every student in the class, every screw produced today), use the population formula. When the data is a sample drawn from a larger population (a poll of 1 000 voters, a quality-control sample of 50 parts), use the sample formula. The default in this calculator is the sample formula because it matches the most common case: you have a sample and you want to estimate the population SD. The coefficient of variation is the standard deviation divided by the mean, expressed as a percentage; it is unitless and lets you compare spread across datasets with different units or different scales.
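The two conventions can be sketched in a few lines of Python (the function and variable names here are illustrative, not the calculator's internals):

```python
import math

def spread_stats(values, sample=True):
    """Return (mean, variance, standard deviation) under either convention."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    divisor = n - 1 if sample else n           # n - 1 is Bessel's correction
    variance = ss / divisor
    return mean, variance, math.sqrt(variance)

grades = [12, 14, 11, 15, 13, 16, 10, 18, 14, 12]
spread_stats(grades, sample=True)   # sample SD ≈ 2.41
spread_stats(grades, sample=False)  # population SD ≈ 2.29
```

The only difference between the two calls is the divisor, which is why the gap between them shrinks as n grows.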
Paste or type your numbers in the textarea. Separators can be commas, spaces, semicolons, or newlines (or any mixture of those). Non-numeric tokens are silently ignored, so you can paste a column straight out of a spreadsheet without cleaning it up. The toggle below switches between sample and population standard deviation. The result panel shows the SD as the headline KPI, alongside the mean, the median (which is robust to outliers in a way the mean is not), the variance, the range (max minus min), and the coefficient of variation. The default series is ten grades scattered between 10 and 18, which gives a mean of 13.5 and a sample SD of about 2.41 — concrete numbers to play with.
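The tolerant parsing described above could be sketched like this (a plausible reconstruction of the behaviour, not the calculator's actual code):

```python
import re

def parse_numbers(text):
    """Split on commas, semicolons, and any whitespace; skip non-numeric tokens."""
    numbers = []
    for token in re.split(r"[,;\s]+", text):
        try:
            numbers.append(float(token))
        except ValueError:
            continue  # silently ignore headers, stray text, empty tokens
    return numbers

parse_numbers("12, 14; 11\ngrade 15")  # → [12.0, 14.0, 11.0, 15.0]
```

Because failed conversions are simply skipped, a pasted spreadsheet column with a header row parses cleanly.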
Take the dataset 12, 14, 11, 15, 13, 16, 10, 18, 14, 12 (n = 10). The mean is the sum divided by n: (12+14+11+15+13+16+10+18+14+12)/10 = 135/10 = 13.5. The squared deviations from the mean are 2.25, 0.25, 6.25, 2.25, 0.25, 6.25, 12.25, 20.25, 0.25, 2.25, summing to 52.5. Divide by n = 10 (population) and get a variance of 5.25 and a population SD of √5.25 ≈ 2.29. Divide by n − 1 = 9 (sample) and get a variance of ≈ 5.83 and a sample SD of ≈ 2.41. Sorting the data, the median is the average of the fifth and sixth values, (13 + 14)/2 = 13.5 — equal to the mean, suggesting a roughly symmetric distribution. The range is 18 − 10 = 8. The coefficient of variation is 2.41 / 13.5 ≈ 17.9 %, meaning the spread is about a sixth of the average size — a typical level of variability for grades.
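Every number in the worked example can be checked against Python's standard `statistics` module:

```python
import statistics

data = [12, 14, 11, 15, 13, 16, 10, 18, 14, 12]

mean = statistics.mean(data)      # 13.5
median = statistics.median(data)  # 13.5
pop_sd = statistics.pstdev(data)  # ≈ 2.29 (divides by n)
samp_sd = statistics.stdev(data)  # ≈ 2.41 (divides by n - 1)
rng = max(data) - min(data)       # 8
cv = samp_sd / mean * 100         # ≈ 17.9 (percent)
```

`pstdev` and `stdev` correspond exactly to the population and sample conventions described above.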
First, picking the wrong divisor. The sample formula uses n − 1; using n on a small sample biases the SD downward. The difference matters most when n is small: at n = 5, the two differ by about twelve percent (√(5/4) ≈ 1.118); at n = 100, by half a percent. Second, treating the standard deviation as a confidence interval. SD describes the spread of the data; the standard error of the mean (SE = SD / √n) describes the uncertainty about the average. The two differ by a factor of √n. Third, computing SD on data that is not normal. The 68/95/99.7 rule (one, two, three SDs cover those percentages of the data) only holds for a Gaussian distribution. Skewed or fat-tailed data will have far more outliers than the rule predicts. Fourth, mixing units. Variance is in the units squared, SD is in the original units; charts that plot variance against an axis in the original units are misleading. Fifth, ignoring outliers. A single extreme value can inflate the SD beyond all recognition; the median absolute deviation (MAD) is a more robust alternative.
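The SD-versus-SE distinction and the outlier sensitivity can both be made concrete with the default dataset (the outlier value 60 below is arbitrary, chosen only to make the effect visible):

```python
import math
import statistics

data = [12, 14, 11, 15, 13, 16, 10, 18, 14, 12]

# SD describes the spread of the data; SE describes uncertainty about the mean.
sd = statistics.stdev(data)     # ≈ 2.41
se = sd / math.sqrt(len(data))  # ≈ 0.76, smaller by a factor of √n

# One extreme value inflates the SD; the MAD barely moves.
med = statistics.median(data)
mad = statistics.median(abs(x - med) for x in data)      # 1.5

spiked = data + [60]
sd_spiked = statistics.stdev(spiked)                     # jumps to ≈ 14.2
med2 = statistics.median(spiked)
mad2 = statistics.median(abs(x - med2) for x in spiked)  # only 2.0
```

One outlier multiplies the SD almost sixfold while the MAD barely changes — exactly the robustness argument made above.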
The SD has many cousins. The interquartile range (Q3 − Q1) ignores the top and bottom 25 % and is robust to outliers. The median absolute deviation is the median of the absolute deviations from the median — an even more robust spread statistic. Mean absolute deviation uses absolute values rather than squares and is closer to a layperson's intuition of "average distance from the average," but is less mathematically tractable, which is why squares won the historical argument. Weighted standard deviation allows different observations to count more than others (useful when data points represent groups of different sizes). In financial markets, the SD of returns is what people call volatility, usually quoted on an annual basis after multiplying by √(trading days per year). In physics and engineering, the SD is reported as uncertainty on a measurement; in psychometrics it underlies the z-score, in quality control the process capability index Cpk, and in machine learning the standard scaler that normalises features to mean zero and SD one before training.
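Two of these cousins are easy to sketch: a weighted SD where the weights are group sizes, and the annualisation of volatility (the example numbers and the 252-trading-day convention are illustrative assumptions, not data from the text):

```python
import math

def weighted_sd(values, weights):
    """Population-style weighted SD; weights act as group sizes."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total
    return math.sqrt(var)

# Three class sections with mean grades 12, 13, 15 and sizes 20, 30, 10:
weighted_sd([12, 13, 15], [20, 30, 10])  # → 1.0

# Annualising a daily volatility of 1.2 %, assuming 252 trading days:
daily_sd = 0.012
annual_vol = daily_sd * math.sqrt(252)   # ≈ 0.19, i.e. 19 % per year
```

The √252 factor follows from variances of independent daily returns adding up over the year, so SDs scale with the square root of time.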