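Suppose we are given n numbers [x1, x2, x3, …, xn] whose mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The variance of these n numbers is given by the formula:
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
And the standard deviation of the given n numbers is given by the formula:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Note that the variance calculated with this formula is the true (population) variance, which treats the n numbers as the entire population. If the numbers are instead a sample and we want to use the sample variance to estimate the variance of the population, we need to use (n-1) in the denominator instead of n. So, the sample variance of the given n numbers is given by the formula:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
And the sample standard deviation is given by the formula:
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$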
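To make the difference between the two denominators concrete, the definitions can be written out directly in plain Python. This is only a minimal sketch: the function names `variance` and `std_dev`, the `ddof` parameter name and the example list are assumptions chosen for illustration, not part of the original text.

```python
import math


def variance(values, ddof=0):
    """Variance of `values`, dividing by (n - ddof).

    ddof=0 gives the true (population) variance,
    ddof=1 gives the sample variance.
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    mean = sum(values) / n
    # Sum of squared deviations from the mean
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - ddof)


def std_dev(values, ddof=0):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values, ddof))


# Hypothetical example data: mean is 5, squared deviations sum to 32
example = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(example))          # 32 / 8 = 4.0
print(std_dev(example))           # sqrt(4.0) = 2.0
print(variance(example, ddof=1))  # 32 / 7, slightly larger than 4.0
```

Dividing by (n-1) always gives a slightly larger value than dividing by n, and the gap shrinks as n grows.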
In Python, we can easily calculate the true variance, sample variance, true standard deviation and sample standard deviation of a set of n numbers using the numpy module.
import numpy

data = [1, 2, 3, 3, 4, 5, 5, 5, 8, 10]

# By default (ddof=0), numpy.var and numpy.std divide by n.
# Passing ddof=1 makes them divide by (n - 1) instead.
true_variance = numpy.var(data)
sample_variance = numpy.var(data, ddof=1)
true_std = numpy.std(data)
sample_std = numpy.std(data, ddof=1)

print("True Variance: ", true_variance)
print("Sample Variance: ", sample_variance)
print("True Standard Deviation: ", true_std)
print("Sample Standard Deviation: ", sample_std)
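For this data, the above program reports a true variance of about 6.64, a sample variance of about 7.3778, a true standard deviation of about 2.5768 and a sample standard deviation of about 2.7162 (the printed values carry more decimal places).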
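As a cross-check that does not depend on numpy, the same four quantities can be computed with Python's standard-library `statistics` module. This is a sketch of an alternative approach, not part of the original program; it reuses the same data list, and the `p` prefix in `pvariance` and `pstdev` stands for "population".

```python
import statistics

data = [1, 2, 3, 3, 4, 5, 5, 5, 8, 10]

# Population ("true") versions divide by n
true_variance = statistics.pvariance(data)
true_std = statistics.pstdev(data)

# Sample versions divide by (n - 1)
sample_variance = statistics.variance(data)
sample_std = statistics.stdev(data)

print("True Variance: ", true_variance)
print("Sample Variance: ", sample_variance)
print("True Standard Deviation: ", true_std)
print("Sample Standard Deviation: ", sample_std)
```

The results should agree with the numpy values up to floating-point rounding.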