Variance
Variance is a statistical measurement that is used to determine the spread of numbers in a data set with respect to the average value or the mean. The standard deviation squared will give us the variance. Using variance we can evaluate how stretched or squeezed a distribution is.
There can be two types of variances in statistics, namely, sample variance and population variance. The symbol of variance is given by σ². Variance is widely used in hypothesis testing, checking the goodness of fit, and Monte Carlo sampling. To check how widely individual data points vary with respect to the mean we use variance. In this article, we will take a look at the definition, examples, formulas, applications, and properties of variance.
1. What is Variance?
2. Variance Formula
3. Variance and Standard Deviation
4. How to Find Variance?
5. Variance and Covariance
6. Variance Properties
7. FAQs on Variance
What is Variance?
Variance is a measure of dispersion. A measure of dispersion is a quantity that is used to check the variability of data about an average value. Data can be of two types - grouped and ungrouped. When data is expressed in the form of class intervals it is known as grouped data. On the other hand, if data consists of individual data points, it is called ungrouped data. The sample and population variance can be determined for both kinds of data.
Variance Definition
Population Variance - All the members of a group are known as the population. When we want to find how the data points in a given population vary or are spread out, we use the population variance. It gives the average squared distance of the data points from the population mean.
Sample Variance - If the size of the population is too large then it is difficult to take each data point into consideration. In such a case, a select number of data points are picked from the population to form a sample that can describe the entire group. The sample variance is then defined as the average of the squared distances from the sample mean, with n - 1 rather than n in the denominator.
A general definition of variance is that it is the expected value of the squared differences from the mean.
Variance Example
Suppose we have the data set {3, 5, 8, 1} and we want to find the population variance. The mean is given as (3 + 5 + 8 + 1) / 4 = 4.25. Then by using the definition of variance we get [(3 - 4.25)² + (5 - 4.25)² + (8 - 4.25)² + (1 - 4.25)²] / 4 = 26.75 / 4 = 6.6875. Thus, variance ≈ 6.69.
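The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and `statistics.pvariance` from the standard library is used as an independent cross-check of the hand computation.

```python
from statistics import mean, pvariance

data = [3, 5, 8, 1]

mu = mean(data)  # (3 + 5 + 8 + 1) / 4

# Average squared distance from the mean, dividing by n (population form).
by_hand = sum((x - mu) ** 2 for x in data) / len(data)

print(mu)               # 4.25
print(by_hand)          # 6.6875
print(pvariance(data))  # 6.6875
```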
Variance Formula
Depending upon the type of data available and what needs to be determined, the variance formula can be given as follows:
Grouped Data Sample Variance = \(\frac{\sum f_{i}\left ( M_{i}-\overline{X} \right )^{2}}{N - 1}\)
Grouped Data Population Variance = \(\frac{\sum f_{i}\left ( M_{i}-\overline{X} \right )^{2}}{N}\)
Ungrouped Data Sample Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)
Ungrouped Data Population Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n}\)
where \(\overline{X}\) stands for the mean, \(M_{i}\) is the midpoint of the ith interval, \(f_{i}\) is the frequency of the ith interval, \(X_{i}\) is the ith data point, N is the sum of all frequencies and n is the number of observations.
Mean for grouped data = \(\frac{\sum M_{i}f_{i}}{\sum f_{i}}\)
The general formula for variance is given as,
Var(X) = E[(X - μ)²]
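Expanding the square in the general formula gives an equivalent form that is often easier to compute. The steps below assume only that μ = E[X] and that expectation is linear:

\(E\left [ (X-\mu )^{2} \right ] = E\left [ X^{2}-2\mu X+\mu ^{2} \right ]\)
\(= E\left [ X^{2} \right ]-2\mu E\left [ X \right ]+\mu ^{2}\)
\(= E\left [ X^{2} \right ]-\mu ^{2}\)

For the data set {3, 5, 8, 1}, the mean of the squares is (9 + 25 + 64 + 1) / 4 = 24.75 and the squared mean is 4.25² = 18.0625, so the population variance is 24.75 - 18.0625 = 6.6875.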
Variance and Standard Deviation
When we take the square of the standard deviation we get the variance of the given data. Intuitively, the variance is a numerical value that measures the variability of data about the mean: it reflects how far the data points lie from the average and, as a result, from each other. The standard deviation measures the same dispersion relative to the mean, but in the original units of the data. σ² is the symbol to denote variance and σ represents the standard deviation. Variance is expressed in square units while the standard deviation has the same unit as the data.
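The relationship between the two quantities, and the difference in units, can be illustrated with a short Python sketch. The height values are hypothetical and chosen only so that the numbers are easy to follow.

```python
from math import isclose
from statistics import pstdev, pvariance

# Hypothetical measurements in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0]

sigma = pstdev(heights_cm)        # standard deviation, in cm
sigma_sq = pvariance(heights_cm)  # variance, in cm squared

print(sigma_sq)                       # 125.0
print(isclose(sigma ** 2, sigma_sq))  # True
```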
How to Find Variance?
The following steps can be used to find the variance of ungrouped data:
- Find the mean of the observations. This can be done by dividing the sum of all observations by the number of observations.
- Subtract the mean from each observation.
- Square each of these values.
- Add all the values obtained in the previous step.
- Divide the value from step 4 by n (for population variance) or n - 1 (for sample variance).
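The steps above can be sketched as a small Python function. The function name `variance` and the `sample` flag are illustrative choices, not part of any particular library.

```python
def variance(observations, sample=False):
    """Variance of ungrouped data, following the five steps one by one."""
    n = len(observations)
    if n < (2 if sample else 1):
        raise ValueError("not enough observations")

    mean = sum(observations) / n                   # step 1: find the mean
    deviations = [x - mean for x in observations]  # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]         # step 3: square each value
    total = sum(squared)                           # step 4: add the squares
    return total / (n - 1 if sample else n)        # step 5: divide by n or n - 1


print(variance([3, 5, 8, 1]))                    # 6.6875 (population)
print(variance([3, 4, 7, 12, 14], sample=True))  # 23.5 (sample)
```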
Variance of Binomial Distribution
A binomial distribution is defined as a discrete probability distribution that details the number of successes when a binomial experiment is conducted n times. Each trial is independent and its outcome can only be either 0 (failure) or 1 (success). Say we have a binomial experiment that consists of n trials and the probability of success in each trial is given by p, then the variance of the binomial distribution is given as:
σ² = np(1 - p).
Here, np is also equal to the mean.
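As a quick check, the mean and variance can be computed directly from the binomial probabilities and compared with np and np(1 - p). The values n = 10 and p = 0.3 below are hypothetical, and `binomial_pmf` is a helper written for this sketch.

```python
from math import comb, isclose


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.3  # hypothetical parameters

mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))

print(isclose(mean, n * p))           # True
print(isclose(var, n * p * (1 - p)))  # True
```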
Variance of Poisson Distribution
Poisson distribution is another type of discrete probability distribution that gives the probability of a certain number of events taking place within a specific time frame. The parameter of a Poisson distribution is given by λ. In this distribution the mean and the variance are equal. The variance of the Poisson distribution is given by:
σ² = λ
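The equality of mean and variance can be checked numerically by summing over the Poisson probabilities. In this sketch λ = 4 is a hypothetical value, and the infinite sum is cut off at k = 100, where the remaining probability is negligible for this λ.

```python
from math import exp, isclose


def poisson_pmf_table(lam, k_max):
    """Return [P(X=0), ..., P(X=k_max)] via P(k) = P(k-1) * lam / k."""
    probs = [exp(-lam)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * lam / k)
    return probs


lam = 4.0  # hypothetical rate parameter
probs = poisson_pmf_table(lam, 100)

mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

print(isclose(mean, lam))  # True
print(isclose(var, lam))   # True
```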
Variance of Uniform Distribution
Uniform distribution is a type of continuous probability distribution. It is also known as a rectangular distribution as the outcome of the experiment will lie between a minimum and maximum bound. If a is the minimum bound and b is the maximum bound, then the variance of uniform distribution is as follows:
σ² = (1/12)(b - a)²
The mean is given by (b + a) / 2.
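One way to see where these formulas come from is to approximate the defining integrals numerically. The sketch below uses a simple midpoint rule with the constant density 1 / (b - a). The bounds a = 2 and b = 8 are hypothetical, and the result is an approximation, so it is compared with a tolerance.

```python
from math import isclose


def uniform_mean_and_variance(a, b, steps=100_000):
    """Approximate the mean and variance of Uniform(a, b) by the midpoint rule."""
    width = (b - a) / steps
    density = 1.0 / (b - a)
    midpoints = [a + (i + 0.5) * width for i in range(steps)]
    mean = sum(x * density * width for x in midpoints)
    variance = sum((x - mean) ** 2 * density * width for x in midpoints)
    return mean, variance


a, b = 2.0, 8.0  # hypothetical bounds
mean, variance = uniform_mean_and_variance(a, b)

print(isclose(mean, (a + b) / 2, rel_tol=1e-6))            # True
print(isclose(variance, (b - a) ** 2 / 12, rel_tol=1e-6))  # True
```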
Variance and Covariance
Variance is used to describe the spread of the data set and identify how far each data point lies from the mean. Covariance shows how two random variables are related to each other: it measures how the two variables vary together. A positive covariance implies that the variables tend to move in the same direction, while a negative covariance means that they tend to move in opposite directions. Suppose we have two random variables x and y. Covariance is symmetric, so it does not matter which of the two is treated as the dependent variable and which as the independent variable. Let n be the number of data points in the sample, \(\overline{x}\) the mean of x and \(\overline{y}\) the mean of y; then the formula for the sample covariance is given below:
cov (x, y) = \(\frac{\sum_{i = 1}^{n}(x_{i} - \overline{x})(y_{i} - \overline{y})}{n - 1}\)
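The covariance formula can be sketched in Python as follows. The function name `sample_covariance` and the three small data sets are illustrative. The last call shows that the covariance of a variable with itself is its sample variance.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, dividing by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 6, 8, 10]    # rises with xs
ys_down = [10, 8, 6, 4, 2]  # falls as xs rises

print(sample_covariance(xs, ys_up))    # 5.0
print(sample_covariance(xs, ys_down))  # -5.0
print(sample_covariance(xs, xs))       # 2.5
```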
Variance Properties
Some of the properties of variance that can help in solving both simple and complicated problems are given below.
- If the value of the variance is 0, it indicates that all the data points in the data set are of equal value.
- A large variance implies that the data is more vastly spread out from the mean. Similarly, a small variance shows that the values of the data points are closer together and are clustered around the mean.
- Var(X + C) = Var(X), where X is a random variable and C is a constant.
- Var(aX + b) = a² Var(X), where a and b are constants.
- Var(CX) = C² Var(X), where C is a constant.
- Var(X1 + X2 + … + Xn) = Var(X1) + Var(X2) + … + Var(Xn), where X1, X2, …, Xn are independent random variables.
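The algebraic properties in this list can be verified exactly on a small example. The sketch below assumes a fair six-sided die as the random variable and uses `fractions.Fraction` to avoid rounding. The helper `var` and the constants a = 3 and b = 7 are illustrative, and the helper expects integer outcomes.

```python
from fractions import Fraction
from itertools import product


def var(outcomes):
    """Exact population variance of equally likely integer outcomes."""
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((x - mean) ** 2 for x in outcomes) / n


die = [1, 2, 3, 4, 5, 6]
a, b = 3, 7  # hypothetical constants

# Shifting by a constant leaves the variance unchanged.
assert var([x + b for x in die]) == var(die)

# Scaling by a multiplies the variance by a squared.
assert var([a * x + b for x in die]) == a ** 2 * var(die)

# For two independent dice, the variance of the sum is the sum of the variances.
two_dice = [x + y for x, y in product(die, die)]
assert var(two_dice) == var(die) + var(die)

print(var(die))  # 35/12
```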
Important Notes on Variance
- Variance is a measure of the variability of data and describes how the data points are spread out with respect to the mean.
- There can be two types of variance - sample variance and population variance.
- There can be two kinds of data - grouped and ungrouped. Thus, we can have grouped sample variance, ungrouped sample variance, grouped population variance, and ungrouped population variance.
- The variance is the standard deviation squared.
- Covariance describes how two random variables are related to each other.
Examples on Variance
Example 1: Find the sample variance of the data (3, 4, 7, 12, 14).
Solution: n = 5
Mean = (3 + 4 + 7 + 12 + 14) / 5 = 8
Sample Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)
= [(3 - 8)² + (4 - 8)² + (7 - 8)² + (12 - 8)² + (14 - 8)²] / (5 - 1) = 94 / 4 = 23.5
Answer: Variance = 23.5
Example 2: Find the population variance of the data (1.2, 4.5, 6.7, 2.3).
Solution: n = 4
Mean = (1.2 + 4.5 + 6.7 + 2.3) / 4 = 3.675
Population Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n}\)
= [(1.2 - 3.675)² + (4.5 - 3.675)² + (6.7 - 3.675)² + (2.3 - 3.675)²] / 4 = 17.8475 / 4 ≈ 4.462
Answer: Variance ≈ 4.462
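Example 3: Find the sample variance of the following grouped data.

| Class | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 | 50 - 60 |
|---|---|---|---|---|---|
| Frequency | 2 | 5 | 7 | 3 | 1 |

Solution:
The width of each class interval is 10. Take B = 35, the midpoint of the class 30 - 40, as the assumed mean.

| Class | Frequency (f) | \(M_{i}\) | d = (\(M_{i}\) - B) / 10 | fd | fd² |
|---|---|---|---|---|---|
| 10 - 20 | 2 | 15 | -2 | -4 | 8 |
| 20 - 30 | 5 | 25 | -1 | -5 | 5 |
| 30 - 40 | 7 | 35 = B | 0 | 0 | 0 |
| 40 - 50 | 3 | 45 | 1 | 3 | 3 |
| 50 - 60 | 1 | 55 | 2 | 2 | 4 |
| Total | n = 18 | | | -4 | 20 |

Mean = \(\frac{\sum M_{i}f_{i}}{\sum f_{i}}\) = 590 / 18 = 32.778.
Variance = \(\frac{\sum fd^{2} - \frac{(\sum fd)^{2}}{n}}{n-1} \cdot 10^{2}\) = \(\frac{20 - \frac{16}{18}}{17} \cdot 100\) = 112.4183. [This formula can be derived from \(\frac{\sum f\left ( M_{i}-\overline{X} \right )^{2}}{N - 1}\) to simplify calculations.]
Answer: Variance = 112.4183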
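The result of Example 3 can be cross-checked by applying the grouped data sample variance formula directly to the class midpoints, without the shortcut. The function name and the representation of classes as (lower, upper) pairs are assumptions made for this sketch.

```python
def grouped_sample_variance(class_bounds, frequencies):
    """Mean and sample variance of grouped data, using class midpoints."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_bounds]
    total = sum(frequencies)  # N, the sum of all frequencies
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / total
    squares = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
    return mean, squares / (total - 1)


classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [2, 5, 7, 3, 1]

mean, var = grouped_sample_variance(classes, freqs)
print(round(mean, 3))  # 32.778
print(round(var, 4))   # 112.4183
```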
FAQs on Variance
What is Variance in Statistics?
Variance in Statistics is a measure of dispersion that indicates the variability of the data points with respect to the mean. Sample Variance and Population Variance are the two types of variance.
How to Calculate Variance?
The variance of ungrouped data can be calculated by using the following steps:
- Find the mean and subtract it from each data point.
- Take the summation of the squares of the values obtained in step 1.
- Divide this value by n (the number of observations) for the population variance, or by n - 1 for the sample variance.
What Does Variance Tell You About Data?
Variance tells us how spread out the data is with respect to the mean. If the data is more widely spread out with reference to the mean then the variance will be higher. If the data is clustered near the mean then the variance will be lower.
What is Variance and Standard Deviation?
The square of the standard deviation gives us the variance. The standard deviation will have the same unit as the data while the unit of the variance will differ as it is a squared value.
Is Variance a Measure of Dispersion?
Variance and standard deviation are the most commonly used measures of dispersion. Standard deviation is the square root of the variance. These measures help to determine the dispersion of the data points with respect to the mean.
Is Variance a Measure of Central Tendency?
Variance is not a measure of central tendency. There are three measures of central tendency, namely, mean, median, and mode. Variance is a measure of dispersion.
What are the Advantages of Variance?
One of the major advantages of variance is that it treats deviations above and below the mean alike, because every deviation is squared. Moreover, variance can be used to check the variability within the data set.