Sample Variance
Sample variance measures the variability in a given sample. A sample is a set of observations drawn from a population and chosen to represent it. The sample variance is measured with respect to the mean of the data set. It is also known as the estimated variance.
Data can be grouped or ungrouped, so there are two formulas for calculating the sample variance. The square root of the sample variance gives the sample standard deviation. In this article, we will elaborate on sample variance, its formulas, and various examples.
1. What is Sample Variance?
2. Sample Variance Formula
3. How to Calculate Sample Variance?
4. Sample Variance vs Population Variance
5. FAQs on Sample Variance
What is Sample Variance?
Sample variance measures the spread of the data points in a given data set around the mean. All observations of a group are known as the population. As the number of observations grows, it becomes difficult to calculate the variance of the population. In such a situation, a certain number of observations are picked out to describe the entire group. This specific set of observations forms a sample, and the variance calculated from it is the sample variance.
Sample Variance Definition
Sample variance can be defined as the sum of the squared differences of the data points from the mean of the data set, divided by one less than the number of data points. It is an absolute measure of dispersion and is used to check the deviation of data points with respect to the data's average.
Sample Variance Example
Suppose a data set is given as 3, 21, 98, 17, and 9. The mean (29.6) of the data set is determined. The mean is subtracted from each data point and the summation of the square of the resulting values is taken. This gives 6043.2. To get the sample variance, this number is divided by one less than the total number of observations. Thus, the sample variance is 1510.8.
Sample Variance Formula
There can be two types of data - grouped and ungrouped. When data is in a raw and unorganized form it is known as ungrouped data. When this data is sorted into groups, categories, or tables it is known as grouped data. The sample variance formulas for both types of data are specified below:
- Ungrouped Data: \(s^{2} = \frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\)
- Grouped data: \(s^{2} = \frac{\sum_{i=1}^{n}f_{i}\left ( m_{i}-\overline{x} \right )^{2}}{N - 1}\)
n = total number of observations (ungrouped data) or number of class intervals (grouped data)
N = \(\sum_{i=1}^{n} f_{i}\), the total number of observations in grouped data
\(f_{i}\) = frequency of the ith interval
\(m_{i}\) = mid-point of the ith interval
Mean for grouped data, \(\overline{x}\) = \(\frac{\sum_{i=1}^{n} m_{i}f_{i}}{\sum_{i=1}^{n} f_{i}}\)
Mean for ungrouped data, \(\mu = \frac{\sum_{i=1}^{n}x_{i}}{n}\)
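Dividing by n - 1 instead of n makes the sample variance an unbiased estimator of the population variance: averaged over all possible random samples, it equals the population variance.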
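The "on average" claim can be checked exactly on a toy case. The sketch below is only an illustration, assuming a hypothetical population {1, 2, 3, 4} and samples of size 2 drawn with replacement. It averages the variance over every possible sample, once with the n - 1 divisor and once with the n divisor, and compares both with the population variance.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical toy population, chosen only for illustration.
population = [1, 2, 3, 4]

# Every ordered sample of size 2 drawn with replacement (16 samples in total).
samples = list(product(population, repeat=2))

# Average over all samples using the divisor n - 1 (sample variance formula).
average_sample_variance = mean(variance(s) for s in samples)

# Average over all samples using the divisor n (population formula applied to each sample).
average_with_divisor_n = mean(pvariance(s) for s in samples)

# Variance of the whole population.
population_variance = pvariance(population)

print(average_sample_variance, average_with_divisor_n, population_variance)
```

In this toy case the n - 1 version averages to exactly the population variance, while the n version averages to a smaller value.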
Let us understand the sample variance formula with the help of an example.
Example: There are 45 students in a class. 5 students were randomly selected from this class and their heights (in cm) were recorded as follows:
Heights (in cm): 131, 148, 139, 142, 152
Sample size (n) = 5
Sample Mean = (131 + 148 + 139 + 142 + 152) / 5 = 712 / 5 = 142.4 cm
Using the sample variance formula,
Sample Variance = \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\) = \(\frac{\sum_{i=1}^{5}(x_{i}-142.4)^{2}}{5-1}\)
= \(\frac{(131-142.4)^{2}+(148-142.4)^{2}+(139-142.4)^{2}+(142-142.4)^{2}+(152-142.4)^{2}}{4}\) = 66.3 cm²
Answer: Sample Mean = 142.4 cm, Sample Variance = 66.3 cm².
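The same result can be reproduced with Python's standard `statistics` module. This is a minimal sketch rather than part of the original solution. The variable names are illustrative, and `statistics.variance` is used because it divides by n - 1.

```python
from statistics import mean, variance

# Heights (in cm) of the 5 randomly selected students from the example.
heights = [131, 148, 139, 142, 152]

sample_mean = mean(heights)          # sum of the values divided by n
sample_variance = variance(heights)  # sum of squared deviations divided by n - 1

print(f"Sample mean = {sample_mean:.1f} cm")
print(f"Sample variance = {sample_variance:.1f} cm^2")
```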
How to Calculate Sample Variance?
The exact steps depend on the type of data available, but the general procedure is given below:
Suppose the data set is given as {5, 6, 1}
- Step 1: Calculate the mean of the data set by adding all data values and dividing by the sample size n. Thus, (5 + 6 + 1) / 3 = 4
- Step 2: Subtract the mean from each data point in the data set. This gives (5 - 4), (6 - 4), (1 - 4).
- Step 3: Take the square of the values obtained in step 2; \((5 - 4)^{2} = 1\), \((6 - 4)^{2} = 4\), \((1 - 4)^{2} = 9\)
- Step 4: Add all the squared differences from step 3; 1 + 4 + 9 = 14
- Step 5: To get the sample variance, divide this value by one less than the total number of observations; 14 / (3 - 1) = 7. Thus, for the given example the sample variance is 7.
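The five steps above can be sketched as a small Python function. The function name `sample_variance` and the check for fewer than two observations are assumptions added for illustration. The steps themselves follow the list directly.

```python
def sample_variance(data):
    """Sample variance of ungrouped data, following the five steps above."""
    n = len(data)
    if n < 2:
        # With a single observation, the divisor n - 1 would be zero.
        raise ValueError("at least two observations are needed")
    mean = sum(data) / n                      # Step 1: mean of the data set
    deviations = [x - mean for x in data]     # Step 2: subtract the mean
    squared = [d ** 2 for d in deviations]    # Step 3: square each difference
    total = sum(squared)                      # Step 4: add the squares
    return total / (n - 1)                    # Step 5: divide by n - 1


print(sample_variance([5, 6, 1]))
```

Keeping each step in its own variable makes it easy to print the intermediate values and compare them with a hand calculation.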
Sample Variance vs Population Variance
Both sample variance and population variance measure how far the data points are spread from the mean of the data set. However, for the same data, the sample variance formula gives a larger value than the population variance formula because it divides by n - 1 instead of n. The table given below outlines the difference between sample variance and population variance.
Sample Variance | Population Variance |
---|---|
When the variance is calculated using the sample data it gives the sample variance. | When the variance is calculated using the entire data, also known as the population, it gives the population variance. |
The formula for sample variance is given as \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\) | The formula for population variance is equal to \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n}\) |
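The effect of the two divisors can be seen by applying both formulas to the same numbers. The sketch below is only an illustration, using the data set {5, 6, 1} from the step-by-step section together with `statistics.variance` (divisor n - 1) and `statistics.pvariance` (divisor n).

```python
from statistics import pvariance, variance

data = [5, 6, 1]
n = len(data)

s2 = variance(data)       # divisor n - 1: data treated as a sample
sigma2 = pvariance(data)  # divisor n: data treated as the whole population

print(f"Sample formula:     {s2:.3f}")
print(f"Population formula: {sigma2:.3f}")

# The two results differ only by the factor n / (n - 1).
print(abs(s2 - sigma2 * n / (n - 1)) < 1e-12)
```

Which function is appropriate depends on whether the data are a sample or the complete population. The numbers themselves do not decide this.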
Important Notes on Sample Variance
- The variance that is computed using the sample data is known as the sample variance.
- Sample variance can be defined as the sum of the squared differences from the mean, divided by one less than the number of observations.
- There are two formulas to calculate the sample variance: \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\) (ungrouped data) and \( \frac{\sum_{i=1}^{n}f_{i}\left ( m_{i}-\overline{x} \right )^{2}}{N - 1}\) (grouped data)
Examples on Sample Variance
Example 1: Given the data set {4.5, 9.8, 2.3, 5.3, 8.9}, find the sample variance.
Solution: n = 5, \(\mu\) = (4.5 + 9.8 + 2.3 + 5.3 + 8.9) / 5 = 6.16
Sample Variance = \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\)
= \(\frac{(4.5 - 6.16)^{2}+ (9.8- 6.16)^{2}+ (2.3- 6.16)^{2}+ (5.3- 6.16)^{2}+ (8.9- 6.16)^{2}}{5-1}\) = 9.788
Answer: Sample variance = 9.788
Example 2: Find the sample variance of the following data:
Class interval | f |
---|---|
10 - 20 | 7 |
20 - 30 | 2 |
30 - 40 | 10 |
40 - 50 | 1 |
Solution:
Class interval | f | \(m_{i}\) | \((m_{i} - \overline{x})^{2}\) | \(f(m_{i} - \overline{x})^{2}\) |
---|---|---|---|---|
10 - 20 | 7 | 15 | 156.25 | 1093.75 |
20 - 30 | 2 | 25 | 6.25 | 12.5 |
30 - 40 | 10 | 35 | 56.25 | 562.5 |
40 - 50 | 1 | 45 | 306.25 | 306.25 |
\(\sum_{i=1}^{n} f_{i}\) = 20
\(\sum_{i=1}^{n}f_{i}\left ( m_{i}-\overline{x} \right )^{2}\) = 1975
\(\overline{x}\) = \(\frac{\sum_{i=1}^{n} m_{i}f_{i}}{\sum_{i=1}^{n} f_{i}}\) = 27.5
Sample Variance = \( \frac{\sum_{i=1}^{n}f_{i}\left ( m_{i}-\overline{x} \right )^{2}}{N - 1}\) = 1975 / (20 - 1) = 103.95
Answer: Sample variance = 103.95
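The grouped-data calculation in Example 2 can be sketched in Python as follows. The function name `grouped_sample_variance` and the representation of class intervals as (lower, upper) pairs are assumptions made for this illustration. The arithmetic follows the grouped formula with mid-points \(m_{i}\) and frequencies \(f_{i}\).

```python
def grouped_sample_variance(intervals, frequencies):
    """Sample variance of grouped data given class intervals and frequencies."""
    # Mid-point m_i of each class interval.
    midpoints = [(low + high) / 2 for low, high in intervals]
    # N: total number of observations, the sum of all frequencies.
    total = sum(frequencies)
    if total < 2:
        raise ValueError("at least two observations are needed")
    # Mean for grouped data: sum of m_i * f_i divided by N.
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / total
    # Sum of f_i * (m_i - mean)^2 over all intervals.
    weighted_squares = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
    return weighted_squares / (total - 1)


intervals = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [7, 2, 10, 1]
print(round(grouped_sample_variance(intervals, frequencies), 2))
```

The printed value is 1975 / 19 rounded to two decimal places.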
Example 3: There were 105 oak trees in a forest. 6 were randomly selected and their heights were recorded in meters. Find the variance and standard deviation in the heights.
Heights (in m) = {43, 65, 52, 70, 48, 57}
Solution: As the variance of a sample needs to be calculated, the formula for sample variance is used.
n = 6, Mean = (43 + 65 + 52 + 70 + 48 + 57) / 6 = 55.833 m.
Sample Variance = \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\) = 526.833 / (6 - 1) = 105.367 m²
Sample Standard Deviation = √105.367 = 10.26 m
Answer: Sample Variance = 105.367 m², Sample Standard Deviation = 10.26 m
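As a quick check of Example 3, the sketch below uses `statistics.variance` and `statistics.stdev` from Python's standard library, both of which use the n - 1 divisor. The variable names are illustrative only.

```python
from math import sqrt
from statistics import stdev, variance

# Heights (in m) of the 6 randomly selected oak trees.
heights_m = [43, 65, 52, 70, 48, 57]

s2 = variance(heights_m)  # sample variance, in square meters
s = stdev(heights_m)      # sample standard deviation, in meters

# The standard deviation is the square root of the variance.
s_from_variance = sqrt(s2)

print(f"Sample variance = {s2:.3f} m^2")
print(f"Sample standard deviation = {s:.2f} m")
```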
Example 4: If all values in a data set are the same, what is the sample variance?
Solution:
Variance is the degree of spread or change in the given data points. The variance is calculated in relation to the mean of the data. The more the spread of the data, the more will be the variance in relation to the mean.
The formula for variance:
\(s^{2} = \frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\), where
\(s^{2}\) = sample variance
\(x_{i}\) = each data value
μ = mean of the data set
n = total number of values in the data set.
Special case: When all the data set points are the same
In this case, the mean of the data set i.e. μ is the same as each data value i.e. \(x_{i}\)
Thus, \(x_{i}\) - μ = 0
Hence, every squared difference is 0 and the sample variance is 0.
Answer: Sample variance = 0
FAQs on Sample Variance
What is the Sample Variance in Statistics?
In statistics, sample variance is calculated on the basis of sample data and is used to determine the deviation of data points from the mean.
What is the Symbol of Sample Variance?
Variance is usually represented using sigma squared, written as \(\sigma ^{2}\). However, to avoid confusion between population and sample variance, the latter is represented as \(s^{2}\).
What is the Formula for Sample Variance?
The formulas for sample variance are given as follows:
- Ungrouped Data: \(s^{2} = \frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\)
- Grouped data: \(s^{2} = \frac{\sum_{i=1}^{n}f_{i}\left ( m_{i}-\overline{x} \right )^{2}}{N - 1}\)
How to Find the Sample Variance?
The steps to find the sample variance are as follows:
- Find the mean of the data.
- Subtract the mean from each data point.
- Take the summation of the squares of values obtained in the previous step.
- Divide this value by n - 1.
Can Sample Variance be Negative?
No, the sample variance can never be negative. It is a sum of squared deviations from the mean divided by n - 1. A squared value is never negative and n - 1 is positive, so the sample variance cannot be negative.
Is Sample Variance the Same as Standard Deviation?
No. The square root of the sample variance gives the sample standard deviation. The sample variance is expressed in the squared unit of the data, while the sample standard deviation has the same unit as the data.
What is the Difference Between Sample Variance and Population Variance?
The variance calculated from sample data is the sample variance, while the variance calculated from the entire population is the population variance. The formula for sample variance is \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n-1}\) and the formula for population variance is \(\frac{\sum_{i=1}^{n}(x_{i}-\mu)^{2}}{n}\).
What do Small and Big Variance Mean in Sample Variance Formula?
A small variance obtained using the sample variance formula indicates that the data points are close to the mean and to each other. A big variance indicates that the data values are spread out from the mean, and from one another.