Descriptive Statistics
Descriptive statistics is a branch of statistics that is concerned with describing the characteristics of the known data. Descriptive statistics provides summaries about either the population data or the sample data. Apart from descriptive statistics, inferential statistics is another crucial branch of statistics that is used to make inferences about the population data.
Descriptive statistics can be broadly classified into two categories - measures of central tendency and measures of dispersion. In this article, we will learn more about descriptive statistics, its various types, formulas, and see associated examples.
What are Descriptive Statistics?
Descriptive statistics are used to quantitatively or visually summarize the features of a sample. With suitable tools, data from a sample can be analyzed to identify the trends or patterns it follows. Descriptive statistics also help to organize the data in a more manageable and readable format.
Descriptive Statistics Definition
Descriptive statistics can be defined as a field of statistics that is used to summarize the characteristics of a sample by utilizing certain quantitative techniques. It helps to provide simple and precise summaries of the sample and the observations using measures like mean, median, variance, graphs, and charts. Univariate descriptive statistics are used to describe data containing only one variable. On the other hand, bivariate and multivariate descriptive statistics are used to describe data with multiple variables.
Types of Descriptive Statistics
Measures of central tendency and measures of dispersion are two types of descriptive statistics that are used to quantitatively summarize the characteristics of grouped and ungrouped data. When an experiment is conducted, the raw data obtained is known as ungrouped data. When this data is organized logically it is known as grouped data. To visually represent data, descriptive statistics use graphs, charts, and tables. Some important types of descriptive statistics are given below.
Measures of Central Tendency
In descriptive statistics, the measures of central tendency are used to describe data by determining a single representative central value. The important measures of central tendency are given below:
Mean: The mean can be defined as the sum of all observations divided by the total number of observations. The formulas for the mean are given as follows:
Ungrouped data Mean: x̄ = \(\frac{\sum x_{i}}{n}\)
Grouped data Mean: x̄ = \(\frac{\sum M_{i}f_{i}}{\sum f_{i}}\)
Here, \(x_{i}\) is the ith observation, \(M_{i}\) is the midpoint of the ith interval, \(f_{i}\) is the corresponding frequency and n is the sample size.
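The two mean formulas above can be sketched in Python as follows. This is only an illustration: the helper name `grouped_mean` and the grouped classes (0-10, 10-20, 20-30) are assumptions made for the example, not part of the article.

```python
from statistics import mean


def grouped_mean(midpoints, frequencies):
    """Grouped mean: sum(M_i * f_i) / sum(f_i)."""
    weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))
    return weighted_sum / sum(frequencies)


# Ungrouped data: sum of observations divided by n.
marks = [70, 85, 90, 65]
ungrouped_result = mean(marks)  # 310 / 4 = 77.5

# Grouped data (hypothetical classes 0-10, 10-20, 20-30).
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
grouped_result = grouped_mean(midpoints, frequencies)  # 160 / 10 = 16.0

print(ungrouped_result, grouped_result)
```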
Median: The median can be defined as the middle observation when the data are arranged in ascending order. The formulas for the median are given as follows:
Ungrouped data Median (n is odd): [(n + 1) / 2]th term
Ungrouped data Median (n is even): [(n / 2)th term + ((n / 2) + 1)th term] / 2
Grouped data Median: l + [((n / 2) - c) / f] × h
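Here, l is the lower limit of the median class (the class in which the cumulative frequency first reaches or exceeds n / 2), n is the total frequency, c is the cumulative frequency of the class preceding the median class, f is the frequency of the median class and h is the class width.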
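As a rough sketch of the grouped median formula, the function below walks through the classes until the cumulative frequency reaches n / 2 and then interpolates inside that class. The function name `grouped_median`, the equal-width assumption, and the sample classes are hypothetical choices for illustration.

```python
def grouped_median(lower_limits, frequencies, width):
    """Grouped median: l + ((n / 2 - c) / f) * h, assuming equal class widths."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    cumulative = 0  # c: cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:  # this class contains the (n / 2)th value
            return lower + ((n / 2 - cumulative) / f) * width
        cumulative += f
    raise ValueError("lower_limits and frequencies must have the same length")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
lower_limits = [0, 10, 20, 30]
frequencies = [5, 8, 12, 5]
median_estimate = grouped_median(lower_limits, frequencies, width=10)
print(median_estimate)  # 20 + ((15 - 13) / 12) * 10, about 21.67
```

Here n = 30, so n / 2 = 15; the cumulative frequencies are 5, 13 and 25, which makes 20-30 the median class with c = 13 and f = 12.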
Mode: The mode is the most frequently occurring observation in the data set. The formulas for the mode are given as follows:
Ungrouped data Mode: Most recurrent observation
Grouped data Mode: L + h \(\frac{\left(f_{m}-f_{1}\right)}{\left(f_{m}-f_{1}\right)+\left(f_{m}-f_{2}\right)}\)
L is the lower limit of the modal class, h is the class width, f\(_m\) is the frequency of the modal class, f\(_1\) is the frequency of the class preceding the modal class and f\(_2\) is the frequency of the class succeeding the modal class.
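The grouped mode formula can be sketched in the same way. The helper `grouped_mode` and the data are hypothetical; treating a missing neighbouring class as having frequency 0 is an assumption of this sketch, not something the article specifies.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Grouped mode: L + h * (f_m - f_1) / ((f_m - f_1) + (f_m - f_2))."""
    m = frequencies.index(max(frequencies))  # first class with the highest frequency
    f_m = frequencies[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f_1 = frequencies[m - 1] if m > 0 else 0
    f_2 = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    denominator = (f_m - f_1) + (f_m - f_2)
    if denominator == 0:
        raise ValueError("formula is undefined when f_m equals both neighbours")
    return lower_limits[m] + width * (f_m - f_1) / denominator


# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
lower_limits = [0, 10, 20, 30]
frequencies = [5, 8, 12, 5]
mode_estimate = grouped_mode(lower_limits, frequencies, width=10)
print(mode_estimate)  # 20 + 10 * 4 / (4 + 7), about 23.64
```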
Measures of Dispersion
In descriptive statistics, the measures of dispersion are used to determine how spread out a distribution is with respect to the central value. The important measures of dispersion are given below:
Range: The range can be defined as the difference between the highest value and the lowest value. The formula is given as follows:
Range = H - S
H is the highest value and S is the lowest value in a data set.
Variance: The variance gives the variability of the distribution with respect to the mean. The formulas for the variance are given as follows:
Grouped Data Sample Variance, s² = \(\frac{\sum f_{i}\left ( M_{i}-\overline{X} \right )^{2}}{N - 1}\)
Grouped Data Population Variance, σ² = \(\frac{\sum f_{i}\left ( M_{i}-\overline{X} \right )^{2}}{N}\)
Ungrouped Data Sample Variance, s² = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)
Ungrouped Data Population Variance, σ² = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n}\)
where \(\overline{X}\) stands for the mean, \(M_{i}\) is the midpoint of the ith interval, \(f_{i}\) is the frequency of the ith interval, \(X_{i}\) is the ith data point, N is the sum of all frequencies and n is the number of observations.
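The four variance formulas can be illustrated with Python's standard `statistics` module for ungrouped data and a small helper for grouped data. The helper `grouped_variance` and both data sets are assumptions made for this sketch.

```python
from statistics import pvariance, variance


def grouped_variance(midpoints, frequencies, sample=True):
    """Variance of grouped data from class midpoints M_i and frequencies f_i."""
    total = sum(frequencies)  # N, the sum of all frequencies
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / total
    squared_dev = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
    return squared_dev / (total - 1 if sample else total)


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical ungrouped observations
sample_var = variance(data)        # divides by n - 1: 32 / 7
population_var = pvariance(data)   # divides by n: 32 / 8 = 4

# Hypothetical grouped data with midpoints 5, 15, 25 and frequencies 2, 5, 3.
grouped_sample_var = grouped_variance([5, 15, 25], [2, 5, 3])            # 490 / 9
grouped_pop_var = grouped_variance([5, 15, 25], [2, 5, 3], sample=False)  # 490 / 10

print(sample_var, population_var, grouped_sample_var, grouped_pop_var)
```

The only difference between the sample and population versions is the divisor: n - 1 (or N - 1) for a sample, n (or N) for a population.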
Standard Deviation: The square root of the variance gives the standard deviation. Because it is expressed in the same units as the data, it is easier to interpret than the variance. The formula is given as follows:
Standard Deviation: S.D. = √Variance (denoted σ for a population and s for a sample)
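A short check of this relationship, using the standard `statistics` module on a hypothetical data set, might look like this:

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

sigma = pstdev(data)  # population standard deviation: sqrt(4) = 2.0
s = stdev(data)       # sample standard deviation: sqrt(32 / 7)

# Each standard deviation is the square root of the matching variance.
print(isclose(sigma, sqrt(pvariance(data))))
print(isclose(s, sqrt(variance(data))))
```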
Mean Deviation: The mean deviation gives the average of the absolute deviations of the data from a central value, which may be the mean, median, or mode. It is also known as the mean absolute deviation. The formula is given as follows:
Mean Deviation = \(\frac{\sum_{i=1}^{n}|X_{i} - \overline{X}|}{n}\)
where \(\overline{X}\) is the chosen central value, \(X_{i}\) is the ith data point and n is the number of observations.
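The mean deviation formula can be sketched with a small helper that accepts any central value. The function name `mean_deviation` and the data are hypothetical.

```python
from statistics import mean, median


def mean_deviation(data, center):
    """Average absolute deviation of the data from the given central value."""
    return sum(abs(x - center) for x in data) / len(data)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

about_mean = mean_deviation(data, mean(data))      # mean is 5: 12 / 8 = 1.5
about_median = mean_deviation(data, median(data))  # median is 4.5: 12 / 8 = 1.5

print(about_mean, about_median)
```

For this particular data set the two results coincide; in general the mean deviation depends on which central value is chosen.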
Quartile Deviation: Half of the difference between the third and first quartile gives the quartile deviation. The formula is given as follows:
Quartile deviation = \(\frac{Q_{3}-Q_{1}}{2}\)
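A minimal sketch of the quartile deviation using `statistics.quantiles` from the Python standard library is given below. Several conventions exist for locating quartiles, so the choice of `method="inclusive"` here is an assumption; other conventions can give slightly different values of Q1 and Q3.

```python
from statistics import quantiles

data = [9, 10, 12, 16, 17, 17, 18, 20]

# n=4 splits the data into quartiles and returns the three cut points.
# Assumption: the "inclusive" method (linear interpolation between data points).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

quartile_deviation = (q3 - q1) / 2
print(q1, q3, quartile_deviation)  # Q1 = 11.5, Q3 = 17.25 with this method
```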
Other measures of dispersion include the relative measures also known as the coefficients of dispersion.
Descriptive Statistics Representations
Descriptive statistics can also be used to summarize data visually before quantitative methods of analysis are applied to them. Some important forms of representations of descriptive statistics are as follows:
Frequency Distribution Tables: These can be either simple or grouped frequency distribution tables. They are used to show the distribution of values or classes along with the corresponding frequencies. Such tables are very useful in making charts as well as catching patterns in data.
Graphs and Charts: Graphs and charts help to represent data in a completely visual format. They can be used to display percentages, distributions, and frequencies. Scatter plots, bar graphs, pie charts, etc., are some graphs that are used in descriptive statistics.
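As an illustration of simple and grouped frequency distribution tables, the sketch below counts values with `collections.Counter`. The data set and the class width of 5 are assumptions chosen for the example; each class includes its lower limit and excludes its upper limit.

```python
from collections import Counter

data = [1, 4, 6, 1, 8, 15, 18, 1, 5, 1]

# Simple frequency distribution: one row per distinct value.
simple_table = Counter(data)

# Grouped frequency distribution: classes 0-5, 5-10, 10-15, ...
class_width = 5  # hypothetical class width
grouped_table = Counter((x // class_width) * class_width for x in data)

print("Value  Frequency")
for value in sorted(simple_table):
    print(f"{value:<6} {simple_table[value]}")

print("Class  Frequency")
for lower in sorted(grouped_table):
    print(f"{lower}-{lower + class_width:<4} {grouped_table[lower]}")
```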
Descriptive Statistics Examples
Descriptive statistics provide summary statistics for different data sets, thereby enabling comparison. The descriptive statistics examples are given as follows:
- Suppose the marks of students belonging to class A are {70, 85, 90, 65} and class B are {60, 40, 89, 96}. Then the average marks of each class are given by the means, 77.5 and 71.25 respectively. This shows that the average of class A is higher than that of class B.
- Using the same example, suppose we need to determine how far apart the most extreme marks are. The range is used for this: Range A = 90 - 65 = 25 and Range B = 96 - 40 = 56, showing that the range of class B is higher than the range of class A.
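The comparison of the two classes can be reproduced with a few lines of Python; the variable names are chosen for this sketch only.

```python
from statistics import mean

classes = {
    "A": [70, 85, 90, 65],
    "B": [60, 40, 89, 96],
}

# Mean and range (highest value minus lowest value) for each class.
summary = {
    name: {"mean": mean(marks), "range": max(marks) - min(marks)}
    for name, marks in classes.items()
}

for name, stats in summary.items():
    print(name, stats)
```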
Descriptive Statistics vs Inferential Statistics
Inferential and descriptive statistics are both used to analyze data. Descriptive statistics helps to describe the data quantitatively while inferential statistics uses these parameters to make inferences about the population. The differences between descriptive statistics and inferential statistics are given below.
Descriptive Statistics | Inferential Statistics |
---|---|
It is used to describe the characteristics of either the sample or the population by using quantitative tools. | It is used to draw inferences about the population data from the sample data by making use of analytical tools. |
Measures of central tendency and measures of dispersion are the most important types of descriptive statistics. | Hypothesis testing and regression analysis are the types of inferential statistics. |
It is used to describe the characteristics of a known dataset. | It tries to make inferences about the population that go beyond the known data. |
Measures of descriptive statistics are mean, median, variance, range, etc. | Measures of inferential statistics are z test, f test, linear regression, ANOVA test, etc. |
Important Notes on Descriptive Statistics
- Descriptive statistics are used to describe the features of a sample or population using quantitative analysis methods.
- Descriptive statistics can be classified into measures of central tendency and measures of dispersion.
- Mean, mode, standard deviation, etc., are some measures of descriptive statistics.
- Data of descriptive statistics can be visually represented using tables, charts, and graphs.
Examples on Descriptive Statistics
Example 1: Using descriptive statistics, find the mean and mode of the given data.
{1, 4, 6, 1, 8, 15, 18, 1, 5, 1}
Solution: Total number of observations = 10
Sum of observations = 1 + 4 + 6 + 1 + 8 + 15 + 18 + 1 + 5 + 1 = 60
Mean = 60 / 10 = 6
Mode = Most frequently occurring observation = 1
Answer: Mean = 6, Mode = 1
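The result of Example 1 can be checked with the standard `statistics` module:

```python
from statistics import mean, mode

data = [1, 4, 6, 1, 8, 15, 18, 1, 5, 1]

example1_mean = mean(data)  # 60 / 10 = 6
example1_mode = mode(data)  # 1 occurs four times

print(example1_mean, example1_mode)
```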
Example 2: Find the sample variance of the following data
{7, 11, 15, 18, 36, 43}
Solution: Sample variance formula, s² = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)
Mean, \(\overline{X}\) = 130 / 6 ≈ 21.67, n = 6
s² = [(7 - 21.67)² + (11 - 21.67)² + (15 - 21.67)² + (18 - 21.67)² + (36 - 21.67)² + (43 - 21.67)²] / (6 - 1)
= 1047.33 / 5 ≈ 209.47
Answer: s² ≈ 209.47
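The value in Example 2 can be verified with `statistics.variance`, which divides by n - 1:

```python
from statistics import variance

data = [7, 11, 15, 18, 36, 43]

s_squared = variance(data)  # sample variance, divisor n - 1 = 5
print(round(s_squared, 2))  # 209.47
```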
Example 3: Find the median and the mean deviation about the median for the given data
{9, 10, 12, 16, 17, 17, 18, 20}
Solution: n = 8
Median = [(n / 2)th term + ((n / 2) + 1)th term] / 2
= [(8 / 2)th term + ((8 / 2) + 1)th term] / 2
= (4th term + 5th term) / 2
= (16 + 17) / 2 = 16.5
Mean deviation about median = \(\frac{\sum_{i=1}^{n}|X_{i} - 16.5|}{n}\)
= [|9 - 16.5| + |10 - 16.5| + |12 - 16.5| + |16 - 16.5| + |17 - 16.5| + |17 - 16.5| + |18 - 16.5| + |20 - 16.5|] / 8
= 25 / 8 = 3.125
Answer: Median = 16.5, Mean deviation about median = 3.125
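Example 3 can be checked in the same way; the mean deviation is computed directly from its definition:

```python
from statistics import median

data = [9, 10, 12, 16, 17, 17, 18, 20]

med = median(data)  # average of the 4th and 5th terms: (16 + 17) / 2 = 16.5
mean_dev_about_median = sum(abs(x - med) for x in data) / len(data)  # 25 / 8

print(med, mean_dev_about_median)
```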
FAQs on Descriptive Statistics
What is the Meaning of Descriptive Statistics?
Descriptive statistics is a branch of statistics that focuses on describing the characteristics of a sample or a population by using various quantitative methods.
What are the Types of Descriptive Statistics?
Measures of central tendency and measures of dispersion are the two types of descriptive statistics. Apart from these, graphs, charts, and tables can also be used for a visual representation of data.
What are the Measures of Central Tendency in Descriptive Statistics?
There are three measures of central tendency in descriptive statistics. These are the mean, median, and mode.
What are the Important Formulas in Descriptive Statistics?
The important formulas in descriptive statistics are as follows:
- Mean = sum of all observations / number of observations.
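- Median = [(n + 1) / 2]th term of the sorted data when n is odd; when n is even, the average of the (n / 2)th and ((n / 2) + 1)th terms
- Sample Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)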
- Standard Deviation = √Variance
What are the Measures of Dispersion in Descriptive Statistics?
In descriptive statistics, the measures of dispersion are variance, standard deviation, range, mean deviation, quartile deviation, and coefficients of dispersion.
How to Represent Data in Descriptive Statistics?
A common way to organize data for visual representation in descriptive statistics is by using frequency distribution tables. These tables can also be used to create various graphs and charts that help in further analysis of data.
What is the Difference Between Descriptive Statistics and Inferential Statistics?
Descriptive statistics uses quantitative tools like mean, variance, range, etc., to describe the features of data. Inferential statistics uses analytical tools such as z test, t test, linear regression, etc., to make generalizations about the population from the sample data.