ANOVA Test
The ANOVA test is used to analyze the differences among the means of several groups by comparing the variation between the groups with the variation within them. ANOVA stands for analysis of variance. It is a statistical significance test used to decide whether the null hypothesis can be rejected during hypothesis testing.
An ANOVA test can be either one-way or two-way depending upon the number of independent variables. In this article, we will learn more about an ANOVA test, the one-way ANOVA and two-way ANOVA, its formulas and see certain associated examples.
1. What is ANOVA Test?
2. ANOVA Formula
3. One Way ANOVA
4. Two Way ANOVA
5. FAQs on ANOVA Test
What is ANOVA Test?
The ANOVA test, in its simplest form, is used to check whether the means of three or more populations are equal or not. It applies when there are more than two independent groups. The test compares the variability among the groups with the variability within the groups. The ANOVA test statistic is the F statistic, so the test is carried out as an F test.
ANOVA Test Definition
The ANOVA test can be defined as a type of test used in hypothesis testing to compare whether the means of three or more groups are equal or not. It is used to check whether the null hypothesis can be rejected at a chosen level of significance. The decision is made by comparing the ANOVA test statistic with the critical value.
ANOVA Test Example
Suppose it needs to be determined whether the type of tea consumed affects mean weight loss. Let there be three groups using three types of tea - green tea, earl grey tea, and jasmine tea. To check whether the mean weight loss differs among the three groups, the ANOVA test (one way) will be used.
Suppose a survey was conducted to check if there is an interaction between income and gender with anxiety level at job interviews. To conduct such a test a two-way ANOVA will be used.
ANOVA Formula
There are several components to the ANOVA formula. The best way to solve a problem on an ANOVA test is by organizing the formulas into an ANOVA table. The ANOVA formulas are given below.
Sum of squares between groups, SSB = \(\sum n_{j}(\overline{X}_{j}-\overline{X})^{2}\). Here, \(\overline{X}_{j}\) is the mean of the jth group, \(\overline{X}\) is the overall mean and \(n_{j}\) is the sample size of the jth group.
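Overall mean, \(\overline{X}\) = \(\frac{\overline{X}_{1} + \overline{X}_{2} + \overline{X}_{3} + ... + \overline{X}_{k}}{k}\) when all k groups have the same sample size. In general, \(\overline{X}\) is the sum of all observations divided by N, the total number of observations.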
Sum of squares of errors, SSE = \(\sum\sum(X-\overline{X}_{j})^{2}\). Here, X refers to each data point in the jth group.
Total sum of squares, SST = SSB + SSE
Degrees of freedom between groups, df1 = k - 1. Here, k denotes the number of groups.
Degrees of freedom of errors, df2 = N - k, where N denotes the total number of observations across k groups.
Total degrees of freedom, df3 = N - 1.
Mean squares between groups, MSB = SSB / (k - 1)
Mean squares of errors, MSE = SSE / (N - k)
ANOVA test statistic, f = MSB / MSE
Critical Value at \(\alpha\) = F(\(\alpha\), k - 1, N - k)
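As a rough illustration, the formulas above can be collected into a short Python sketch. The function name `anova_components` and the dictionary keys are hypothetical, chosen only for this illustration. The data are the fertilizer measurements from Example 1 of this article. The overall mean is computed from all observations, which equals the average of the group means whenever the groups are the same size.

```python
def anova_components(groups):
    """Return the one-way ANOVA quantities for a list of groups (lists of numbers)."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # N, total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    group_means = [sum(g) / len(g) for g in groups]

    # SSB: weighted squared distance of each group mean from the overall mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSE: squared distance of each observation from its own group mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df1, df2 = k - 1, n_total - k
    msb, mse = ssb / df1, sse / df2
    return {"SSB": ssb, "SSE": sse, "SST": ssb + sse,
            "df1": df1, "df2": df2, "MSB": msb, "MSE": mse, "f": msb / mse}


# Fertilizer data from Example 1 of this article (one inner list per fertilizer)
fertilizer = [
    [6, 8, 4, 5, 3, 4],
    [8, 12, 9, 11, 6, 8],
    [13, 9, 11, 8, 7, 12],
]
result = anova_components(fertilizer)
print(result)
```

The critical value is not computed here; it would still be read from an F table using df1 and df2.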
ANOVA Table
The ANOVA formulas can be arranged systematically in the form of a table. This ANOVA table can be summarized as follows:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Value |
---|---|---|---|---|
Between Groups | SSB = \(\sum n_{j}(\overline{X}_{j}-\overline{X})^{2}\) | df1 = k - 1 | MSB = SSB / (k - 1) | f = MSB / MSE |
Error | SSE = \(\sum\sum(X-\overline{X}_{j})^{2}\) | df2 = N - k | MSE = SSE / (N - k) | |
Total | SST = SSB + SSE | df3 = N - 1 | | |
One Way ANOVA
The one way ANOVA test is used to determine whether there is any difference between the means of three or more groups. A one way ANOVA will have only one independent variable. The hypothesis for a one way ANOVA test can be set up as follows:
Null Hypothesis, \(H_{0}\): \(\mu_{1}\) = \(\mu_{2}\) = \(\mu_{3}\) = ... = \(\mu_{k}\)
Alternative Hypothesis, \(H_{1}\): The means are not equal
Decision Rule: If test statistic > critical value then reject the null hypothesis and conclude that the means of at least two groups differ significantly.
The steps to perform the one way ANOVA test are given below:
- Step 1: Calculate the mean for each group.
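- Step 2: Calculate the total mean. When all groups have the same sample size, this is done by adding all the group means and dividing by the number of groups. Otherwise, add all the observations and divide by the total number of observations.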
- Step 3: Calculate the SSB.
- Step 4: Calculate the between groups degrees of freedom.
- Step 5: Calculate the SSE.
- Step 6: Calculate the degrees of freedom of errors.
- Step 7: Determine the MSB and the MSE.
- Step 8: Find the f test statistic.
- Step 9: Using the f table for the specified level of significance, \(\alpha\), find the critical value. This is given by F(\(\alpha\), df1, df2).
- Step 10: If f > F then reject the null hypothesis.
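The ten steps above can be sketched in Python as follows. This is only an illustrative sketch: the function name `one_way_anova` is hypothetical, the critical value is passed in as read from an F table rather than computed, and Step 2 averages the group means, so the sketch assumes equally sized groups. The data are the calcium intake values from Example 3 of this article.

```python
def one_way_anova(groups, critical_value):
    """Follow Steps 1-10 for equally sized groups; critical_value comes from an F table."""
    sizes = [len(g) for g in groups]
    if len(set(sizes)) != 1:
        raise ValueError("this sketch assumes equal group sizes (see Step 2)")
    k, n_total = len(groups), sum(sizes)

    means = [sum(g) / len(g) for g in groups]              # Step 1
    total_mean = sum(means) / k                            # Step 2
    ssb = sum(n * (m - total_mean) ** 2
              for n, m in zip(sizes, means))               # Step 3
    df1 = k - 1                                            # Step 4
    sse = sum((x - m) ** 2
              for g, m in zip(groups, means) for x in g)   # Step 5
    df2 = n_total - k                                      # Step 6
    msb, mse = ssb / df1, sse / df2                        # Step 7
    f = msb / mse                                          # Step 8
    reject = f > critical_value                            # Steps 9-10
    return f, reject


# Calcium intake data from Example 3; 3.68 is F(0.05, 2, 15) as read from an F table
calcium = [
    [1200, 1000, 980, 900, 750, 800],   # normal density
    [1000, 1100, 700, 800, 500, 700],   # osteopenia
    [890, 650, 1100, 900, 400, 350],    # osteoporosis
]
f_stat, reject_null = one_way_anova(calcium, critical_value=3.68)
print(round(f_stat, 3), reject_null)
```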
Limitations of One Way ANOVA Test
The one way ANOVA is an omnibus test. This implies that the test determines whether there is a statistically significant difference among the group means. However, it cannot identify which specific groups differ. Thus, to find the groups whose means differ, a post hoc test needs to be conducted.
Two Way ANOVA
The two way ANOVA has two independent variables. It can be thought of as an extension of the one way ANOVA, in which only one independent variable affects the dependent variable. A two way ANOVA test is used to check the main effect of each independent variable and to see if there is an interaction effect between them. To examine a main effect, each factor is considered separately, as in a one way ANOVA. To check the interaction effect, both factors are considered at the same time. The assumptions made for a two way ANOVA test are as follows:
- The samples drawn from the population must be independent.
- The population should be approximately normally distributed.
- The groups should have the same sample size.
- The population variances are equal.
Suppose in the two way ANOVA example, as mentioned above, the income groups are low, middle, high. The gender groups are female, male, and transgender. Then there will be 9 treatment groups and the three hypotheses can be set up as follows:
\(H_{01}\): All income groups have equal mean anxiety.
\(H_{11}\): Not all income groups have equal mean anxiety.
\(H_{02}\): All gender groups have equal mean anxiety.
\(H_{12}\): Not all gender groups have equal mean anxiety.
\(H_{03}\): Interaction effect does not exist.
\(H_{13}\): Interaction effect exists.
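To make the count of 9 treatment groups concrete, the short Python sketch below enumerates every income and gender combination. It also lists the degrees of freedom that go with each of the three hypotheses. These degrees of freedom are the standard ones for a two way ANOVA with interaction and are not stated elsewhere in this article, so treat them as supplementary. The variable names are chosen only for this illustration.

```python
from itertools import product

income_levels = ["low", "middle", "high"]
gender_levels = ["female", "male", "transgender"]

# Every combination of one income level and one gender level is a treatment group
treatment_groups = list(product(income_levels, gender_levels))

a, b = len(income_levels), len(gender_levels)
df_income = a - 1                    # main effect of income (H01 vs H11)
df_gender = b - 1                    # main effect of gender (H02 vs H12)
df_interaction = (a - 1) * (b - 1)   # income x gender interaction (H03 vs H13)

print(len(treatment_groups))
print(df_income, df_gender, df_interaction)
```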
Important Notes on ANOVA Test
- ANOVA test is used to check whether the means of three or more groups are different or not by using estimation parameters such as the variance.
- An ANOVA table is used to summarize the results of an ANOVA test.
- There are two types of ANOVA tests - one way ANOVA and two way ANOVA.
- One way ANOVA has only one independent variable while a two way ANOVA has two independent variables.
Examples on ANOVA Test
Example 1: Three types of fertilizers are used on three groups of plants for 5 weeks. We want to check if there is a difference in the mean growth of each group. Using the data given below, apply a one way ANOVA test at the 0.05 significance level.
Fertilizer 1 | Fertilizer 2 | Fertilizer 3 |
---|---|---|
6 | 8 | 13 |
8 | 12 | 9 |
4 | 9 | 11 |
5 | 11 | 8 |
3 | 6 | 7 |
4 | 8 | 12 |
Solution:
\(H_{0}\): \(\mu_{1}\) = \(\mu_{2}\) = \(\mu_{3}\)
\(H_{1}\): The means are not equal
Fertilizer 1 | Fertilizer 2 | Fertilizer 3 |
---|---|---|
6 | 8 | 13 |
8 | 12 | 9 |
4 | 9 | 11 |
5 | 11 | 8 |
3 | 6 | 7 |
4 | 8 | 12 |
\(\overline{X}_{1}\) = 5 | \(\overline{X}_{2}\) = 9 | \(\overline{X}_{3}\) = 10 |
Total mean, \(\overline{X}\) = 8
\(n_{1}\) = \(n_{2}\) = \(n_{3}\) = 6, k = 3
SSB = \(6(5 - 8)^{2} + 6(9 - 8)^{2} + 6(10 - 8)^{2}\)
= 84
df1 = k - 1 = 2
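Fertilizer 1 | \((X - 5)^{2}\) | Fertilizer 2 | \((X - 9)^{2}\) | Fertilizer 3 | \((X - 10)^{2}\) |
---|---|---|---|---|---|
6 | 1 | 8 | 1 | 13 | 9 |
8 | 9 | 12 | 9 | 9 | 1 |
4 | 1 | 9 | 0 | 11 | 1 |
5 | 0 | 11 | 4 | 8 | 4 |
3 | 4 | 6 | 9 | 7 | 9 |
4 | 1 | 8 | 1 | 12 | 4 |
\(\overline{X}_{1}\) = 5 | Total = 16 | \(\overline{X}_{2}\) = 9 | Total = 24 | \(\overline{X}_{3}\) = 10 | Total = 28 |
SSE = 16 + 24 + 28 = 68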
N = 18
df2 = N - k = 18 - 3 = 15
MSB = SSB / df1 = 84 / 2 = 42
MSE = SSE / df2 = 68 / 15 = 4.53
ANOVA test statistic, f = MSB / MSE = 42 / 4.533 ≈ 9.26
Using the f table at \(\alpha\) = 0.05 the critical value is given as F(0.05, 2, 15) = 3.68
As f > F, the null hypothesis is rejected and it can be concluded that there is a difference in the mean growth of the plants.
Answer: Reject the null hypothesis
Example 2: A trial was run to check the effects of different diets. Positive numbers indicate weight loss and negative numbers indicate weight gain. Check if there is a difference in the mean weight loss of people following different diets using an ANOVA table.
Low Fat | Low Calorie | Low Protein | Low Carbohydrate |
---|---|---|---|
8 | 2 | 3 | 2 |
9 | 4 | 5 | 2 |
6 | 3 | 4 | -1 |
7 | 5 | 2 | 0 |
3 | 1 | 3 | 3 |
Solution:
\(H_{0}\): \(\mu_{1}\) = \(\mu_{2}\) = \(\mu_{3}\) = \(\mu_{4}\)
\(H_{1}\): The means are not equal
Low Fat | \((X - 6.6)^{2}\) | Low Calorie | \((X - 3)^{2}\) | Low Protein | \((X - 3.4)^{2}\) | Low Carbohydrate | \((X - 1.2)^{2}\) |
---|---|---|---|---|---|---|---|
8 | 1.96 | 2 | 1 | 3 | 0.16 | 2 | 0.64 |
9 | 5.76 | 4 | 1 | 5 | 2.56 | 2 | 0.64 |
6 | 0.36 | 3 | 0 | 4 | 0.36 | -1 | 4.84 |
7 | 0.16 | 5 | 4 | 2 | 1.96 | 0 | 1.44 |
3 | 12.96 | 1 | 4 | 3 | 0.16 | 3 | 3.24 |
\(\overline{X}_{1}\) = 6.6 | Total = 21.2 | \(\overline{X}_{2}\) = 3 | Total = 10 | \(\overline{X}_{3}\) = 3.4 | Total = 5.2 | \(\overline{X}_{4}\) = 1.2 | Total = 10.8 |
Total mean, \(\overline{X}\) = (6.6 + 3 + 3.4 + 1.2) / 4 = 3.55
\(n_{1}\) = \(n_{2}\) = \(n_{3}\) = \(n_{4}\) = 5, k = 4
SSB = \(n_{1}(\overline{X}_{1}-\overline{X})^{2}\) + \(n_{2}(\overline{X}_{2}-\overline{X})^{2}\) + \(n_{3}(\overline{X}_{3}-\overline{X})^{2}\) + \(n_{4}(\overline{X}_{4}-\overline{X})^{2}\)
= \(5(6.6-3.55)^{2} + 5(3-3.55)^{2} + 5(3.4-3.55)^{2} + 5(1.2-3.55)^{2}\)
= 75.75
SSE = 21.2 + 10 + 5.2 + 10.8 = 47.2
The ANOVA Table can be constructed as follows:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Value |
---|---|---|---|---|
Between Groups | SSB = \(\sum n_{j}(\overline{X}_{j}-3.55)^{2}\) = 75.75 | df1 = k - 1 = 4 - 1 = 3 | MSB = SSB / (k - 1) = 25.25 | f = MSB / MSE = 8.56 |
Error | SSE = \(\sum\sum(X-\overline{X}_{j})^{2}\) = 47.2 | df2 = N - k = 20 - 4 = 16 | MSE = SSE / (N - k) = 2.95 | |
Total | SST = SSB + SSE = 122.95 | df3 = N - 1 = 19 | | |
As no significance level is specified, \(\alpha\) = 0.05 is chosen.
F(0.05, 3, 16) = 3.24
As 8.56 > 3.24, the null hypothesis is rejected and it can be concluded that the mean weight loss differs among the diets.
Answer: Reject the null hypothesis
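As a cross-check on Example 2, the following Python sketch recomputes the ANOVA table directly from the raw diet data without rounding any intermediate value. Hand calculations that round the means or the squared deviations along the way can differ from it slightly in the last digits. The variable names and the column layout of the printed table are choices made for this illustration only.

```python
diets = {
    "Low Fat": [8, 9, 6, 7, 3],
    "Low Calorie": [2, 4, 3, 5, 1],
    "Low Protein": [3, 5, 4, 2, 3],
    "Low Carbohydrate": [2, 2, -1, 0, 3],
}

values = [x for group in diets.values() for x in group]
n_total, k = len(values), len(diets)
grand_mean = sum(values) / n_total

# Between-group and within-group (error) sums of squares
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in diets.values())
sse = sum((x - sum(g) / len(g)) ** 2 for g in diets.values() for x in g)
msb, mse = ssb / (k - 1), sse / (n_total - k)
f_stat = msb / mse

# One tuple per row of the ANOVA table; empty strings mark cells left blank
rows = [
    ("Between Groups", ssb, k - 1, f"{msb:.2f}", f"{f_stat:.2f}"),
    ("Error", sse, n_total - k, f"{mse:.2f}", ""),
    ("Total", ssb + sse, n_total - 1, "", ""),
]
print(f"{'Source':<15}{'SS':>9}{'df':>5}{'MS':>8}{'F':>7}")
for source, ss, df, ms, f in rows:
    print(f"{source:<15}{ss:>9.2f}{df:>5}{ms:>8}{f:>7}")
```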
Example 3: Determine if there is a difference in the mean daily calcium intake for people with normal bone density, osteopenia, and osteoporosis at a 0.05 alpha level. The data was recorded as follows:
Normal Density | Osteopenia | Osteoporosis |
---|---|---|
1200 | 1000 | 890 |
1000 | 1100 | 650 |
980 | 700 | 1100 |
900 | 800 | 900 |
750 | 500 | 400 |
800 | 700 | 350 |
Solution:
Using the ANOVA test the hypothesis is set up as follows:
\(H_{0}\): \(\mu_{1}\) = \(\mu_{2}\) = \(\mu_{3}\)
\(H_{1}\): The means are not equal
Normal Density | \((X - 938.3)^{2}\) | Osteopenia | \((X - 800)^{2}\) | Osteoporosis | \((X - 715)^{2}\) |
---|---|---|---|---|---|
1200 | 68,486.9 | 1000 | 40,000 | 890 | 30,625 |
1000 | 3,806.9 | 1100 | 90,000 | 650 | 4,225 |
980 | 1,738.9 | 700 | 10,000 | 1100 | 148,225 |
900 | 1,466.9 | 800 | 0 | 900 | 34,225 |
750 | 35,456.9 | 500 | 90,000 | 400 | 99,225 |
800 | 19,126.9 | 700 | 10,000 | 350 | 133,225 |
\(\overline{X}_{1}\) = 938.3 | Total = 130,083.3 | \(\overline{X}_{2}\) = 800 | Total = 240,000 | \(\overline{X}_{3}\) = 715 | Total = 449,750 |
Total mean, \(\overline{X}\) = 817.8
\(n_{1}\) = \(n_{2}\) = \(n_{3}\) = 6, k = 3
SSB = \(n_{1}(\overline{X}_{1}-\overline{X})^{2}\) + \(n_{2}(\overline{X}_{2}-\overline{X})^{2}\) + \(n_{3}(\overline{X}_{3}-\overline{X})^{2}\)
= 152,477.7
SSE = 130,083.3 + 240,000 + 449,750 = 819,833.3
The ANOVA Table can be constructed as follows:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F Value |
---|---|---|---|---|
Between Groups | SSB = \(\sum n_{j}(\overline{X}_{j}-817.8)^{2}\) = 152,477.7 | df1 = k - 1 = 3 - 1 = 2 | MSB = SSB / (k - 1) = 76,238.9 | f = MSB / MSE = 1.395 |
Error | SSE = \(\sum\sum(X-\overline{X}_{j})^{2}\) = 819,833.3 | df2 = N - k = 18 - 3 = 15 | MSE = SSE / (N - k) = 54,655.6 | |
Total | SST = SSB + SSE = 972,311 | df3 = N - 1 = 17 | | |
Using the F table the critical value is F(0.05, 2, 15) = 3.68
As 1.395 < 3.68, the null hypothesis cannot be rejected and it is concluded that there is not enough evidence to say that the mean daily calcium intake of the three groups is different.
Answer: Do not reject the null hypothesis
FAQs on ANOVA Test
What is an ANOVA Test in Statistics?
ANOVA test in statistics refers to a hypothesis test that analyzes the variances of three or more populations to determine if the means are different or not.
How to Set Up the Hypothesis for an ANOVA Test?
In an ANOVA test the equality of the means of different groups has to be examined. Thus, the hypothesis is set up as follows:
Null Hypothesis, \(H_{0}\): \(\mu_{1}\) = \(\mu_{2}\) = \(\mu_{3}\) = ... = \(\mu_{k}\)
Alternative Hypothesis, \(H_{1}\): The means are not equal
What is the Formula for the ANOVA Test Statistic?
The ANOVA test uses the F statistic. The formula for the test statistic is given as F = mean squares between groups (MSB) / mean squares of errors (MSE).
What is an ANOVA Table?
An ANOVA table is a table that is used to summarize the findings of an ANOVA test. There are 5 columns that consist of the source of variation, the sum of squares, degrees of freedom, mean squares, and the f statistic respectively.
How to Perform an ANOVA Test?
The steps to perform an ANOVA test are as follows:
- Set up the hypothesis.
- Find the means of each group and then determine the overall mean.
- Find the SSB and the corresponding degrees of freedom.
- Determine the SSE and the degrees of freedom.
- Find the MSB and the MSE.
- Divide the MSB by the MSE to find the test statistic.
- Compare the test statistic with the critical value to determine statistical significance.
What is a One Way ANOVA?
One way ANOVA is a type of ANOVA test that is conducted when there is only one independent variable. It is used to compare the means of the various test groups. Such a test can only indicate whether the group means differ significantly. It cannot determine which groups have the differing means.
What is a Two Way ANOVA?
A two way ANOVA is an extension of a one way ANOVA and is conducted when there are two independent variables. It is used to find the main effect as well as the interaction effect of the different factors.