Non-Parametric Test
A non-parametric test is a statistical analysis method that does not assume that the population data belongs to a prescribed distribution determined by a fixed set of parameters. For this reason, a non-parametric test is also known as a distribution-free test. Such tests make few or no assumptions about the form or the parameters of the underlying distribution.
A non-parametric test acts as an alternative to a parametric test when the parametric form of the model cannot be fixed in advance. Non-parametric tests are usually used when the assumptions of parametric tests are violated. In this article, we will learn more about a non-parametric test, the types, examples, advantages, and disadvantages.
What is Non-Parametric Test in Statistics?
A non-parametric test in statistics does not assume that the data has been drawn from a normal distribution or from any other particular parametrized family. A normal distribution belongs to a parametrized family of probability distributions and is described by parameters such as the mean and the variance (or standard deviation). Thus, a non-parametric test does not make assumptions about the probability distribution's parameters.
Non-Parametric Test Definition
A non-parametric test can be defined as a test that is used in statistical analysis when the data under consideration does not belong to a parametrized family of distributions. When the data does not meet the requirements to perform a parametric test, a non-parametric test is used to analyze it.
Reasons to Use Non-Parametric Tests
It is important to assess when to apply parametric and non-parametric tests in order to arrive at the correct statistical inference. The reasons to use a non-parametric test are given below:
- When the distribution is skewed, a non-parametric test is used. For skewed distributions, the mean is not the best measure of central tendency, so parametric tests that are based on the mean can be misleading.
- If the sample size is too small then validating the distribution of the data becomes difficult. Thus, in such cases, a non-parametric test is used to analyze the data.
- If the data is nominal or ordinal, a non-parametric test is used. This is because the common parametric tests assume interval or ratio data.
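The first reason above can be illustrated with a small sketch. The values below are hypothetical and chosen only to show how a single extreme observation pulls the mean away from the bulk of the data while the median barely moves.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: six similar values and one extreme value.
values = [21, 23, 24, 25, 26, 27, 250]

sample_mean = mean(values)      # pulled upward by the extreme value 250
sample_median = median(values)  # middle value of the sorted sample

print("mean:", round(sample_mean, 2))
print("median:", sample_median)
```

Here the median stays at the centre of the six similar values, which is why rank-based and median-based tests are preferred for skewed data.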
Types of Non-Parametric Tests
There are many different methods available to perform a non-parametric test. Parametric tests, such as ANOVA and t-tests, assume that the data follows a normal distribution, whereas the tests listed below do not. These tests can also be used in hypothesis testing. Some common non-parametric tests are given as follows:
Mann-Whitney U Test
This non-parametric test is analogous to the t-test for independent samples. To conduct such a test, the data must be at least ordinal. It is also known as the Wilcoxon rank sum test.
Null Hypothesis: \(H_{0}\): The two populations under consideration are equal.
Test Statistic: U is the smaller of
\(U_{1} = n_{1}n_{2}+\frac{n_{1}(n_{1}+1)}{2}-R_{1}\) or \(U_{2} = n_{1}n_{2}+\frac{n_{2}(n_{2}+1)}{2}-R_{2}\)
where, \(R_{1}\) is the sum of ranks in group 1 and \(R_{2}\) is the sum of ranks in group 2.
Decision Criteria: Reject the null hypothesis if U < critical value.
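The formulas above can be sketched in plain Python. This is a minimal illustration rather than a library implementation: the function names `average_ranks` and `mann_whitney_u` are assumptions introduced here, tied values are given the average of the ranks they span, and the input data are the sleepwalking counts from Example 3 of this article.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, U1, U2, R1, R2) using the rank-sum formulas from the text."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # sum of ranks in group 1
    r2 = sum(ranks[n1:])  # sum of ranks in group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2, r1, r2


old_drug = [3, 4, 2, 1, 1]
new_drug = [7, 8, 4, 9, 8]
u, u1, u2, r1, r2 = mann_whitney_u(old_drug, new_drug)
print(u, u1, u2, r1, r2)
```

The value of U is then compared with the critical value from a Mann-Whitney table for the given sample sizes.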
Wilcoxon Signed Rank Test
This is the non-parametric test whose counterpart is the parametric paired t-test. It is used to compare two samples that contain ordinal data and are dependent. The Wilcoxon signed rank test assumes that the data comes from a symmetric distribution.
Null Hypothesis: \(H_{0}\): The difference in the median is 0.
Test Statistic: W. W is defined as the smaller of the sums of the negative and positive ranks.
Decision Criteria: Reject the null hypothesis if W < critical value.
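A minimal sketch of the W statistic in plain Python follows. The function names are assumptions introduced here, pairs with a zero difference are dropped before ranking (a common convention that this article does not state), and the data are the quiz scores from Example 1 of this article.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(first, second):
    """Return (W, sum of positive ranks, sum of negative ranks)."""
    # Differences are second - first; zero differences are dropped (assumption).
    diffs = [s - f for f, s in zip(first, second) if s != f]
    ranks = average_ranks([abs(d) for d in diffs])  # rank the absolute differences
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_pos, w_neg), w_pos, w_neg


test_1 = [8, 6, 4, 2, 5, 6]
test_2 = [6, 8, 8, 9, 4, 10]
w, w_pos, w_neg = wilcoxon_signed_rank(test_1, test_2)
print(w, w_pos, w_neg)
```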
Sign Test
This test is a non-parametric counterpart to the paired samples t-test. The sign test is similar to the Wilcoxon signed rank test, but it uses only the signs of the differences and not their ranks.
Null Hypothesis: \(H_{0}\): The difference in the median is 0.
Test Statistic: The smaller value among the number of positive and negative signs.
Decision Criteria: Reject the null hypothesis if the test statistic < critical value.
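The sign test statistic can be sketched as below. The function name `sign_test` is an assumption introduced here, zero differences are dropped, and the data are the quiz scores from Example 1 of this article. As an addition that is not part of the article's procedure, the sketch also returns an exact one-tailed probability, using the fact that under the null hypothesis the number of signs of one kind follows a binomial distribution with p = 0.5.

```python
from math import comb


def sign_test(first, second):
    """Return (statistic, n_plus, n_minus, one_tailed_p) for paired data."""
    # Differences are second - first; zero differences are dropped (assumption).
    diffs = [s - f for f, s in zip(first, second) if s != f]
    n_plus = sum(1 for d in diffs if d > 0)
    n_minus = sum(1 for d in diffs if d < 0)
    n = n_plus + n_minus
    statistic = min(n_plus, n_minus)
    # P(X <= statistic) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(statistic + 1)) / 2 ** n
    return statistic, n_plus, n_minus, p_value


test_1 = [8, 6, 4, 2, 5, 6]
test_2 = [6, 8, 8, 9, 4, 10]
stat, n_plus, n_minus, p_value = sign_test(test_1, test_2)
print(stat, n_plus, n_minus, round(p_value, 4))
```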
Kruskal Wallis Test
The Kruskal Wallis test is the non-parametric analogue of the parametric one-way ANOVA test. It is used for comparing more than two independent groups of data that are at least ordinal.
Null Hypothesis: \(H_{0}\): The m population medians are equal
Test Statistic: H = \(\left ( \frac{12}{N(N+1)}\sum_{j=1}^{m} \frac{R_{j}^{2}}{n_{j}}\right ) - 3(N+1)\)
where, N = total sample size, \(n_{j}\) and \(R_{j}\) are the sample size and the sum of ranks of the jth group
Decision Criteria: Reject the null hypothesis if H > critical value
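The H statistic can be computed with a short plain-Python sketch. The function names are assumptions introduced here, the formula is the one given above without any correction for ties, and the data are the platelet counts from the worked example in this article.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Return (H, list of rank sums R_j) for two or more independent groups."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)  # ranks are assigned over the pooled data
    n_total = len(pooled)
    rank_sums, total, start = [], 0.0, 0
    for g in groups:
        r_j = sum(ranks[start:start + len(g)])
        rank_sums.append(r_j)
        total += r_j ** 2 / len(g)  # R_j^2 / n_j
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, rank_sums


drug_1 = [42000, 48000, 57000, 69000, 45000]
drug_2 = [67000, 57000, 79000]
drug_3 = [78000, 89000, 67000, 80000]
h, rank_sums = kruskal_wallis_h(drug_1, drug_2, drug_3)
print(rank_sums, round(h, 3))
```

The resulting H is compared with the critical value from a Kruskal Wallis table, or with a chi-square value with m - 1 degrees of freedom when the groups are large.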
Non-Parametric Test Example
The best way to understand how to set up and solve a hypothesis involving a non-parametric test is by taking an example.
Suppose patients suffering from cancer are divided into three groups and a different drug is administered to each group. The platelet count for the patients is given in the table below. It needs to be checked if the population medians are equal. The significance level is 0.05.
Drug 1 | Drug 2 | Drug 3 |
---|---|---|
42000 | 67000 | 78000 |
48000 | 57000 | 89000 |
57000 | 79000 | 67000 |
69000 | | 80000 |
45000 | | |
As there are three independent groups (of unequal size), the Kruskal Wallis test is used.
\(H_{0}\): Population medians are same
\(H_{1}\): Population medians are different
\(n_{1}\) = 5, \(n_{2}\) = 3, \(n_{3}\) = 4
N = 5 + 3 + 4 = 12
Now ordering the pooled data and assigning ranks (tied values receive the average of their ranks):
Drug 1 | Drug 2 | Drug 3 | Rank (Drug 1) | Rank (Drug 2) | Rank (Drug 3) |
---|---|---|---|---|---|
42000 | | | 1 | | |
45000 | | | 2 | | |
48000 | | | 3 | | |
57000 | 57000 | | 4.5 | 4.5 | |
| 67000 | 67000 | | 6.5 | 6.5 |
69000 | | | 8 | | |
| | 78000 | | | 9 |
| 79000 | | | 10 | |
| | 80000 | | | 11 |
| | 89000 | | | 12 |
\(R_{1}\) = 18.5, \(R_{2}\) = 21, \(R_{3}\) = 38.5
Substituting these values in the test statistic formula, \(\left ( \frac{12}{N(N+1)}\sum_{j=1}^{m} \frac{R_{j}^{2}}{n_{j}}\right ) - 3(N+1)\)
H = 6.0778.
Using the critical value table, the critical value will be 5.656.
As H > critical value, the null hypothesis is rejected and it is concluded that there is significant evidence that the population medians are not all equal.
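For reference, the substitution that gives H can be written out step by step, using N = 12 and the rank sums and group sizes found above:
\(H = \frac{12}{12(12+1)}\left ( \frac{18.5^{2}}{5}+\frac{21^{2}}{3}+\frac{38.5^{2}}{4} \right ) - 3(12+1)\)
\(= \frac{1}{13}\left ( 68.45 + 147 + 370.5625 \right ) - 39\)
\(= \frac{586.0125}{13} - 39 \approx 45.078 - 39 = 6.078\)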
Difference between Parametric and Non-Parametric Test
Depending upon the type of distribution that the data has been obtained from, either a parametric test or a non-parametric test can be used in hypothesis testing. The table given below outlines the main differences between parametric and non-parametric tests.
Non-Parametric Test | Parametric test |
---|---|
A non-parametric test is a statistical test that is used when the population data does not belong to a parametrized distribution. | It is used when the data belongs to a specific probability distribution such as a normal distribution. |
Knowledge of the population distribution is not required to conduct this test. | Knowledge of the form of the population distribution is required. |
The central tendency value used is the median for non-parametric tests. | The mean is used for parametric tests. |
It is used for ordinal data and nominal data. | It is used for interval and ratio data. |
Usually less powerful when the parametric assumptions hold. | More powerful than non-parametric tests when its assumptions are satisfied. |
Examples of non-parametric tests are the sign test, the Kruskal Wallis test, etc. | Examples of parametric tests are the z test, the t test, etc. |
Advantages and Disadvantages of Non-Parametric Test
Non-parametric tests are used when the conditions for a parametric test are not satisfied. In some cases, when the data does not match the required assumptions but the sample size is large, a parametric test can still be used. Some of the advantages and disadvantages of a non-parametric test are listed as follows:
Advantages of Non-Parametric Test
The advantages of a non-parametric test are listed as follows:
- Knowledge of the population distribution is not required.
- The calculations involved in such a test are usually shorter.
- A non-parametric test is easy to understand.
- These tests are applicable to a wide range of data types, including ordinal and, for some tests, nominal data.
Disadvantages of Non-Parametric Test
The disadvantages of a non-parametric test are given below:
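- They are not as efficient as their parametric counterparts when the parametric assumptions are satisfied.
- As these tests use ranks or signs rather than the actual data values, some of the information in the data is discarded, which reduces the power of the test.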
Important Notes on Non-Parametric Test
- A non-parametric test is a statistical test that is performed on data without assuming that it comes from a distribution of a known parametric form.
- It is used on skewed distributions and the measure of central tendency used is the median.
- The Kruskal Wallis test, the sign test, the Wilcoxon signed rank test and the Mann Whitney U test are some important non-parametric tests used in hypothesis testing.
Examples on Non-Parametric Test
Example 1: A surprise quiz was taken and the scores of 6 students are given as follows:
Student | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Score | 8 | 6 | 4 | 2 | 5 | 6 |
After giving a month's time to practice, the same quiz was taken again and the following scores were obtained.
Student | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Score | 6 | 8 | 8 | 9 | 4 | 10 |
Solution: The Wilcoxon signed rank test will be used.
Student | Test 1 Score | Test 2 Score | Difference (Test 2 - Test 1) |
---|---|---|---|
1 | 8 | 6 | -2 |
2 | 6 | 8 | 2 |
3 | 4 | 8 | 4 |
4 | 2 | 9 | 7 |
5 | 5 | 4 | -1 |
6 | 6 | 10 | 4 |
Assigning signed ranks to the differences
Difference | Rank | Signed Rank |
---|---|---|
-1 | 1 | -1 |
2 | 2.5 | 2.5 |
-2 | 2.5 | -2.5 |
4 | 4.5 | 4.5 |
4 | 4.5 | 4.5 |
7 | 6 | 6 |
\(H_{0}\): Median difference is 0.
\(H_{1}\): Median difference is positive.
W1: Sum of positive ranks = 17.5
W2: Sum of negative ranks = 3.5
As W2 < W1, W2 is the test statistic.
Now from the table, the critical value is 2.
Since W2 > 2, the null hypothesis cannot be rejected and it can be concluded that there is no significant evidence that the scores of the second test are higher than those of the first.
Answer: Fail to reject the null hypothesis
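Example 2: Use the sign test to solve example 1.
Solution:
Student | Test 1 Score | Test 2 Score | Difference (Test 2 - Test 1) | Sign |
---|---|---|---|---|
1 | 8 | 6 | -2 | - |
2 | 6 | 8 | 2 | + |
3 | 4 | 8 | 4 | + |
4 | 2 | 9 | 7 | + |
5 | 5 | 4 | -1 | - |
6 | 6 | 10 | 4 | + |
\(H_{0}\): Median difference is 0.
\(H_{1}\): Median difference is positive.
Number of (-) signs = 2
Number of (+) signs = 4
As number of (-) signs < number of (+) signs, the test statistic = 2
Under \(H_{0}\), the number of (-) signs follows a binomial distribution with n = 6 and p = 0.5, so the probability of observing 2 or fewer (-) signs is \(\frac{1+6+15}{64} \approx 0.344\).
As this probability is far larger than any conventional significance level such as 0.05, the test statistic does not fall in the rejection region. The null hypothesis cannot be rejected and there is no significant evidence that the median difference is positive. This agrees with the result of the Wilcoxon signed rank test in example 1.
Answer: Fail to reject the null hypothesis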
Example 3: A test was run on 5 patients to see if a new drug could cure sleepwalking. Another group of 5 patients was still taking the old drug. The number of sleepwalking cases in a month is as follows:
 | Sleepwalking cases in a month | | | | |
---|---|---|---|---|---|
New Drug | 7 | 8 | 4 | 9 | 8 |
Old Drug | 3 | 4 | 2 | 1 | 1 |
Solution: The Mann Whitney U test is used.
Ordering the data and assigning ranks
Old Drug | New Drug | Rank (Old) | Rank (New) |
---|---|---|---|
1 | | 1.5 | |
1 | | 1.5 | |
2 | | 3 | |
3 | | 4 | |
4 | 4 | 5.5 | 5.5 |
| 7 | | 7 |
| 8 | | 8.5 |
| 8 | | 8.5 |
| 9 | | 10 |
\(H_{0}\): Two groups report same number of cases
\(H_{1}\): Two groups report different number of cases
\(R_{1}\) = 15.5 (old drug), \(R_{2}\) = 39.5 (new drug)
\(n_{1}\) = \(n_{2}\) = 5
Using the formulas,
\(U_{1} = n_{1}n_{2}+\frac{n_{1}(n_{1}+1)}{2}-R_{1}\) and \(U_{2} = n_{1}n_{2}+\frac{n_{2}(n_{2}+1)}{2}-R_{2}\)
\(U_{1}\) = 24.5, \(U_{2}\) = 0.5
As \(U_{2}\) < \(U_{1}\), \(U_{2}\) is the test statistic.
From the table the critical value is 2
As \(U_{2}\) < 2, the null hypothesis is rejected and it is concluded that there is significant evidence that the two groups differ in the number of sleepwalking cases.
Answer: Null hypothesis is rejected
FAQs on Non-Parametric Test
What is a Non-Parametric Test?
A non-parametric test in statistics is a test that does not assume that the data comes from a distribution with a fixed parametric form. Thus, such tests are also known as distribution-free tests.
When Should a Non-Parametric Test be Used?
A non-parametric test should be used under the following conditions.
- The distribution is skewed.
- The sample size is small.
- The data is nominal or ordinal.
What is the Test Statistic Used for the Mann-Whitney U Non-Parametric Test?
The Mann Whitney U non-parametric test is the non-parametric version of the independent samples t-test. The test statistic used for hypothesis testing is U. U is the smaller of \(U_{1} = n_{1}n_{2}+\frac{n_{1}(n_{1}+1)}{2}-R_{1}\) or \(U_{2} = n_{1}n_{2}+\frac{n_{2}(n_{2}+1)}{2}-R_{2}\)
What is the Test Statistic Used for the Kruskal Wallis Non-Parametric Test?
The parametric counterpart of the Kruskal Wallis non-parametric test is the one-way ANOVA test. The test statistic used is H = \(\left ( \frac{12}{N(N+1)}\sum_{j=1}^{m} \frac{R_{j}^{2}}{n_{j}}\right ) - 3(N+1)\).
What is the Test Statistic Used for the Sign Non-Parametric Test?
The smaller value among the number of positive and negative signs is the test statistic that is used for the sign non-parametric test.
What is the Difference Between a Parametric and Non-Parametric Test?
A parametric test is conducted on data that is obtained from a parameterized distribution such as a normal distribution. On the other hand, a non-parametric test is conducted on data from a skewed distribution or when the form of the population distribution is not known.
What are the Advantages of a Non-Parametric Test?
A non-parametric test does not rely on the assumed parameters of a distribution and is applicable to a wide range of data types, including ordinal data. Furthermore, such tests are easy to understand.