The F test is a statistical hypothesis test that determines whether the variances of two populations are equal, based on samples drawn from them. This article covers the F test, the F statistic and its calculation, the critical value, and how to use them to test hypotheses. To understand the F test, we first need a basic understanding of the F-distribution.
F-distribution
The F-distribution is a continuous probability distribution used to test whether two samples come from populations with the same variance. It has two parameters: the numerator degrees of freedom (df1) and the denominator degrees of freedom (df2). An F-distributed value is defined as:
\text{F-value} = \frac{S_{1}/df_{1}}{S_{2}/df_{2}}
- S_{1} and S_{2} are independent random variables, each following a chi-square distribution.
- df1 and df2 are the degrees of freedom of S_{1} and S_{2}, respectively.
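To make the definition concrete, here is a minimal simulation sketch in Python that uses only the standard library. The helper names `chi_square_draw` and `f_draw` are hypothetical, chosen for illustration. Each chi-square value is built as a sum of squared standard normal draws, and each F value is the ratio of two such values, each divided by its own degrees of freedom.

```python
import random

def chi_square_draw(df, rng):
    """Draw one chi-square value with `df` degrees of freedom
    as the sum of `df` squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(df1, df2, rng):
    """Draw one F value: the ratio of two independent chi-square
    values, each divided by its own degrees of freedom."""
    numerator = chi_square_draw(df1, rng) / df1
    denominator = chi_square_draw(df2, rng) / df2
    return numerator / denominator

rng = random.Random(0)   # fixed seed so the run is repeatable
df1, df2 = 5, 10         # illustrative degrees of freedom (an assumption)
draws = [f_draw(df1, df2, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# For df2 > 2, the theoretical mean of the F-distribution is df2 / (df2 - 2).
theoretical_mean = df2 / (df2 - 2)
print(f"simulated mean: {sample_mean:.3f}, theoretical mean: {theoretical_mean:.3f}")
```

With enough draws, the simulated mean should land close to the theoretical mean, which gives a quick sanity check on the definition.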
Understanding F-Test
In an F test, the test statistic follows an F-distribution when the null hypothesis is true. The test uses the F statistic to compare two variances by dividing one by the other. An F test can be one-tailed or two-tailed, depending on the alternate hypothesis. The same kind of F statistic is also the basis of the one-way ANOVA (analysis of variance) test. We can use this test when:
- The population is normally distributed.
- The samples are taken at random and are independent samples.
Hypothesis Testing Framework for F-test
The hypotheses and the decision rule for each type of F test are as follows:
1. Left Tailed Test:
Null Hypothesis: H0 : \sigma_{1}^2 = \sigma_{2}^2
Alternate Hypothesis: H1 : \sigma_{1}^2 < \sigma_{2}^2
Decision-Making Standard: Reject the null hypothesis if the F statistic is less than the lower-tail F critical value.
2. Right Tailed Test:
Null Hypothesis: H0 : \sigma_{1}^2 = \sigma_{2}^2
Alternate Hypothesis: H1 : \sigma_{1}^2 > \sigma_{2}^2
Decision-Making Standard: Reject the null hypothesis if the F statistic is greater than the upper-tail F critical value.
3. Two Tailed Test:
Null Hypothesis: H0 : \sigma_{1}^2 = \sigma_{2}^2
Alternate Hypothesis: H1 : \sigma_{1}^2 \neq \sigma_{2}^2
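Decision-Making Standard: Reject the null hypothesis if the F statistic is greater than the upper F critical value or less than the lower F critical value, each found at α/2. When the larger variance is placed in the numerator, only the upper critical value needs to be checked.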
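The three decision rules can be sketched as one small Python function. This is an illustrative sketch, not a library API: the function name `reject_null` and its parameters are hypothetical, and the critical values are assumed to come from an F table (at α for a one-tailed test, at α/2 per tail for a two-tailed test).

```python
def reject_null(f_stat, tail, lower_crit=None, upper_crit=None):
    """Return True if the null hypothesis of equal variances is rejected.

    tail       -- "left", "right", or "two"
    lower_crit -- lower-tail critical value (needed for "left")
    upper_crit -- upper-tail critical value (needed for "right" and "two")
    """
    if tail == "left":
        if lower_crit is None:
            raise ValueError("a left-tailed test needs lower_crit")
        return f_stat < lower_crit
    if tail == "right":
        if upper_crit is None:
            raise ValueError("a right-tailed test needs upper_crit")
        return f_stat > upper_crit
    if tail == "two":
        if upper_crit is None:
            raise ValueError("a two-tailed test needs upper_crit")
        # lower_crit may be omitted when the larger variance is in the
        # numerator, because the statistic then cannot fall in the lower tail.
        below = lower_crit is not None and f_stat < lower_crit
        above = f_stat > upper_crit
        return below or above
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Made-up right-tailed case: statistic 12.0 against a critical value of 9.55.
right_tailed_decision = reject_null(12.0, "right", upper_crit=9.55)
# Two-tailed case with the larger variance in the numerator.
two_tailed_decision = reject_null(1.66, "two", upper_crit=2.287)
print(right_tailed_decision, two_tailed_decision)
```

Keeping the critical values as inputs separates the decision logic from the table lookup, so the same function works with values read from a printed table or computed by software.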
F Test Statistics
The F test statistic or simply the F statistic is a value that is compared with the critical value to check if the null hypothesis should be rejected or not. The F test statistic formula is given below:
F statistic for large samples, where the population variances are known or well approximated: F_{calc}=\frac{\sigma_{1}^{2}}{\sigma_{2}^{2}} where \sigma_{1}^{2} is the variance of the first population and \sigma_{2}^{2} is the variance of the second population.
F statistic for small samples: F_{calc}=\frac{s_{1}^{2}}{s_{2}^{2}} where s_{1}^{2} is the variance of the first sample and s_{2}^{2} is the variance of the second sample.
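As a minimal sketch of the small-sample formula, the snippet below computes F_calc from two lists of observations using Python's standard `statistics.variance`, which returns the sample variance (with n - 1 in the denominator). The function name `f_statistic` and the data values are made up for illustration.

```python
from statistics import variance

def f_statistic(sample_1, sample_2):
    """Ratio of the two sample variances, s1^2 / s2^2."""
    s1_squared = variance(sample_1)  # sample variance of the first sample
    s2_squared = variance(sample_2)  # sample variance of the second sample
    return s1_squared / s2_squared

# Illustrative data only; not taken from the article.
sample_1 = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sample_2 = [1.0, 2.0, 3.0, 4.0, 5.0]

f_calc = f_statistic(sample_1, sample_2)
print(f"F_calc = {f_calc:.4f}")
```

If the first sample has the smaller variance, the arguments can be swapped so that the larger variance ends up in the numerator.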
Steps to calculate F-Test
Step 1: Use the standard deviation (σ) to find the variance (σ²) of each sample, if it is not already given.
Step 2: Determine the null and alternate hypothesis.
- H0: no difference in variances.
- H1: difference in variances.
Step 3: Find Fcalc using Equation 1 (F-value).
NOTE: When calculating Fcalc, divide the larger variance by the smaller variance. This makes Fcalc at least 1, so only the upper (right-tail) critical value is needed.
Step 4: Find the degrees of freedom of the two samples.
Step 5: Find the Ftable value from the F-distribution table using d1 and d2 obtained in Step 4. Take the significance level α = 0.05 if it is not given.
Looking up the F-distribution table:
Use the F-distribution table that corresponds to the value of α given in the question.
- d1 (across the columns) = df of the sample whose variance is in the numerator (the larger variance).
- d2 (down the rows) = df of the sample whose variance is in the denominator (the smaller variance).
Consider the excerpt of the F-distribution table given below, used when performing a one-tailed F test.
GIVEN:
α = 0.05
d1 = 2
d2 = 3
| d2 \ d1 | 1 | 2 |
|---|---|---|
| 1 | 161.4 | 199.5 |
| 2 | 18.51 | 19.00 |
| 3 | 10.13 | 9.55 |
Then, Ftable = 9.55
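When no printed table is at hand, the same critical value can be computed numerically. The sketch below is one possible approach using only the Python standard library: it evaluates the F cumulative distribution through the regularized incomplete beta function (continued-fraction method) and inverts it by bisection. All function names here (`_beta_cf`, `reg_inc_beta`, `f_cdf`, `f_critical`) are hypothetical helpers written for this illustration; in practice a statistics library such as SciPy (`scipy.stats.f.ppf`) would normally be used instead.

```python
from math import exp, lgamma, log

def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction used by the regularized incomplete beta function."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the continued fraction.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the continued fraction.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x)
    front = exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_cdf(f, d1, d2):
    """P(F <= f) for an F-distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))

def f_critical(alpha, d1, d2):
    """Upper-tail critical value: the f with P(F > f) = alpha."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while f_cdf(hi, d1, d2) < target:  # grow the bracket until it contains the answer
        hi *= 2.0
    for _ in range(200):               # bisection on the CDF
        mid = (lo + hi) / 2.0
        if f_cdf(mid, d1, d2) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

f_table = f_critical(0.05, 2, 3)
print(f"F_table for alpha = 0.05, d1 = 2, d2 = 3: {f_table:.2f}")
```

For α = 0.05, d1 = 2 and d2 = 3, the computed value should agree with the table entry 9.55 to two decimal places.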
Step 6: Interpret the results using Fcalc and Ftable.
Interpreting the results:
If Fcalc < Ftable:
We cannot reject the null hypothesis.
∴ The variances of the two populations are not significantly different.
If Fcalc > Ftable:
We reject the null hypothesis.
∴ The variances of the two populations are significantly different.
Example Problem for calculating F-Test
Consider the following example, in which we conduct a two-tailed F test on the following samples:
| | Sample 1 | Sample 2 |
|---|---|---|
| σ | 10.47 | 8.12 |
| n | 41 | 21 |
Step 1: State the hypotheses:
- H0: no difference in variances.
- H1: difference in variances.
Step 2: Let's calculate the variances for the numerator and denominator of F-value = \frac{\sigma^2_{1}}{\sigma^2_{2}}
- σ₁² = (10.47)² = 109.62
- σ₂² = (8.12)² = 65.93
Fcalc = (109.62 / 65.93) = 1.66
Step 3: Now, let's calculate the degrees of freedom: degrees of freedom = sample size - 1. Here we have Sample 1 = n1 = 41 and Sample 2 = n2 = 21.
Degrees of freedom of sample 1 = d1 = (n1 - 1) = (41 - 1) = 40
Degrees of freedom of sample 2 = d2 = (n2 - 1) = (21 - 1) = 20
Step 4: The question does not specify a significance level, so we take the usual α = 0.05. Since this is a two-tailed F test, the significance level is split equally between the two tails, so we look up the F-distribution table with d1 = 40 and d2 = 20 at:
α = 0.05/2 = 0.025
Step 5: Find the critical F value from the F table at α = 0.025. For (d1, d2) = (40, 20), the critical value is 2.287. Therefore, Ftable = 2.287
Step 6: Since Fcalc < Ftable (1.66 < 2.287):
We cannot reject the null hypothesis.
∴ The variances of the two populations are not significantly different.
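The arithmetic of this example can be replayed in a few lines of Python. This is only a sketch of the steps above: the variable names are chosen for illustration, and the critical value 2.287 is taken directly from the table lookup in Step 5 rather than computed.

```python
# Given values from the example.
sd_1, n_1 = 10.47, 41   # standard deviation and size of sample 1
sd_2, n_2 = 8.12, 21    # standard deviation and size of sample 2

# Step 2: variances and F statistic (larger variance in the numerator).
var_1 = sd_1 ** 2
var_2 = sd_2 ** 2
f_calc = var_1 / var_2

# Step 3: degrees of freedom.
d1 = n_1 - 1
d2 = n_2 - 1

# Step 4: two-tailed test, so the significance level is halved.
alpha = 0.05
alpha_per_tail = alpha / 2

# Step 5: critical value for (40, 20) at 0.025, copied from the F table.
f_table = 2.287

# Step 6: decision.
reject = f_calc > f_table
print(f"var_1 = {var_1:.2f}, var_2 = {var_2:.2f}, F_calc = {f_calc:.2f}")
print(f"d1 = {d1}, d2 = {d2}, alpha per tail = {alpha_per_tail}")
print("Reject H0" if reject else "Cannot reject H0")
```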
The F test is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled.