CHI-SQUARE TEST in Python
In this tutorial, we will learn how to perform the chi-square test in Python, using the goodness-of-fit form provided by SciPy, to check for a relationship between categorical variables.
Chi-Square test
To determine whether two categorical variables are significantly associated, we use the chi-square test. For example, we can build a dataset and look for an association between a preference for vegetarian or non-vegetarian food and a preference for low-calorie or diabetic food. If an association is found, it tells us something about the food preferences of different groups of people.
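The general formula for this test: for each category, take the squared difference between the observed and expected frequency, divide it by the expected frequency, and sum the results over all categories, that is, $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ is the observed frequency and $E_i$ is the expected frequency in category $i$.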
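The formula can be checked by hand with a few lines of plain Python. This is only a minimal sketch: the four category counts below are made up for illustration and are not data from this tutorial.

```python
# Hypothetical observed counts for four food categories (illustrative only).
observed = [18, 22, 20, 40]

# Hypothetical expected counts: 100 people spread evenly over the categories.
expected = [25, 25, 25, 25]

# Sum of (observed - expected)^2 / expected over all categories.
chi_square_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(chi_square_stat, 2))  # 12.32
```

The larger the statistic, the further the observed counts are from the expected ones.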

STEP 1: Import Libraries
We import the chisquare function from scipy.stats so that we can call it directly in the code. SciPy also offers other ways to run a chi-square test.
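The import step probably looks like the first line below. As one of the "other ways", the sketch also shows chi2_contingency from the same module, which tests independence on a contingency table. The 2x2 table is hypothetical and only serves to show the call.

```python
from scipy.stats import chisquare          # goodness-of-fit test used in this tutorial
from scipy.stats import chi2_contingency   # alternative: test of independence

# Hypothetical table of counts (made up for illustration):
# rows = vegetarian / non-vegetarian, columns = low-calorie / diabetic.
table = [[30, 10],
         [20, 40]]

# Returns the statistic, the p-value, the degrees of freedom,
# and the expected counts under independence.
chi2_stat, p_value, dof, expected = chi2_contingency(table)

print(dof)       # 1 for a 2x2 table
print(expected)  # expected counts computed from the row and column totals
```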
STEP 2: Initialize the observed frequencies, then the expected frequencies from percentages
Parameters
- f_obs: array_like
  Observed frequencies in each category.
- f_exp: array_like, optional
  Expected frequencies in each category. By default, the categories are assumed to be equally likely.
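A minimal sketch of this step is shown here. The counts and percentages are hypothetical, and the names expected_percent, f_obs and f_exp are chosen only to mirror the parameter names above.

```python
import numpy as np

# Hypothetical observed counts for four food categories (illustrative only):
# vegetarian, non-vegetarian, low-calorie, diabetic.
f_obs = np.array([18, 22, 20, 40])

# Assumed expected share of each category, in percent (must sum to 100).
expected_percent = np.array([25, 25, 25, 25])

# Convert the percentages into expected counts with the same total as f_obs.
f_exp = expected_percent / 100 * f_obs.sum()

print(f_exp)
```

Scaling the percentages by the observed total keeps both arrays on the same total, which the goodness-of-fit test relies on.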
STEP 3: Perform Chi-square test
If the p-value is very small, we reject the null hypothesis. A p-value below 0.05 is conventionally treated as statistically significant: if the null hypothesis were true, there would be less than a 5% chance of seeing data at least this far from the expected frequencies. In that case we do not retain the null hypothesis, and the assumption that the observed frequencies follow the expected distribution is rejected.
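The test and the decision rule can be sketched as follows. The frequencies are hypothetical, and the 0.05 threshold is the conventional significance level mentioned above rather than a fixed requirement.

```python
from scipy.stats import chisquare

# Hypothetical observed and expected counts (illustrative only).
f_obs = [18, 22, 20, 40]
f_exp = [25, 25, 25, 25]

# chisquare returns the test statistic and the p-value.
statistic, p_value = chisquare(f_obs=f_obs, f_exp=f_exp)

alpha = 0.05  # conventional significance level
reject_null = p_value < alpha

print(f"statistic = {statistic:.2f}, p-value = {p_value:.4f}")
if reject_null:
    print("Reject the null hypothesis: observed counts differ from expected.")
else:
    print("Fail to reject the null hypothesis.")
```

With these made-up counts the statistic is 12.32 on 3 degrees of freedom, so the p-value falls below 0.05 and the null hypothesis is rejected.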