Data Preprocessing Template in Python
By Radhika Talwar
This is a data preprocessing template that can be applied to any dataset; you only have to change the name of the dataset. The download contains a sample dataset and the template itself as a .ipynb notebook.
The template contains the sections below, and each needs only small changes to work on your own dataset.
1. Importing the libraries: Imports the libraries the template needs. A Python library is a reusable chunk of code that you may want to include in your programs or projects. Unlike in languages such as C or C++, the term "library" has no single, specific technical meaning in Python; it loosely refers to any collection of modules.
2. Importing the dataset: Loads the dataset you want to preprocess.
3. Splitting the dataset into the Training set and Test set: Splits the data into a training set (80%) and a test set (20%).
4. Taking care of missing data: If a value is missing in a numerical column, it is filled with the mean of the other values in that column.
5. Encoding categorical data: If a column holds categorical (non-numerical) values, we encode it by assigning a numerical value to each category.
a. Encoding the Independent Variable: Encodes the categorical independent variables (the features).
b. Encoding the Dependent Variable: Encodes the dependent variable (the target).
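6. Feature Scaling: Scales the features so they are on a comparable scale. With standardization (zero mean, unit variance), most values end up roughly between -3 and 3, although outliers can fall outside that range.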
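The six steps above can be sketched roughly as follows. This is only an illustration, not the notebook itself: it assumes pandas and scikit-learn are used, and the small in-memory table (with its `Country`, `Age`, `Salary` and `Purchased` columns) is made up for the example. For your own data you would replace it with a call such as `pd.read_csv(...)` and adjust the column positions.

```python
# 1. Importing the libraries
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# 2. Importing the dataset (hypothetical data; load your own file here instead)
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain", "Germany",
                "France", "Spain", "France", "Germany", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, np.nan,
               58000, 52000, 79000, 83000, 67000],
    "Purchased": ["No", "Yes", "No", "No", "Yes",
                  "Yes", "No", "Yes", "No", "Yes"],
})
X = dataset.iloc[:, :-1].values  # independent variables: all but the last column
y = dataset.iloc[:, -1].values   # dependent variable: the last column

# 3. Splitting into the training set (80%) and the test set (20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 4. Taking care of missing data: fill gaps in the numerical columns (Age, Salary)
#    with the column mean, learned from the training set only
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_train[:, 1:3] = imputer.fit_transform(X_train[:, 1:3])
X_test[:, 1:3] = imputer.transform(X_test[:, 1:3])

# 5a. Encoding the independent variable: one-hot encode column 0 (Country)
ct = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(handle_unknown="ignore"), [0])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_train = np.array(ct.fit_transform(X_train)).astype(float)
X_test = np.array(ct.transform(X_test)).astype(float)

# 5b. Encoding the dependent variable: "No"/"Yes" become 0/1
le = LabelEncoder()
y_train = le.fit_transform(y_train)
y_test = le.transform(y_test)

# 6. Feature scaling: standardize the two numerical columns (the last two),
#    leaving the one-hot columns untouched
n_onehot = X_train.shape[1] - 2
sc = StandardScaler()
X_train[:, n_onehot:] = sc.fit_transform(X_train[:, n_onehot:])
X_test[:, n_onehot:] = sc.transform(X_test[:, n_onehot:])
```

In this sketch the imputer, encoders and scaler are fitted on the training set and only applied to the test set, so no information from the test set leaks into preprocessing. That is a design choice of the example, not necessarily what the notebook does.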
This should make your work easier. Use it and enjoy.