Coders Packet

Loan Data Analysis in Python

By Subhojit Jalal

In this project, we count the unique values in every feature, find the minimum and maximum values of the numerical features, and plot histograms and a correlation heatmap.

Step:- 1

First, we import the necessary libraries and then load the dataset.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
LOAN = pd.read_csv("LOAN_DATASET.csv")
LOAN
LOAN.info()
print(f'Total no of empty values: {LOAN.isna().sum().sum()}')
LOAN.isna().sum()
LOAN.describe()
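
The two missing-value calls above return different things: `isna().sum()` gives one count per column, while `isna().sum().sum()` collapses those into a single total. The sketch below shows this on a small made-up frame; the `fico` column and all the values are assumptions for illustration and are not taken from `LOAN_DATASET.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the loan dataset
toy = pd.DataFrame({
    "fico": [737, np.nan, 682],
    "purpose": ["credit_card", None, None],
})

# One missing-value count per column
per_column = toy.isna().sum()

# A single total across the whole frame
total = int(per_column.sum())

print(per_column)
print(f"Total no of empty values: {total}")
```

Looking at the per-column counts first makes it easier to decide which features need imputation or removal.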

Step:- 2

Finding unique values in every feature

# Print the number of distinct values in each column
def get_unq(LOAN):
    for i in LOAN.columns:
        print(f'{i} - {len(LOAN[i].unique())}')
get_unq(LOAN)
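
As a possible alternative to the loop, pandas can produce the same counts in one call with `DataFrame.nunique()`. This is only a sketch on made-up data (the `int.rate` column and its values are assumptions). Note that `nunique()` ignores missing values by default, while `len(Series.unique())` counts them, so `dropna=False` is needed to get matching numbers.

```python
import pandas as pd

# Hypothetical toy data; one 'purpose' entry is missing on purpose
toy = pd.DataFrame({
    "purpose": ["credit_card", "debt_consolidation", "credit_card", None],
    "int.rate": [0.11, 0.13, 0.11, 0.15],
})

# Counts missing values as a distinct value, like len(Series.unique())
unique_counts = toy.nunique(dropna=False)

# Default behaviour: missing values are not counted
unique_counts_no_nan = toy.nunique()

print(unique_counts)
print(unique_counts_no_nan)
```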

Step:- 3

Finding the minimum and maximum values of the numerical features

# Print the range of every non-object (numerical) column
def min_max(LOAN):
    for i in LOAN.columns:
        if LOAN[i].dtypes != 'object':
            # min() and max() avoid sorting the whole column twice
            print(f'{i} -> {LOAN[i].min()} to {LOAN[i].max()}')
min_max(LOAN)
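
If a table is more convenient than printed lines, the same ranges can be collected with `select_dtypes` and `agg`. The following is a minimal sketch on made-up data; the `installment` column and every value in it are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the loan dataset
toy = pd.DataFrame({
    "purpose": ["credit_card", "educational", "credit_card"],
    "installment": [829.10, 228.22, 366.86],
    "not.fully.paid": [0, 1, 0],
})

# Keep only numerical columns, then compute min and max for each
numeric = toy.select_dtypes(include="number")
ranges = numeric.agg(["min", "max"]).T  # one row per feature

print(ranges)
```

The transposed result has one row per numerical feature with `min` and `max` columns, which can be sorted or filtered like any other DataFrame.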

 

Step:- 4

# label encoder for categorical data
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
# fit_transform returns an integer array that can be assigned to the column directly
LOAN['purpose'] = encoder.fit_transform(LOAN['purpose'])
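
To see which integer each category receives, the fitted encoder's `classes_` attribute can be inspected. The sketch below uses a few made-up `purpose` values (assumptions, not read from the dataset). `LabelEncoder` assigns codes in sorted order of the class labels, so the numbers carry no real ranking.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical category values for illustration
purpose = pd.Series(["credit_card", "all_other", "credit_card", "educational"])

encoder = LabelEncoder()
codes = encoder.fit_transform(purpose)

# classes_ holds the labels in sorted order; position i is encoded as i
mapping = dict(zip(encoder.classes_, range(len(encoder.classes_))))

# inverse_transform recovers the original labels from the codes
restored = encoder.inverse_transform(codes)

print(mapping)
print(codes)
```

Because the codes are arbitrary, correlations involving the encoded `purpose` column should be read with caution.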

Now we plot a histogram for each feature, followed by a correlation heatmap.

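# Plot one histogram per feature, split by the target 'not.fully.paid'
def histplo(df):
    for i in df.columns:
        if i != 'not.fully.paid':
            # Create the figure only for columns that are actually plotted
            plt.figure(figsize=(5,7))
            sns.histplot(data=df, x=i, bins=30, kde=True, hue='not.fully.paid')
histplo(LOAN)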
plt.figure(figsize=(20,13))
sns.heatmap(LOAN.corr(),linewidths=0.5,annot= True)
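
A large annotated heatmap can be hard to read, so it may help to pull out just the column for the target and rank the features by the strength of their correlation. This is a sketch on made-up numbers; the `fico` and `int.rate` columns and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the loan dataset
toy = pd.DataFrame({
    "fico": [737, 707, 682, 712, 667, 727],
    "int.rate": [0.1189, 0.1071, 0.1357, 0.1008, 0.1426, 0.0788],
    "not.fully.paid": [0, 0, 1, 0, 1, 0],
})

# Correlation of every numerical feature with the target,
# sorted by absolute value so the strongest relationships come first
target_corr = (
    toy.select_dtypes(include="number")
       .corr()["not.fully.paid"]
       .drop("not.fully.paid")
       .sort_values(key=abs, ascending=False)
)

print(target_corr)
```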

 
