Coders Packet

Plotting of Heatmap and Pairplot in Python using CSV

By Aanchal Kaushal Sharma

The code reads a CSV file and plots a heatmap and a pairplot from it in Python. The output shows the colored blocks of the heatmap and the grid of plots produced by the pairplot.

In this tutorial, we will read a CSV file and plot a heatmap and a pairplot from it. Various libraries such as pandas, NumPy, seaborn, cufflinks, and matplotlib are also used.

First, we will see what a heatmap and a pairplot are and why they are used.

A heatmap is a graphical representation of a matrix of values in which each value is drawn as a colored cell. Which shade stands for a higher or lower value depends on the colormap: with many sequential colormaps, darker shades represent higher values and lighter shades represent lower values.
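As a minimal sketch of the idea, the snippet below draws a heatmap of a small made-up grid of numbers. The array `values` and the choice of the "Blues" colormap are assumptions for illustration only and are not part of the project.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# made-up 3x3 grid of values, purely for illustration
values = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

fig, ax = plt.subplots(figsize=(4, 3))

# "Blues" is a sequential colormap: low values are light, high values are dark
sns.heatmap(values, annot=True, cmap="Blues", ax=ax)

# in a notebook the figure renders on its own; in a script, call plt.show() here
```

With a different colormap the same numbers would get different colors, so the color bar next to the plot is the reliable guide to what each shade means.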

A pairplot plots the pairwise relationships between the variables in a dataset. It creates a grid of small plots in which each variable is shared on the y-axis across a single row and on the x-axis across a single column.

Importing libraries in the code

import numpy as np
import pandas 
import matplotlib.pyplot as plt
import seaborn as sns
import cufflinks as cf
import plotly.offline as pyo

NumPy is used for array operations.
Pandas is used to load and analyze the data and works best with tabular data.
Matplotlib is used to visualize the data and plot it.
Cufflinks connects pandas with Plotly so that the advantages of both libraries can be used for plotting.
Plotly makes it easy to display and visualize plots.
Seaborn is used for both the heatmap and the pairplot. It is a data visualization library that is built on matplotlib and works with pandas data structures.

Reading the CSV file

# reading the CSV file
heart = pandas.read_csv('heart.csv')
  
# displaying the contents of the CSV file
print(heart)

Pandas reads the file into a DataFrame, which is then printed.
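To try this step without the heart.csv file, the same call can read CSV text held in memory. The rows below are made up for illustration and only borrow a few column names from this tutorial; `csv_text` and `sample` are hypothetical names.

```python
import io
import pandas

# made-up rows; not taken from the real dataset
csv_text = """age,chol,target
63,233,1
37,250,1
67,286,0
"""

# read_csv accepts a file path or any file-like object
sample = pandas.read_csv(io.StringIO(csv_text))

print(sample.head())   # first rows of the DataFrame
print(sample.shape)    # (3, 3): three rows, three columns
```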

Grouping the target data from the file

The rows are grouped by the target column, and the size and the sum of each group are computed. The shape and size of the whole dataset are checked as well.

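print(heart.groupby('target').size())   # number of rows per target value
print(heart.groupby('target').sum())    # column sums per target value
print(heart.shape)                      # (rows, columns)
print(heart.size)                       # total number of cells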
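What these calls return is easier to see on a tiny table. The sketch below uses a made-up four-row DataFrame named `sample` (an assumption for illustration, not the real data).

```python
import pandas

# made-up rows, purely for illustration
sample = pandas.DataFrame({
    "age": [63, 37, 67, 41],
    "chol": [233, 250, 286, 204],
    "target": [1, 1, 0, 1],
})

counts = sample.groupby("target").size()  # rows per target value
totals = sample.groupby("target").sum()   # column sums per target value

print(counts)        # target 0 has 1 row, target 1 has 3 rows
print(totals)        # target 1: age 141, chol 687
print(sample.shape)  # (4, 3)
print(sample.size)   # 12, i.e. rows * columns
```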

 

Visualizing the graphs

A list of numeric columns is selected from the dataset, and those columns are then plotted.

numeric_columns = ['trestbps', 'chol', 'thalach', 'age', 'oldpeak']

# pairplot creates its own figure
sns.pairplot(heart[numeric_columns])

# start a new figure so the heatmap is not drawn onto the pairplot's axes
plt.figure(figsize=(8, 6))
sns.heatmap(heart[numeric_columns].corr(), annot=True, cmap='terrain', linewidths=0.1)
plt.show()
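The heatmap here colors the matrix returned by `corr()`, so it helps to look at that matrix directly. The following sketch uses made-up columns `x`, `y`, and `z` (assumptions for illustration) where `y` is built to follow `x` closely and `z` is unrelated noise.

```python
import numpy as np
import pandas

rng = np.random.default_rng(1)
x = rng.normal(size=100)

demo = pandas.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(scale=0.1, size=100),  # strongly related to x
    "z": rng.normal(size=100),                      # unrelated noise
})

# pairwise Pearson correlation between all columns
corr = demo.corr()
print(corr.round(2))
```

The matrix is symmetric with ones on the diagonal, because every column correlates perfectly with itself. The x/y entry is close to 1, which is the kind of cell that stands out in the heatmap.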

 

Complete Code:

import numpy as np
import pandas 
import matplotlib.pyplot as plt
import seaborn as sns
import cufflinks as cf
import plotly.offline as pyo


pyo.init_notebook_mode(connected=True)
cf.go_offline()

# reading the CSV file
heart = pandas.read_csv('heart.csv')
  
# displaying the contents of the CSV file
print(heart)

# short description of each feature column, in the same order as the CSV columns
info = [
    "age",
    "1: male, 0: female",
    "chest pain type, 1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic",
    "resting blood pressure",
    "serum cholesterol in mg/dl",
    "fasting blood sugar > 120 mg/dl",
    "resting electrocardiographic results (values 0,1,2)",
    "maximum heart rate achieved",
    "exercise induced angina",
    "oldpeak = ST depression induced by exercise relative to rest",
    "the slope of the peak exercise ST segment",
    "number of major vessels (0-3) colored by fluoroscopy",
    "thal: 3 = normal; 6 = fixed defect; 7 = reversible defect",
]

# print each column name next to its description
for column, description in zip(heart.columns, info):
    print(column + ":\t\t\t" + description)

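# outside a notebook, results are only shown when printed
print(heart['target'])

print(heart.groupby('target').size())
print(heart.groupby('target').sum())
print(heart.shape)
print(heart.size)

print(heart.describe())
heart.info()  # prints its summary directly
print(heart['target'].unique())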

# Visualization
numeric_columns = ['trestbps', 'chol', 'thalach', 'age', 'oldpeak']

# pairplot creates its own figure
sns.pairplot(heart[numeric_columns])

# start a new figure so the heatmap is not drawn onto the pairplot's axes
plt.figure(figsize=(8, 6))
sns.heatmap(heart[numeric_columns].corr(), annot=True, cmap='terrain', linewidths=0.1)
plt.show()

 

Output:

Pairplot and heatmap output


Submitted by Aanchal Kaushal Sharma (AanchalShar)
