In this suicide case study, we analyze suicide cases worldwide and calculate the mean number of suicides per 100k population.
Step:-1
First, we import the necessary libraries, then we load the dataset.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

Suicide_Dataset = pd.read_csv("Suicide_Dataset.csv")
Suicide_Dataset
Step:-2
Now we plot the number of suicides per year for males and females.
Suicide_Dataset_M = Suicide_Dataset[Suicide_Dataset.sex == "male"]
Suicide_Dataset_F = Suicide_Dataset[Suicide_Dataset.sex == "female"]
# Pair each subset's years with that same subset's counts
# (newer seaborn releases prefer errorbar=None over ci=None)
sns.lineplot(x=Suicide_Dataset_M.year, y=Suicide_Dataset_M.suicides_no, ci=None)
sns.lineplot(x=Suicide_Dataset_F.year, y=Suicide_Dataset_F.suicides_no, ci=None)
plt.legend(["male", "female"])
plt.show()
Here we see that there are fewer female suicide cases than male suicide cases.
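By default, sns.lineplot aggregates repeated x values with their mean, so a line plot of raw rows shows an average per row rather than a yearly total. If yearly totals per sex are of interest, one option is to sum first and plot the result. The sketch below uses a small made-up frame (the values are hypothetical, only the column names match the dataset).

```python
import pandas as pd

# Hypothetical toy rows; only the column names follow the dataset.
toy = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001, 2001],
    "sex": ["male", "female", "male", "male", "female"],
    "suicides_no": [30, 10, 20, 40, 15],
})

# One row per year, one column per sex, cells hold the summed counts.
yearly_by_sex = toy.pivot_table(
    index="year", columns="sex", values="suicides_no", aggfunc="sum"
)
print(yearly_by_sex)
```

The resulting table can be passed to its own .plot() method to draw one line per sex.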
Step:-3
# Total suicides per (country, year), scaled by 100
Suicide_Dataset_suino = Suicide_Dataset.groupby(["country","year"])["suicides_no"].sum()
Suicide_Dataset_sum = Suicide_Dataset_suino.sort_index(ascending=True)[:] * 100
# Total population per (country, year)
Suicide_Dataset_pop = Suicide_Dataset.groupby(["country","year"]).population.sum()
Suicide_Dataset_pop_sum = Suicide_Dataset_pop.sort_index(ascending=False)[:]
# The division aligns on the (country, year) index, giving suicides as a percentage of population
Suicide_Dataset_total = Suicide_Dataset_sum / Suicide_Dataset_pop_sum
Suicide_Dataset_total.head(10)
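The code in this step multiplies by 100, which expresses suicides as a percentage of the population. If a rate per 100k population is wanted instead, the scale factor would be 100,000. The following is a minimal sketch of that variant on made-up numbers (the toy frame and its values are assumptions, not data from the dataset).

```python
import pandas as pd

# Hypothetical toy data using the same column names as the dataset.
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "year": [2000, 2000, 2001, 2000, 2000],
    "suicides_no": [10, 30, 5, 2, 8],
    "population": [100_000, 300_000, 50_000, 400_000, 600_000],
})

# Sum cases and population within each (country, year) group.
grouped = toy.groupby(["country", "year"])[["suicides_no", "population"]].sum()

# Both columns share one index, so no sorting is needed before dividing.
rate_per_100k = grouped["suicides_no"] / grouped["population"] * 100_000
print(rate_per_100k)
```

Because pandas aligns Series on their index labels during arithmetic, sorting the two sides in different directions does not change the result.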
Step:-4
Here we see which countries have the highest suicide rates.
Country_Dict = {}
for country in Suicide_Dataset_total.index.get_level_values(0):
    if country not in Country_Dict.keys():
        Country_Dict[country] = Suicide_Dataset_total[country].mean()
pup = list(Country_Dict.items())
pup.sort(key=lambda pair: pair[1], reverse=True)
country_list = [a[0] for a in pup]
country_suicide = [a[1] for a in pup]

plt.figure(figsize=(10,40))
sns.barplot(x=country_suicide[:], y=country_list[:], palette="bright")
plt.title("Rate vs Country")
plt.show()
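The per-country mean built with the dictionary loop can also be computed by grouping on the first index level. This is a sketch of that alternative, not the author's method; the Series named total is a hypothetical stand-in for Suicide_Dataset_total.

```python
import pandas as pd

# Hypothetical stand-in for Suicide_Dataset_total: a Series indexed by (country, year).
index = pd.MultiIndex.from_tuples(
    [("A", 2000), ("A", 2001), ("B", 2000), ("B", 2001), ("C", 2000)],
    names=["country", "year"],
)
total = pd.Series([1.0, 3.0, 5.0, 7.0, 4.0], index=index)

# Mean over years for each country, highest first.
country_mean = total.groupby(level="country").mean().sort_values(ascending=False)

country_list = country_mean.index.tolist()
country_suicide = country_mean.tolist()
print(country_mean)
```

The two lists have the same shape as the ones passed to sns.barplot in this step.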
Step:-5
Now we calculate the worldwide mean of suicides per 100k population and plot the mean for each year.
Suicide_mean = Suicide_Dataset['suicides/100k pop'].mean()
Suicide_mean

Suicide_mean_data = []
year = []
years_unique = Suicide_Dataset['year'].unique()
# Note: the -1 leaves out the last entry returned by unique()
for i in range(len(years_unique) - 1):
    year.append(years_unique[i])
    Suicide_mean_data.append(Suicide_Dataset[Suicide_Dataset['year'] == years_unique[i]]['suicides/100k pop'].mean())

fig = plt.figure(figsize=(15,5))
sns.pointplot(x=year, y=Suicide_mean_data)
plt.axhline(y=Suicide_mean, linestyle='--')
plt.text(2, 13, 'Suicide mean=12.81', fontsize=9, va='center', ha='center')
plt.ylabel("Suicides per 100K people")
plt.xlabel("Year")
plt.title("Worldwide Suicides by Year")
Here we see that the suicide rate peaked in 1995.
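The yearly means collected by the loop in this step can also be obtained with a single groupby, which returns the years in sorted order and includes every year present. The sketch below runs on made-up values (the toy frame is an assumption) and reads off the peak year with idxmax.

```python
import pandas as pd

# Hypothetical toy rows; only the column names follow the dataset.
toy = pd.DataFrame({
    "year": [1995, 1995, 1996, 1996, 1997],
    "suicides/100k pop": [10.0, 20.0, 12.0, 14.0, 9.0],
})

# Mean rate per year, indexed by year in ascending order.
yearly_mean = toy.groupby("year")["suicides/100k pop"].mean()

# Year with the highest mean rate.
peak_year = yearly_mean.idxmax()
print(yearly_mean)
print(peak_year)
```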
Step:-6
Suicide_sex = Suicide_Dataset[['sex', 'suicides/100k pop']].groupby(['sex'], as_index=False).mean()
Suicide_sex

fig = plt.figure(figsize=(10,5))
labels = 'female', 'male'
Suicide_sex.plot(kind='pie', labels=labels, y='suicides/100k pop', autopct='%1.1f%%')
plt.title('Worldwide Suicides by Gender')
Here we see that, in terms of suicides per 100k population, males account for 79% and females for 21%.
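The percentages printed on the pie chart are each group's mean rate divided by the sum of the group means. A small sketch of that arithmetic, using made-up means rather than the dataset's values:

```python
import pandas as pd

# Hypothetical group means; the real values come from the groupby in this step.
by_sex = pd.DataFrame({
    "sex": ["female", "male"],
    "suicides/100k pop": [5.0, 20.0],
})

# Share of each group's mean in the total of the means, as a percentage.
means = by_sex.set_index("sex")["suicides/100k pop"]
share_pct = means / means.sum() * 100
print(share_pct.round(1))
```

Since the inputs are unweighted means of per-row rates, these shares describe the average rate per group and need not match each group's share of the total case count.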
Step:-7
Now we see which age group has the highest suicide rate.
Suicide_age = Suicide_Dataset[['age', 'suicides/100k pop']].groupby(['age'], as_index=False).mean()
Suicide_age

fig1, ax = plt.subplots()
plt.pie(Suicide_age['suicides/100k pop'], labels=Suicide_age['age'], autopct='%1.1f%%')
plt.title('Worldwide suicides by Age')
plt.axis('equal')
plt.show()
Here we see that people aged 75+ account for the largest share worldwide, at 31.2%.
Step:-8
Now we see which generation has the highest suicide rate.
Suicide_generation = Suicide_Dataset[['generation', 'suicides/100k pop']].groupby(['generation'], as_index=False).mean()
Suicide_generation

fig1, ax1 = plt.subplots()
plt.pie(Suicide_generation['suicides/100k pop'], labels=Suicide_generation['generation'], autopct='%1.1f%%')
plt.title('Worldwide suicides by Generation')
plt.axis('equal')
plt.show()
Here we see that the G.I. Generation accounts for the largest share worldwide, at 32.5%.
Submitted by Subhojit Jalal (Subhojit1234)