Coders Packet

Titanic data accuracy in Python with Spark

By THOTA SOWMIKA

The Python code in this packet applies machine learning algorithms to the Titanic dataset. The dataset describes the passengers of the Titanic, a disaster in which most of those on board lost their lives.

The main aim is to uncover hidden or unknown information in the data. I applied machine learning algorithms to the features available in the dataset. Predictive analysis of this kind uses computational methods to find important and useful patterns in large data.

I installed the PySpark package with pip install pyspark.

The Python packaging for Spark is not intended to replace all of the other ways of using Spark. This Python-packaged version of Spark is suitable for interacting with an existing cluster.

To compare accuracy, I used a random forest classifier and Naive Bayes. The random forest classifier reached 100.00% accuracy, while Naive Bayes reached 0.743 (74.3%).

For Naive Bayes I also calculated the accuracy and the test error rate. The test error rate is an important metric because it would be dangerous to predict that a passenger survives when in truth they do not. The test error rate should therefore be as low as possible.
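The test error rate is simply one minus the accuracy, so an accuracy of 0.743 corresponds to a test error rate of about 0.257. The plain-Python sketch below shows one way to compute both, together with the share of the dangerous case described above (predicted to survive, but did not). The function name error_report and the encoding 1 = survived, 0 = did not survive are assumptions for illustration.

```python
def error_report(labels, predictions):
    """Summarise accuracy, test error, and wrongly predicted survivals.

    Hypothetical helper; assumes 1 = survived and 0 = did not survive.
    """
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")

    total = len(labels)
    pairs = list(zip(labels, predictions))
    correct = sum(1 for actual, predicted in pairs if actual == predicted)
    # The dangerous mistake: the model says "survived" but the passenger did not.
    false_survivals = sum(
        1 for actual, predicted in pairs if predicted == 1 and actual == 0
    )
    accuracy = correct / total
    return {
        "accuracy": accuracy,
        "test_error": 1.0 - accuracy,
        "false_survival_rate": false_survivals / total,
    }
```

Reporting the false survival rate next to the overall error makes it visible when a model's mistakes are concentrated in the more harmful direction.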