GitHub - romulloferreira/Titanic_Dataset: This project is to analyse the Kaggle Titanic dataset

Installation

You will need the standard data science libraries found in the Anaconda distribution of Python. The code should run with no issues on any Python 3.* version.

Project Motivation

This project analyses the Kaggle Titanic dataset, which lists passengers and records whether each one survived the Titanic disaster. Our model will analyze which factors made passengers more or less likely to survive.

The vessel left Southampton (S) on its maiden voyage to New York on April 10, 1912, calling at Cherbourg-Octeville (C) in France and Queenstown (Q) in Ireland on the way. It struck an iceberg and sank on April 15 with 2,224 people on board, making it one of the deadliest maritime disasters in history.

The sinking of the Titanic can be attributed to several causes, both natural and human. The high number of deaths can be attributed largely to the shortage of lifeboats, which did not have enough capacity for everyone on board.

Some groups of people were also more likely to survive than others, such as women, children and the upper class. To draw conclusions on a sound basis, however, we will analyze the data in our CSV file.

Specifically, I looked at the following questions:

What is the mean age of the passengers on board?
How are the passengers distributed by class?
What is the sex distribution of the passengers who survived?
Which classes of passengers were most likely to survive?
What is the mean age of the passengers who survived?
Which factors were associated with survival?
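
The descriptive questions above can be sketched with pandas as follows. This is a minimal illustration and not the code from the notebook: it assumes the CSV uses the standard Kaggle column names `Survived`, `Pclass`, `Sex` and `Age`, and the `summarize` helper and the small `toy` frame are hypothetical, made up only to show the shape of the call.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Compute simple descriptive answers to the questions listed above."""
    # Assumption: Survived is coded 1 for survivors and 0 otherwise.
    survivors = df[df["Survived"] == 1]
    return {
        # Mean age of everyone on board; pandas skips missing ages by default.
        "mean_age_all": df["Age"].mean(),
        # Share of passengers in each ticket class.
        "class_share": df["Pclass"].value_counts(normalize=True).sort_index(),
        # Share of survivors by sex.
        "survivor_sex_share": survivors["Sex"].value_counts(normalize=True),
        # Survival rate within each ticket class.
        "survival_rate_by_class": df.groupby("Pclass")["Survived"].mean(),
        # Mean age of survivors only.
        "mean_age_survivors": survivors["Age"].mean(),
    }


# Tiny made-up frame, only to show the call shape; not real passenger data.
toy = pd.DataFrame({
    "Survived": [1, 0, 1, 0],
    "Pclass": [1, 3, 2, 3],
    "Sex": ["female", "male", "female", "male"],
    "Age": [30.0, 40.0, None, 20.0],
})
result = summarize(toy)
```

To run the same summary on the real data, the frame would presumably be loaded with `pd.read_csv("titanic-data-6.csv")` in place of `toy`. The last question, about which factors drove survival, needs more than these group means and is left to the notebook.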

File Descriptions

The following are the files available in this repository:

Titanic_Dataset.ipynb - a notebook of the analysis performed following the CRISP-DM process
titanic-data-6.csv - contains the data analysed by the notebook. Place it in the same directory as the .ipynb file so the notebook can load it.
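
Before opening the notebook, it can help to check that the CSV sits where the notebook expects it. The sketch below uses only the standard library; the `find_dataset` helper is a hypothetical name introduced for illustration and is not part of the repository.

```python
from pathlib import Path


def find_dataset(directory: str = ".", filename: str = "titanic-data-6.csv") -> Path:
    """Return the dataset path, or raise a clear error if the file is missing."""
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{filename} not found in {Path(directory).resolve()}; "
            "place it in the same directory as Titanic_Dataset.ipynb."
        )
    return path
```

Calling `find_dataset()` from the notebook's directory returns the path when the file is present and fails with an explicit message otherwise.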

Results

The results are saved in the Titanic_Dataset.ipynb file in the repository.

Licensing, Authors, Acknowledgements

This study uses passenger data from the voyage of the RMS Titanic (1912). Data can be obtained from Kaggle.

You can find the licensing for the data and other descriptive information on the Kaggle website.
