Here we look at different predictive models using Python for data science and machine learning. The aim is to run through several methods and build a basic understanding of the workflow for producing explanatory and predictive models.
Set up the workspace and get the data
As with all of my projects, the first step is to set up a root folder. Create a folder called ‘Titanic’; this folder should be used for all your .ipynb files (if you are using Jupyter Notebook) or .py files (if you are just using a Python IDE). Next, in the Titanic folder, create a folder called ‘Data’; this is where we will store the data file.
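If you would rather create the folders from Python than by hand, the following is a minimal sketch of one way to do it. The function name create_workspace and the base_dir parameter are illustrative names that are not part of the original setup; the sketch assumes you want the Titanic folder created inside whatever directory you pass in (the current directory by default).

```python
from pathlib import Path


def create_workspace(base_dir="."):
    """Create the 'Titanic' project folder and its 'Data' subfolder.

    base_dir is a hypothetical parameter: the directory in which the
    Titanic folder should live. Returns the two folder paths.
    """
    root = Path(base_dir) / "Titanic"
    data_dir = root / "Data"
    # parents=True also creates Titanic if it is missing;
    # exist_ok=True means running this twice does not raise an error.
    data_dir.mkdir(parents=True, exist_ok=True)
    return root, data_dir
```

Calling create_workspace() from the directory where you keep your projects should leave you with Titanic/Data ready for the download in the next step.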
The data we are using here comes from Kaggle; you can read all about it here. For this work you should download this version, which is exactly the same but renamed to Titanic.csv. Download the file to the Data folder.
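Once the file is downloaded, a quick check that it landed in the right place can save confusion later. The sketch below uses only the standard library, so nothing needs installing; preview_csv is a hypothetical helper name, and the path Titanic/Data/Titanic.csv in the usage note assumes you followed the folder layout described above.

```python
import csv
from pathlib import Path


def preview_csv(csv_path, n_rows=5):
    """Return the header row and up to n_rows data rows from a CSV file."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected the data file at {csv_path}")
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # An empty file yields an empty header instead of raising
        header = next(reader, [])
        # zip stops after n_rows, so the rest of the file is not read
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, header, rows = preview_csv("Titanic/Data/Titanic.csv") should return the column names and the first few records if the download went to the Data folder; a FileNotFoundError suggests the file is somewhere else or has a different name.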
Next you can have a look at the following parts…