This project develops logistic regression models in both Python and R to predict passenger survival on the Titanic. Each implementation is containerized using Docker to ensure reproducibility and portability. The instructions below explain how to download the data and run both containers step by step.
```
titanic-disaster/
│
├── data/                    # CSV files (download manually from Kaggle)
├── src/
│   ├── code/                # Python implementation
│   │   └── main.py
│   └── r/                   # R implementation
│       ├── main.R
│       └── install_packages.R   # R dependencies
│
├── Dockerfile               # Python container configuration
├── Dockerfile_R             # R container configuration
├── requirements.txt         # Python dependencies
└── README.md                # Project documentation
```
Download the dataset from the official Kaggle Titanic competition page:
URL: https://www.kaggle.com/competitions/titanic
The following three files are required:
- train.csv
- test.csv
- gender_submission.csv
- Visit the Kaggle link above and log in.
- Click the Data tab and select Download All.
- Extract the ZIP file.
- Move the three CSV files into your local project directory under titanic-disaster/data/.
The final structure should look like this:
```
titanic-disaster/
└── data/
    ├── train.csv
    ├── test.csv
    └── gender_submission.csv
```
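As an alternative to the manual download, the steps above can be scripted with the official kaggle Python package. This is an optional sketch, not part of this repository; it assumes the package is installed (pip install kaggle), a Kaggle API token is configured, and you have accepted the competition rules on Kaggle.

```python
# Optional sketch: download and extract the Titanic competition files with the
# Kaggle API (assumes `pip install kaggle` and a configured ~/.kaggle/kaggle.json).
import zipfile
from pathlib import Path

from kaggle.api.kaggle_api_extended import KaggleApi

data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

api = KaggleApi()
api.authenticate()  # reads the API token from ~/.kaggle or environment variables
api.competition_download_files("titanic", path=str(data_dir))

# The archive name follows the competition slug (assumed: titanic.zip).
with zipfile.ZipFile(data_dir / "titanic.zip") as archive:
    archive.extractall(data_dir)  # yields train.csv, test.csv, gender_submission.csv
```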
Run the following commands from the project root directory:

```
docker build -t titanic-app .
docker run --rm -it titanic-app
```

The container will:
- Load and clean the training data
- Display data summaries and missing value statistics
- Train a logistic regression model
- Output model coefficients and intercepts
- Display training and test accuracy
All progress and results are printed directly in the terminal.
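For orientation, the sketch below shows the general shape of such a pipeline with pandas and scikit-learn. It is illustrative only; the feature selection and imputation choices here are assumptions and may differ from the actual src/code/main.py.

```python
# Illustrative sketch of a Titanic logistic regression pipeline
# (assumes pandas and scikit-learn; the real src/code/main.py may differ).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("data/train.csv")
print(df.isna().sum())  # missing-value statistics

# Basic cleaning: impute Age with the median, encode Sex numerically.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

features = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]
X, y = df[features], df["Survived"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

print("Coefficients:", dict(zip(features, model.coef_[0])))
print("Intercept:", model.intercept_[0])
print("Training accuracy:", model.score(X_train, y_train))
print("Test accuracy:", model.score(X_test, y_test))
```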
From the project root directory:
```
docker build -t titanic-r -f Dockerfile_R .
docker run --rm -it -v "%cd%/data:/app/data" titanic-r
```

Note: %cd% expands only in the Windows Command Prompt. On macOS / Linux, replace %cd% with $(pwd).
The R container will:
- Load and clean the Titanic dataset
- Display missing value summaries and cleaned data overview
- Train a logistic regression model using `glm`
- Display model coefficients and training accuracy
- Output test prediction summaries and test accuracy
- The `data` directory is excluded from version control. You must download the CSV files manually before running the containers.
- Both containers access the same dataset through the mounted `/app/data` directory.
- Docker caching ensures efficient rebuilds, as dependencies are reinstalled only when package files change.
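Because both containers expect the CSV files to be present, a quick pre-flight check can save a failed run. The helper below is hypothetical and not part of the repository; it only verifies that the expected files exist under data/.

```python
# Hypothetical pre-flight check (not part of this repository): verify that the
# three Kaggle CSV files are present in data/ before building or running the containers.
import sys
from pathlib import Path

REQUIRED = ["train.csv", "test.csv", "gender_submission.csv"]

missing = [name for name in REQUIRED if not (Path("data") / name).is_file()]
if missing:
    sys.exit("Missing files in data/: " + ", ".join(missing) + " (download them from Kaggle first)")
print("All Titanic CSV files are present in data/.")
```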