
This project was completed as part of the Machine Learning module at Westminster International University in Tashkent (WIUT).


Crime Description Prediction Based on Spatial-Temporal Data

Python: 3.11.14
conda (Anaconda): 24.11.3

Project Description

This project applies classification models to Baltimore crime data to predict the description of a crime based on features like location, time, and premise type. The dataset includes major crimes against people reported under the NIBRS system.
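As an illustration of how features of this kind can be prepared for a classifier, here is a minimal sketch with pandas; the column names and values are hypothetical stand-ins, not the dataset's actual schema:

```python
import pandas as pd

# Toy rows mimicking spatial-temporal crime features (hypothetical columns)
df = pd.DataFrame({
    "Latitude": [39.29, 39.31],
    "Longitude": [-76.61, -76.59],
    "Hour": [14, 2],
    "Premise": ["STREET", "RESIDENCE"],
    "Description": ["LARCENY", "ASSAULT"],  # target to predict
})

# One-hot encode the categorical premise type; split features from target
X = pd.get_dummies(df.drop(columns="Description"), columns=["Premise"])
y = df["Description"]
print(X.columns.tolist())
```

The numeric columns pass through unchanged, while get_dummies expands the categorical Premise column into one indicator column per category.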

Link to the dataset


The Streamlit-based UI is publicly accessible via the link below, hosted on Streamlit Community Cloud. Note that when the site receives few or no visitors, which is expected, the app switches into sleep/hibernation mode. This does not mean the link stops working; it may simply take a couple of minutes for the site to wake up and become functional.

Streamlit App Link (allow a few minutes for it to wake up)

https://mlda-cw1-15775-baltimore.streamlit.app


Prerequisites

  • conda 24.11.3 (if not installed, follow the installation instructions here for your OS; Anaconda was used for this project)
  • Python 3.11.14
  • jupyter 1.1.1

Running the program locally

To clone the repository and run the program locally, follow these steps:

git clone https://github.com/00015775/MLDA-CW1-15775
cd MLDA-CW1-15775

The project's environments.yml should be in the root directory; if it is not found there, cd to wherever it is located. The command below recreates the conda environment with the exact package versions. After that, simply activate the environment.

conda env create -f environments.yml
conda activate baltimore_crime_env

The model is already trained and saved in the corresponding folder (see the folder tree below for its location). To run the Streamlit UI app locally, use the following command; if the .py file is not found, cd to where baltimore-crime-app.py is located.

streamlit run ui/baltimore-crime-app.py

Streamlit will prompt you for an email address for its news feed; simply leave it empty if you do not want that. The terminal will then show a Local URL: and a Network URL:; pasting either into a browser opens the app, where you can specify the inputs and get the predicted crime description.


Two environment files are listed here: requirements.txt and environments.yml. environments.yml is used to recreate the conda environment and is the one you should use locally, while requirements.txt exists only for the Streamlit app deployment, since Streamlit cannot install dependencies from a .yml file.


Reading reproducibility.md is entirely optional; it is a self-note on making the conda environment reproducible and OS-agnostic.


Folder Tree Structure

MLDA-CW1-15775/  
├── paper/
│   └── MLDA-CW1-15775-REPORT.pdf          # project description
│
├── src/            
│   ├── baltimore-crime-data.ipynb 
│   ├── models/                     # contains trained models
│   ├── plots/                      # any related diagrams
│   └── data/                       # dataset itself
│
├── ui/  
│   └── baltimore-crime-app.py       
├── .gitignore  
└── README.md  
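The trained models in src/models/ are serialized artifacts. As a sketch of how a scikit-learn model can be saved and reloaded with joblib (the file name here is illustrative, not the one used in this repository):

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small model on synthetic data as a stand-in for the real one
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Persist to disk and reload; predictions should be identical
path = os.path.join(tempfile.mkdtemp(), "crime_model.joblib")
joblib.dump(model, path)
reloaded = joblib.load(path)
```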

Machine Learning algorithms

  • RandomForestClassifier
  • HistGradientBoostingClassifier
  • CatBoostClassifier

Model evaluation metrics

  • Accuracy
  • Precision
  • Recall
  • F1-score
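For a multi-class target such as crime description, these metrics can be computed with scikit-learn; precision, recall, and F1 require an averaging strategy, and weighted averaging is shown here as one common choice (not necessarily the one used in the notebook):

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)

# Toy multi-class labels standing in for encoded crime descriptions
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 1, 2, 1, 0]

acc = accuracy_score(y_true, y_pred)  # fraction of exact matches
prec = precision_score(y_true, y_pred, average="weighted")
rec = recall_score(y_true, y_pred, average="weighted")
f1 = f1_score(y_true, y_pred, average="weighted")
print(acc, prec, rec, f1)
```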

Hyperparameter tuning

GridSearchCV was used to find the best value of n_estimators for the Random Forest model. However, because cross-validation is time-consuming, the hyperparameters for HistGradientBoostingClassifier and CatBoostClassifier were chosen manually through heuristic experimentation with parameters such as max_iter, learning_rate, and depth. In general, higher values for these parameters yielded better accuracy, but at the cost of computational power and time.
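The n_estimators search described above can be sketched with GridSearchCV on synthetic data; the grid values and cv setting here are illustrative, not the ones used in the coursework:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Cross-validated search over n_estimators only
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100, 200]},
    cv=3,
    scoring="accuracy",
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

GridSearchCV refits the best model on the full data by default, so grid.best_estimator_ can be used directly for prediction afterwards.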

As a reference, most of the libraries, coding examples, and background information were learned from GeeksForGeeks.