AnuttaraR/MachineLearning_CW

Introduction

This report explores the Spambase dataset from the UCI (University of California, Irvine) Machine Learning Repository. Spambase is a widely used benchmark for classifying emails as spam or non-spam. It contains 4601 email messages, each labelled with one of those two classes. Its attributes capture different aspects of an email's content, such as the frequency of specific words, characters, and punctuation marks. The report compares the performance of k-nearest neighbours (KNN) and decision tree models on this classification task and offers insights into the best strategies for preparing the dataset before the models are trained.
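To make the task concrete, the following is a minimal pure-Python sketch of k-nearest-neighbours classification over frequency-style features. It is an illustration only, not the coursework's implementation: the function name `knn_predict`, the three chosen features, and every value in the toy rows are assumptions made up for this example and are not taken from the Spambase data.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training rows."""
    # Order the training rows by Euclidean distance to the query point.
    neighbours = sorted(zip(train_X, train_y), key=lambda row: dist(row[0], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy rows (NOT real Spambase records):
# [frequency of "free", frequency of "!", frequency of "meeting"], in percent.
train_X = [
    [2.1, 1.5, 0.0],
    [1.8, 2.2, 0.0],
    [3.0, 0.9, 0.1],
    [0.0, 0.1, 1.2],
    [0.1, 0.0, 0.9],
    [0.0, 0.2, 1.5],
]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = non-spam

print(knn_predict(train_X, train_y, [2.5, 1.0, 0.0]))
print(knn_predict(train_X, train_y, [0.0, 0.1, 1.0]))
```

An odd `k` is used here so that a two-class vote cannot tie. On the full dataset, a library implementation with feature scaling would normally be preferred, since Euclidean distance is sensitive to attributes measured on different scales.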

About

Module CM2604 - Machine Learning Coursework
