Internet Firewall Data Classification

Overview

This project applies several machine learning algorithms to classify internet firewall traffic records by the action the firewall took. The data comes from the Internet Firewall Data dataset, included in this repository as firewall_data.csv.

Objective

The goal of this project is to implement and evaluate commonly used machine learning algorithms on a multi-class classification problem: predicting the firewall action for a connection from its network traffic attributes. Such predictions can support network security decision-making.

Machine Learning Algorithms Implemented

This project explores and implements the following machine learning techniques:

Principal Component Analysis (PCA) - Used for dimensionality reduction.
Least Squares Classification - A simple linear classification approach.
Logistic Regression - A probabilistic model for binary and multi-class classification.
K-Nearest Neighbors (KNN) - A distance-based classification method.
Naïve Bayes - A probabilistic classifier based on Bayes' theorem.
Multilayer Perceptron (MLP) - A feedforward neural network model.
Support Vector Machines (SVM) - A maximum-margin classifier that separates classes with hyperplanes.
K-Means - An unsupervised clustering algorithm used to identify patterns in the data.

Each algorithm is tested on the firewall dataset to evaluate its performance in classifying network traffic behavior.
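As an illustration of one technique from the list, the following is a minimal from-scratch sketch of K-Nearest Neighbors using only the Python standard library. It is not taken from Project.ipynb; the function name knn_predict and the tiny two-feature data set are hypothetical and exist only to show the voting step.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest neighbours and let them vote
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up, already-scaled feature vectors (illustrative only,
# not rows from firewall_data.csv)
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["allow", "allow", "allow", "deny", "deny", "deny"]

prediction = knn_predict(train_X, train_y, (0.2, 0.2), k=3)
print(prediction)
```

Because KNN relies on distances, features with very different ranges (for example port numbers and byte counts) would normally be scaled to a common range first; the sample above assumes that has already been done.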

Dataset Description

The dataset consists of 12 attributes: 11 input features and the 'Action' attribute, which is the target variable. Each attribute is described below:

Variable Name	Description
Source Port	Sender's initiating port.
Destination Port	Receiver's target port.
NAT Source Port	Sender's port after NAT.
NAT Destination Port	Receiver's port after NAT.
Bytes	Total bytes transferred (sent plus received).
Bytes Sent	Bytes sent by the sender.
Bytes Received	Bytes received by the receiver.
Packets	Total packets transmitted.
Elapsed Time (sec)	Duration of communication.
pkts_sent	Packets sent by the sender.
pkts_received	Packets received by the receiver.
Action	Class label (allow, deny, drop, or reset-both).
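The table above can be turned into a feature matrix and a label vector along the following lines. This is a sketch under assumptions: the CSV header is assumed to use the variable names exactly as listed in the table, and the two sample rows are made up so that the block runs without the real file. The helper load_rows is hypothetical.

```python
import csv
import io

TARGET = "Action"

# Tiny made-up sample in the assumed column layout (values are illustrative only)
SAMPLE_CSV = """Source Port,Destination Port,NAT Source Port,NAT Destination Port,Bytes,Bytes Sent,Bytes Received,Packets,Elapsed Time (sec),pkts_sent,pkts_received,Action
57222,53,54587,53,177,94,83,2,30,1,1,allow
56258,3389,0,0,70,70,0,1,0,1,0,drop
"""


def load_rows(handle):
    """Split each CSV record into a numeric feature vector and its label."""
    reader = csv.DictReader(handle)
    # Every column except the target is treated as an input feature
    feature_names = [name for name in reader.fieldnames if name != TARGET]
    X, y = [], []
    for record in reader:
        X.append([float(record[name]) for name in feature_names])
        y.append(record[TARGET])
    return feature_names, X, y


feature_names, X, y = load_rows(io.StringIO(SAMPLE_CSV))
print(len(feature_names), y)
```

To read the actual data, the same function could be given a handle from open("firewall_data.csv", newline=""), provided the header names in the file match the ones assumed here.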

Classification Task

The goal is to classify each network traffic observation into one of the following four classes:

allow
deny
drop
reset-both

Each record belongs to only one of these classes. The classification models are evaluated based on their accuracy and ability to generalize to unseen data.
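A minimal sketch of how predictions on held-out records might be scored is shown below. The label lists are invented for illustration and are not results from this project. Per-class recall is included alongside accuracy as an optional extra, since overall accuracy can hide poor performance on a class with few records.

```python
from collections import Counter

CLASSES = ["allow", "deny", "drop", "reset-both"]


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each class present in y_true: correct predictions / true occurrences."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in CLASSES if totals[c]}


# Invented labels for eight held-out records (illustrative only)
y_true = ["allow", "allow", "allow", "deny", "deny", "drop", "drop", "reset-both"]
y_pred = ["allow", "allow", "allow", "deny", "drop", "drop", "drop", "allow"]

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(acc)     # 0.75
print(recall)  # {'allow': 1.0, 'deny': 0.5, 'drop': 1.0, 'reset-both': 0.0}
```

In this toy example the accuracy is 0.75 even though the single reset-both record is misclassified, which is the kind of gap a per-class view makes visible.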
