KDD Cup 1999 dataset by DARPA. The whole dataset can be downloaded from the KDD Cup 1999 Data page listed below.
This project implements an Intrusion Detection System (IDS) using the KDD Cup 1999 dataset. It includes tools for feature extraction from network traffic and machine learning-based threat detection.
- DATASET kddcup99/: Contains the KDD Cup 1999 dataset files used for training and evaluation.
- FeatureExtraction/: C++ utility for extracting features from network traffic or pcap files, compatible with the KDD '99 dataset format.
- Threat Detection/: Python notebooks and scripts for building, training, and evaluating classifiers on the extracted features.
Download the KDD Cup 1999 dataset from the UCI KDD Archive and place the files in the DATASET kddcup99/ directory.
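Once the files are in place, each record can be read as a comma-separated row of 41 features followed by a label that ends in a period (for example `normal.`). The following is a minimal sketch using only the Python standard library, not the loader used by the notebooks. The function name `parse_kdd_records` and the file path in the usage comment are assumptions made for illustration.

```python
import csv

# The 41 KDD '99 feature names, followed by the class label.
KDD_COLUMNS = [
    "duration", "protocol_type", "service", "flag", "src_bytes", "dst_bytes",
    "land", "wrong_fragment", "urgent", "hot", "num_failed_logins",
    "logged_in", "num_compromised", "root_shell", "su_attempted", "num_root",
    "num_file_creations", "num_shells", "num_access_files",
    "num_outbound_cmds", "is_host_login", "is_guest_login", "count",
    "srv_count", "serror_rate", "srv_serror_rate", "rerror_rate",
    "srv_rerror_rate", "same_srv_rate", "diff_srv_rate", "srv_diff_host_rate",
    "dst_host_count", "dst_host_srv_count", "dst_host_same_srv_rate",
    "dst_host_diff_srv_rate", "dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate", "dst_host_serror_rate",
    "dst_host_srv_serror_rate", "dst_host_rerror_rate",
    "dst_host_srv_rerror_rate", "label",
]

# Columns kept as strings; every other column is parsed as a float.
CATEGORICAL = {"protocol_type", "service", "flag", "label"}


def parse_kdd_records(lines):
    """Parse an iterable of KDD '99 CSV lines into a list of dicts."""
    records = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if len(row) != len(KDD_COLUMNS):
            raise ValueError(f"expected {len(KDD_COLUMNS)} fields, got {len(row)}")
        record = {
            name: (value if name in CATEGORICAL else float(value))
            for name, value in zip(KDD_COLUMNS, row)
        }
        # Labels in the raw files carry a trailing period, e.g. "normal."
        record["label"] = record["label"].rstrip(".")
        records.append(record)
    return records


# Usage (the path is an assumption; adjust it to the file you downloaded):
# with open("DATASET kddcup99/kddcup.data_10_percent", newline="") as f:
#     records = parse_kdd_records(f)
```

Keeping the categorical columns as strings leaves the choice of encoding to the preprocessing step.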
The feature extraction tool is implemented in C++ in the FeatureExtraction directory.
- Create a build directory: `mkdir build-files`, then `cd build-files`
- Generate build files with CMake: `cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" ..`
- Build the project: `cd ..`, then `cmake --build ./build-files --target kdd99extractor -- -j 4`
- The compiled binary will be at `build-files/src/kdd99extractor`.
For more details, see FeatureExtraction/README.md.
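If you prefer to drive the build from a script, for instance before running the notebooks, the same steps can be wrapped in Python. This is only a sketch: the helper names `build_commands` and `build` are hypothetical, and it assumes `cmake` is on the `PATH` and that the script is run from the repository root.

```python
import subprocess
from pathlib import Path


def build_commands(build_dir="build-files", jobs=4):
    """Return (argv, working directory) pairs mirroring the manual build steps."""
    configure = [
        "cmake", "-DCMAKE_BUILD_TYPE=Debug",
        "-G", "CodeBlocks - Unix Makefiles", "..",
    ]
    compile_step = [
        "cmake", "--build", f"./{build_dir}",
        "--target", "kdd99extractor", "--", "-j", str(jobs),
    ]
    # The configure step runs inside the build directory, the build step
    # one level above it.
    return [(configure, build_dir), (compile_step, ".")]


def build(source_dir="FeatureExtraction", build_dir="build-files", jobs=4):
    """Create the build directory, run both steps, and return the binary path."""
    root = Path(source_dir)
    (root / build_dir).mkdir(exist_ok=True)
    for argv, cwd in build_commands(build_dir, jobs):
        subprocess.run(argv, cwd=root / cwd, check=True)  # raises on failure
    return root / build_dir / "src" / "kdd99extractor"
```

Calling `build()` returns the expected location of the binary. `check=True` makes a failed CMake step raise an exception instead of passing silently.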
The [Threat Detection](Threat%20Detection/README.md) directory contains Jupyter notebooks for:
- Data preprocessing
- Model training and evaluation
- Visualization of results
Open the notebooks in JupyterLab or VS Code to run experiments.
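To give a sense of what the preprocessing and evaluation steps involve, here is a minimal standard-library sketch. It is not taken from the notebooks. The helper names (`one_hot`, `min_max_scale`, `binary_label`, `precision_recall`) are hypothetical, and treating every non-`normal` label as an attack is an assumed simplification.

```python
def one_hot(value, vocabulary):
    """Encode a categorical value (e.g. protocol_type) as a 0/1 vector."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant column: avoid division by zero
    return [(x - lo) / (hi - lo) for x in column]


def binary_label(label):
    """Map a KDD label to 0 (normal) or 1 (attack); assumed simplification."""
    return 0 if label.rstrip(".") == "normal" else 1


def precision_recall(y_true, y_pred):
    """Compute precision and recall for the attack class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Recall on the attack class is often the more important figure for an IDS, since a missed attack usually costs more than a false alarm.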
- Sniffer: Captures and parses network traffic.
- IP Reassembler: Handles IP header summaries.
- Conversation Reconstructor: Reconstructs network conversations and computes intrinsic features.
- Statistical Engine: Computes derived features for machine learning.
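As an illustration of the derived features the Statistical Engine produces, the sketch below computes three of the KDD '99 time-based traffic features (`count`, `srv_count`, `same_srv_rate`) over a two-second window. It is a Python approximation for explanation only, not the C++ implementation. The `Conn` type and `window_features` function are hypothetical, and counting the current connection in its own window is an assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Conn:
    ts: float       # connection timestamp in seconds
    dst_host: str   # destination host
    service: str    # destination service, e.g. "http"


def window_features(conns, window=2.0):
    """Return (count, srv_count, same_srv_rate) per connection.

    `conns` must be sorted by timestamp. Each connection is compared with
    those seen in the previous `window` seconds, itself included (assumption).
    """
    recent = deque()
    features = []
    for c in conns:
        # Drop connections that fell out of the time window.
        while recent and c.ts - recent[0].ts > window:
            recent.popleft()
        recent.append(c)
        same_host = [r for r in recent if r.dst_host == c.dst_host]
        count = len(same_host)  # connections to the same host
        srv_count = sum(1 for r in recent if r.service == c.service)
        # Share of the same-host connections that also use the same service.
        same_srv_rate = sum(1 for r in same_host if r.service == c.service) / count
        features.append((count, srv_count, same_srv_rate))
    return features
```

The deque keeps only the connections inside the window, so each connection is appended and evicted at most once.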
- KDD Cup 1999 Data: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
- Lee, W. & Stolfo, S. J. (2000), 'A framework for constructing features and models for intrusion detection systems'
- Dybey, D. & Dubey, J. (2014), 'A Survey Intrusion Detection with KDD99 Cup Dataset'
For educational and research purposes.