This project focuses on identifying phishing URLs using machine learning algorithms. It extracts key lexical, domain-based, and technical features from URLs to distinguish between legitimate and phishing websites.
A variety of classification algorithms are implemented and evaluated, including Random Forest, Decision Tree, XGBoost, SVM, and Logistic Regression.
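As a rough illustration of what lexical feature extraction can look like, the sketch below derives a handful of features from a single URL using only the standard library. The function name `extract_lexical_features` and the particular features chosen are assumptions made for this example; they are not taken from the project's notebook, which may use a different or larger feature set.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper for illustration; not the project's actual feature set.
def extract_lexical_features(url: str) -> dict:
    """Return a few simple lexical features for one URL."""
    # urlparse only fills in the host when the URL starts with a scheme,
    # so prepend a default scheme when none is present.
    has_scheme = re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", url) is not None
    parsed = urlparse(url if has_scheme else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # A raw IPv4 address in place of a domain name is a common warning sign.
        "host_is_ipv4": re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host) is not None,
    }

features = extract_lexical_features("http://192.168.0.1/login@secure")
```

Each URL becomes one row of numeric or boolean values, which is the form a classifier such as Random Forest or Logistic Regression expects as input.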
- Feature extraction from URLs
- Multiple machine learning models implemented
- Model evaluation using Accuracy, Precision, Recall, and F1 Score
- Confusion matrix and classification report visualization
- Train-test data split with reproducible results
- Supports binary classification (Phishing / Legitimate)
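To make the reproducible split and the four evaluation metrics concrete, here is a minimal plain-Python sketch. It is not the project's code: a notebook like this would more typically rely on library helpers for both steps. The function names, the seed value `42`, and the label encoding (1 = phishing, 0 = legitimate) are assumptions chosen for the example.

```python
import random

def train_test_split_indices(n_samples, test_size=0.2, seed=42):
    """Shuffle sample indices with a fixed seed, then split into train/test."""
    indices = list(range(n_samples))
    # A dedicated Random instance with a fixed seed gives the same shuffle every run.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]

def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the derived scores for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn, "accuracy": accuracy,
            "precision": precision, "recall": recall, "f1": f1}

# Assumed encoding: 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
train_idx, test_idx = train_test_split_indices(10)
```

In this toy example one phishing URL is missed and one legitimate URL is flagged, so accuracy, precision, recall, and F1 all come out to 0.75. For phishing detection, recall on the phishing class is usually the number to watch, since a missed phishing URL tends to cost more than a false alarm.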
Follow these instructions to get a copy of the project up and running on your local machine.
- Python 3.7+
- Jupyter Notebook / VSCode
- Basic knowledge of machine learning
- Clone the repository:
    git clone https://github.com/Gauri9977/Phishing-URL-Detection-using-Machine-Learning.git
    cd Phishing-URL-Detection-using-Machine-Learning
- (Optional) Create and activate a virtual environment:
  - On Windows:
      python -m venv env
      env\Scripts\activate
  - On Linux/Mac:
      python3 -m venv env
      source env/bin/activate
- Install the required dependencies.
- Launch Jupyter Notebook:
    jupyter notebook
- Open Phishing_URL_Detection.ipynb and run all the cells sequentially.
- Explore model performance and tune hyperparameters as needed.