Project repository for DA 204o Data Science in Practice (Aug semester 2024) @ IISc BLR
Objective: Build a phishing URL detection system.
Source: PhiUSIIL Phishing URL (Website)
Summary: The PhiUSIIL Phishing URL Dataset comprises 134,850 legitimate and 100,945 phishing URLs. Most of the URLs analyzed while constructing the dataset are recent. Features are extracted from the webpage source code and from the URL itself. Some features, such as CharContinuationRate, URLTitleMatchScore, URLCharProb, and TLDLegitimateProb, are derived from existing features.
Additional Info:
- Column "FILENAME" can be ignored.
- Label 1 corresponds to a legitimate URL, label 0 to a phishing URL.
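
The two notes above can be applied while reading the data. The following is a minimal sketch, not the project's actual loading code: it assumes the target column is named `label`, and the helper `load_rows` and the three-column sample are made up for illustration. It uses only the standard library so it runs without extra dependencies.

```python
import csv
import io

LABEL_COLUMN = "label"           # assumption: name of the target column
IGNORED_COLUMNS = {"FILENAME"}   # per the note above, this column can be ignored
LABEL_NAMES = {1: "legitimate", 0: "phishing"}


def load_rows(handle):
    """Yield (features, label) pairs from an open CSV file handle."""
    for row in csv.DictReader(handle):
        label = int(row.pop(LABEL_COLUMN))
        features = {k: v for k, v in row.items() if k not in IGNORED_COLUMNS}
        yield features, label


# Made-up sample standing in for the real CSV file (hypothetical values).
sample = io.StringIO(
    "FILENAME,URL,URLLength,label\n"
    "a.txt,https://www.example.com,23,1\n"
    "b.txt,http://example.test/login,25,0\n"
)

rows = list(load_rows(sample))
for features, label in rows:
    print(features["URL"], "->", LABEL_NAMES[label])
```

To run this against the real dataset, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded CSV.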
Team members:
- Deepansh Sood
- Shambo Samanta
- Sudipta Ghosh
- Sourajit Bhar