This project is a machine learning–based Network Security application. It can train a model and predict whether network data is safe or malicious using a web API built with FastAPI.
Below is the complete workflow explained in simple language.
When the project starts:
- Environment variables are loaded from a `.env` file (for the MongoDB URL).
- A secure connection to MongoDB is created.
- A FastAPI web server is started.
The API is accessible through a browser.
- The project connects to MongoDB using a URL stored in an environment variable.
- MongoDB is used to store raw and processed network data during training.
- Database and collection names are taken from constant files to keep things organized.
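The configuration lookup described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the environment variable name `MONGO_DB_URL`, the two constant values, and the `MongoSettings` / `load_mongo_settings` helpers are all hypothetical, since the source does not name them.

```python
import os
from dataclasses import dataclass

# Hypothetical names: the README does not say what the environment variable
# or the constants are called, so these are placeholders.
MONGO_URL_ENV_VAR = "MONGO_DB_URL"
DATABASE_NAME = "network_security"
COLLECTION_NAME = "network_data"


@dataclass(frozen=True)
class MongoSettings:
    url: str
    database: str
    collection: str


def load_mongo_settings() -> MongoSettings:
    """Read the MongoDB URL from the environment and pair it with the constants."""
    url = os.getenv(MONGO_URL_ENV_VAR)
    if not url:
        # Fail early with a clear message rather than at the first query.
        raise RuntimeError(f"{MONGO_URL_ENV_VAR} is not set")
    return MongoSettings(url=url, database=DATABASE_NAME, collection=COLLECTION_NAME)
```

With `pymongo` installed, `pymongo.MongoClient(settings.url)[settings.database][settings.collection]` would then give a handle to the collection.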
- Redirects the user to FastAPI documentation (`/docs`).
- Useful for testing APIs easily.
Purpose: Train the machine learning model.
What happens internally:
- `TrainingPipeline` object is created.
- The training pipeline runs step-by-step:
  - Data ingestion (fetching data)
  - Data validation
  - Data transformation
  - Model training
  - Model evaluation
  - Model saving
- Final trained model and preprocessor are stored in the `final_model/` folder.
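The step-by-step ordering above can be illustrated with a small runner that passes each step's output to the next. This is only a sketch of the sequencing idea, not the project's `TrainingPipeline`: `run_pipeline`, `make_stub`, and the step names are hypothetical, and the stub steps do no real work.

```python
from typing import Any, Callable, List, Tuple

Step = Callable[[Any], Any]


def run_pipeline(steps: List[Tuple[str, Step]], initial: Any = None) -> Any:
    """Run each step in order, feeding one step's output into the next."""
    artifact = initial
    for name, step in steps:
        print(f"running: {name}")
        artifact = step(artifact)
    return artifact


def make_stub(name: str) -> Step:
    """Build a placeholder step that only records its own name."""
    def stub(previous: Any) -> list:
        return (previous or []) + [name]
    return stub


# Hypothetical step names mirroring the stages listed above.
STEP_NAMES = [
    "data_ingestion", "data_validation", "data_transformation",
    "model_training", "model_evaluation", "model_saving",
]
steps = [(name, make_stub(name)) for name in STEP_NAMES]
completed = run_pipeline(steps)
```

Because each step receives the previous step's result, a failure in validation stops the run before any model is trained or saved.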
Output:
- A success message once training is completed.
Purpose: Predict results for new network data.
Input:
- A CSV file uploaded by the user.
Workflow:
- Uploaded CSV file is read using Pandas.
- Saved preprocessor is loaded from `preprocessor.pkl`.
- Saved trained model is loaded from `model.pkl`.
- Both are combined using `NetworkModel`.
- Predictions are made on the uploaded data.
- A new column (`predicted_column`) is added to the data.
- Output is:
  - Saved as a CSV file (`prediction_output/output.csv`)
  - Displayed as an HTML table in the browser
Output:
- Prediction results shown in a table format.
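The read, predict, and export steps of this workflow could be sketched with Pandas as below. Everything except `predicted_column` is an assumption: `StubModel`, `predict_csv`, and the feature names are hypothetical, the CSV is held in memory instead of being uploaded, and the preprocessing step is left out to keep the example short.

```python
import io

import pandas as pd


class StubModel:
    """Hypothetical stand-in for the object loaded from model.pkl."""

    def predict(self, frame: pd.DataFrame) -> list:
        # Toy rule for illustration only: flag rows whose first column is negative.
        return [1 if value < 0 else 0 for value in frame.iloc[:, 0]]


def predict_csv(csv_source, model) -> pd.DataFrame:
    """Read CSV data, run the model, and attach the predictions as a new column."""
    frame = pd.read_csv(csv_source)
    # Predict before adding the column so it is never fed back in as a feature.
    predictions = model.predict(frame)
    frame["predicted_column"] = predictions
    return frame


uploaded = io.StringIO("feature_a,feature_b\n1,0\n-1,1\n")
result = predict_csv(uploaded, StubModel())
csv_text = result.to_csv(index=False)     # text that would be written to the output CSV
html_table = result.to_html(index=False)  # <table> markup a template could embed
```

Passing a file path to `to_csv` instead of omitting it writes the file directly, which is how a result such as `prediction_output/output.csv` would be produced.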
- `preprocessor.pkl` → Handles data scaling/encoding
- `model.pkl` → Trained ML model
- `NetworkModel` → Combines preprocessing + prediction into one step
This makes prediction easy and clean.
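A wrapper of this kind might look like the sketch below. It assumes the preprocessor exposes `transform` and the model exposes `predict`, in the scikit-learn style; the project's real `NetworkModel` may differ, and the two stub classes are hypothetical stand-ins for the pickled objects.

```python
class NetworkModel:
    """Sketch of a wrapper that applies preprocessing and then predicts."""

    def __init__(self, preprocessor, model):
        self.preprocessor = preprocessor
        self.model = model

    def predict(self, rows):
        # Transform first so the model sees data in the shape it was trained on.
        transformed = self.preprocessor.transform(rows)
        return self.model.predict(transformed)


class HalvingPreprocessor:
    """Hypothetical preprocessor: scales every value by one half."""

    def transform(self, rows):
        return [[value / 2 for value in row] for row in rows]


class ThresholdModel:
    """Hypothetical model: labels a row 1 when its values sum to more than 1."""

    def predict(self, rows):
        return [1 if sum(row) > 1 else 0 for row in rows]


network_model = NetworkModel(HalvingPreprocessor(), ThresholdModel())
labels = network_model.predict([[4, 2], [1, 0]])
```

Callers only ever touch `network_model.predict`, so the same transformation used in training cannot be skipped by accident at prediction time.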
- Custom exception class `NetworkSecurityException` is used.
- Errors are logged properly for debugging.
- If prediction fails, error details are returned clearly.
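One common way to build such an exception is to capture the file name and line number of the original failure. The sketch below assumes a constructor that takes only the original error; the project's `NetworkSecurityException` may have a different signature, and `risky_divide` is a hypothetical function used purely to trigger an error.

```python
import sys


class NetworkSecurityException(Exception):
    """Sketch: wraps an error and records where it was raised."""

    def __init__(self, error: Exception):
        super().__init__(str(error))
        # exc_info() describes the exception currently being handled, so this
        # class is meant to be constructed inside an `except` block.
        _, _, tb = sys.exc_info()
        self.file_name = tb.tb_frame.f_code.co_filename if tb else "<unknown>"
        self.line_number = tb.tb_lineno if tb else -1

    def __str__(self) -> str:
        return f"Error in {self.file_name} at line {self.line_number}: {self.args[0]}"


def risky_divide(a, b):
    try:
        return a / b
    except Exception as exc:
        # Re-raise as the custom type while keeping the original as the cause.
        raise NetworkSecurityException(exc) from exc
```

Catching `NetworkSecurityException` at the API layer then gives a single message containing the location and the original error text, which can be both logged and returned to the caller.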
- Predictions are shown using Jinja2 templates.
- Data is displayed as a clean table in the browser.
User → Upload CSV / Trigger Training
→ FastAPI
→ ML Pipeline / Model
→ Prediction
→ CSV + Table Output
- Python
- FastAPI
- MongoDB
- Machine Learning
- Pandas