Interactive Machine Learning Model Explainer -- train, evaluate, and understand scikit-learn models through feature importance, confusion matrices, and side-by-side comparison.
MLExplain is an educational and practical tool for exploring how different machine-learning algorithms perform on classic datasets. Train models with a click, inspect feature importance bar charts, study confusion matrices, and compare algorithms side by side -- all from a clean, browser-based interface.
| Feature | Description |
|---|---|
| Built-in Datasets | Iris, Wine, Breast Cancer, Digits from scikit-learn |
| Five Algorithms | Decision Tree, Random Forest, SVM, KNN, Logistic Regression |
| Hyperparameters | Configurable max depth, n_estimators, C, k, solver, etc. |
| Train / Test Split | Adjustable ratio (default 80/20) with reproducible seed |
| Metrics Dashboard | Accuracy, Precision, Recall, F1-Score per experiment |
| Feature Importance | Tree-based & permutation importance bar charts |
| Confusion Matrix | Interactive heatmap with per-class counts |
| Model Comparison | Train multiple models, compare metrics side by side |
| Experiment History | Every run saved to SQLite with full metadata |
| Prediction API | POST features, receive prediction + confidence scores |
| REST API | Full CRUD for experiments, datasets, and predictions |
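As an illustration of the train/test split and metrics rows above, here is a minimal sketch using scikit-learn directly. It is not taken from ml_engine.py: the function name `train_and_score`, the seed value 42, the `max_depth=3` setting, and the choice of macro averaging are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_and_score(test_size=0.2, seed=42):
    """Train a decision tree on Iris and return the four headline metrics.

    Hypothetical helper: the name, seed, and averaging mode are assumptions.
    """
    X, y = load_iris(return_X_y=True)
    # A fixed random_state makes the 80/20 split reproducible;
    # stratify keeps the class proportions equal in both halves.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    model = DecisionTreeClassifier(max_depth=3, random_state=seed)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Macro averaging weights every class equally (an assumption here).
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1),
    }


metrics = train_and_score()
print(metrics)
```

Calling the function twice with the same seed should return identical numbers, which is the point of the reproducible seed.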
mlexplain/
+-- app.py # Flask entry point & factory
+-- config.py # App configuration
+-- requirements.txt # Pinned dependencies
+-- models/
| +-- __init__.py
| +-- database.py # SQLAlchemy setup
| +-- schemas.py # Experiment, Dataset, ModelResult
+-- routes/
| +-- __init__.py
| +-- api.py # REST API endpoints
| +-- views.py # HTML page routes
+-- services/
| +-- __init__.py
| +-- ml_engine.py # Training, prediction, explanation
| +-- datasets.py # Built-in dataset loading
+-- templates/
| +-- base.html # Layout with navigation
| +-- index.html # Dashboard
| +-- train.html # Train a model
| +-- explain.html # Model explanations
| +-- compare.html # Compare models
| +-- about.html # About page
+-- static/
| +-- css/style.css # Scientific theme
| +-- js/main.js # Chart.js visualisations
+-- tests/
| +-- conftest.py
| +-- test_api.py
| +-- test_models.py
| +-- test_services.py
+-- seed_data/data.json
- Python 3.11 or later
- pip
git clone https://github.com/your-org/mlexplain.git
cd mlexplain
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py
The application will be available at http://localhost:8006.
docker compose up --build

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/datasets | List available datasets |
| GET | /api/datasets/<name> | Get dataset info & preview |

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/train | Train a model |
| GET | /api/experiments | List all experiments |
| GET | /api/experiments/<id> | Get experiment details |
| DELETE | /api/experiments/<id> | Delete an experiment |

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/experiments/<id>/importance | Feature importance |
| GET | /api/experiments/<id>/confusion | Confusion matrix |
| GET | /api/experiments/<id>/metrics | Detailed metrics |
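To show the kind of data a confusion-matrix heatmap needs, the sketch below computes per-class counts with scikit-learn. The function name `confusion_payload` and the `labels`/`matrix` keys are hypothetical; the actual response shape of the confusion endpoint is not documented here.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


def confusion_payload(seed=0):
    """Return class labels and a nested-list confusion matrix.

    Hypothetical helper: the dictionary keys are assumptions for this example.
    """
    data = load_wine()
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=seed, stratify=data.target
    )
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    # Rows are true classes, columns are predicted classes.
    matrix = confusion_matrix(y_test, model.predict(X_test), labels=model.classes_)
    return {
        "labels": [str(name) for name in data.target_names],
        "matrix": matrix.tolist(),  # plain lists serialise cleanly to JSON
        "n_test": int(len(y_test)),
    }


payload = confusion_payload()
print(payload["labels"])
for row in payload["matrix"]:
    print(row)
```

The diagonal holds correct predictions; every off-diagonal cell counts one kind of misclassification.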

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/predict/<id> | Predict with a trained model |
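A prediction with confidence scores can be derived from a classifier's `predict_proba` output. The following is a sketch under assumptions: the helper name `predict_with_confidence` and the returned keys are invented for illustration, and it assumes `class_names` is ordered the same way as `model.classes_`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def predict_with_confidence(model, features, class_names):
    """Return the predicted class name and a per-class probability map.

    Hypothetical helper; assumes class_names follows the order of model.classes_.
    """
    probabilities = model.predict_proba([features])[0]
    best = int(probabilities.argmax())
    return {
        "prediction": class_names[best],
        "confidence": {
            name: float(p) for name, p in zip(class_names, probabilities)
        },
    }


data = load_iris()
model = LogisticRegression(max_iter=1000).fit(data.data, data.target)
class_names = [str(name) for name in data.target_names]

# First Iris sample: sepal length, sepal width, petal length, petal width (cm).
result = predict_with_confidence(model, [5.1, 3.5, 1.4, 0.2], class_names)
print(result)
```

Note that scikit-learn's SVC only exposes `predict_proba` when constructed with `probability=True`.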

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/compare | Compare multiple models |
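A side-by-side comparison amounts to training several models on the same split and tabulating their scores. The sketch below does this locally with scikit-learn; `compare_models`, the row keys, and the choice of accuracy and macro F1 are assumptions, not the behaviour of the compare endpoint.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(models, test_size=0.2, seed=0):
    """Fit each model on one shared split and return rows sorted by accuracy.

    Hypothetical helper: names and metric choices are assumptions.
    """
    X, y = load_iris(return_X_y=True)
    # Every model sees the same train/test split, so scores are comparable.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    rows = []
    for name, model in models.items():
        y_pred = model.fit(X_train, y_train).predict(X_test)
        rows.append({
            "model": name,
            "accuracy": float(accuracy_score(y_test, y_pred)),
            "f1": float(f1_score(y_test, y_pred, average="macro")),
        })
    return sorted(rows, key=lambda row: row["accuracy"], reverse=True)


rows = compare_models({
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
})
for row in rows:
    print(f"{row['model']:<15} accuracy={row['accuracy']:.3f} f1={row['f1']:.3f}")
```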

| Variable | Default | Description |
|---|---|---|
| PORT | 8006 | Server port |
| FLASK_DEBUG | 0 | Enable debug mode |
| DATABASE_URL | sqlite:///instance/mlexplain.db | Database URI |
| SECRET_KEY | (dev key) | Flask secret key |
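One way to read these variables with their defaults is sketched below. It is not the project's config.py: the `Config` class layout and the attribute names are assumptions, and the `SECRET_KEY` fallback is a labelled placeholder because the real dev key is not given here.

```python
import os


class Config:
    """Hypothetical settings object built from environment variables."""

    def __init__(self, env=None):
        # Accept an explicit mapping so the class can be exercised without
        # touching the real process environment.
        env = os.environ if env is None else env
        self.PORT = int(env.get("PORT", "8006"))
        self.DEBUG = env.get("FLASK_DEBUG", "0") == "1"
        # SQLALCHEMY_DATABASE_URI is the key Flask-SQLAlchemy reads.
        self.SQLALCHEMY_DATABASE_URI = env.get(
            "DATABASE_URL", "sqlite:///instance/mlexplain.db"
        )
        # Placeholder only: replace with a real secret outside development.
        self.SECRET_KEY = env.get("SECRET_KEY", "dev-only-placeholder")


defaults = Config(env={})
overridden = Config(env={"PORT": "9000", "FLASK_DEBUG": "1"})
print(defaults.PORT, defaults.DEBUG, defaults.SQLALCHEMY_DATABASE_URI)
print(overridden.PORT, overridden.DEBUG)
```

Setting `PORT=9000 FLASK_DEBUG=1` in the shell before starting the app would have the same effect as the `overridden` instance above.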
pytest -v --cov=. --cov-report=term-missing

Decision Tree: Interpretable tree-based classifier. Configurable max_depth and min_samples_split. Provides feature importance directly, measured as the reduction in Gini impurity.

Random Forest: Ensemble of decision trees trained with bagging. Configurable n_estimators and max_depth. Feature importance is averaged across all trees.

SVM: Kernel-based classifier (RBF, linear, poly). Configurable regularisation parameter C and kernel type.

KNN: Instance-based learning. Configurable n_neighbors and distance metric (euclidean, manhattan).

Logistic Regression: Linear model for classification. Configurable C, solver (lbfgs, liblinear, saga), and max_iter.
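The hyperparameters named above map directly onto scikit-learn constructor arguments. The sketch below shows one possible lookup from an algorithm key to its estimator class; the `ESTIMATORS` registry, its string keys, and the `build_model` function are hypothetical and not taken from ml_engine.py.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical registry: the string keys are assumptions for this example.
ESTIMATORS = {
    "decision_tree": DecisionTreeClassifier,
    "random_forest": RandomForestClassifier,
    "svm": SVC,
    "knn": KNeighborsClassifier,
    "logistic_regression": LogisticRegression,
}


def build_model(name, **params):
    """Instantiate the estimator registered under `name` with `params`."""
    try:
        estimator_cls = ESTIMATORS[name]
    except KeyError:
        raise ValueError(f"Unknown algorithm: {name!r}") from None
    # scikit-learn raises TypeError here for unsupported keyword arguments.
    return estimator_cls(**params)


tree = build_model("decision_tree", max_depth=4, min_samples_split=5)
forest = build_model("random_forest", n_estimators=200, max_depth=8)
svm = build_model("svm", C=0.5, kernel="linear")
knn = build_model("knn", n_neighbors=7, metric="manhattan")
logreg = build_model("logistic_regression", C=1.0, solver="liblinear", max_iter=500)
print(svm.get_params()["kernel"], knn.get_params()["n_neighbors"])
```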
- Tree-based importance -- uses the feature_importances_ attribute from Decision Tree and Random Forest (Gini impurity reduction).
- Permutation importance -- available for all models. Measures the accuracy drop when each feature is randomly shuffled.
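Both importance methods are available in scikit-learn, and the sketch below computes them for the same model. The dataset, split, `n_repeats=10`, and variable names are assumptions chosen for the example rather than the project's settings.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0, stratify=data.target
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Tree-based importance: impurity reduction, normalised to sum to 1.
tree_importance = {
    name: float(value)
    for name, value in zip(data.feature_names, model.feature_importances_)
}

# Permutation importance: mean drop in held-out accuracy when one
# feature column is shuffled, averaged over n_repeats shuffles.
result = permutation_importance(
    model, X_test, y_test, scoring="accuracy", n_repeats=10, random_state=0
)
perm_importance = {
    name: float(value)
    for name, value in zip(data.feature_names, result.importances_mean)
}

for name in data.feature_names:
    print(f"{name:<20} tree={tree_importance[name]:.3f} perm={perm_importance[name]:.3f}")
```

Tree-based importance is computed from the training data, whereas permutation importance here is measured on the held-out test set, so the two rankings can differ.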
| Layer | Technology |
|---|---|
| Backend | Python 3.11, Flask 3.0 |
| ML Engine | scikit-learn 1.3, NumPy 1.26 |
| Database | SQLite via SQLAlchemy 2.0 |
| Frontend | Jinja2 templates, Chart.js 4 |
| Testing | pytest 7.4 with coverage |
| Deployment | Docker, Gunicorn |
This project is licensed under the MIT License -- see LICENSE.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- scikit-learn for ML algorithms and datasets
- Chart.js for interactive visualisations
- Flask for the web framework