A complete MLOps pipeline demonstrating automated machine learning model training, evaluation, and reporting using TensorFlow, GitHub Actions, and Continuous Machine Learning (CML). This project showcases best practices for ML automation, model performance tracking, and reproducible machine learning workflows.
This repository implements an end-to-end machine learning pipeline that:
- Automatically trains a TensorFlow neural network on synthetic linear data
- Evaluates model performance using comprehensive metrics
- Generates visual reports with training results and predictions
- Creates automated reports via CML comments on GitHub PRs
- Ensures reproducibility through version-controlled ML workflows
- Automated CI/CD Pipeline: Triggered on every push/PR
- Performance Tracking: MAE, MSE, R² score monitoring
- Visual Analytics: Automated plot generation and publishing
- Production Ready: Near-perfect model performance (R² ≈ 1.0)
- Comprehensive Reporting: Detailed model configuration and metrics
├── .github/workflows/
│   └── cml.yml              # GitHub Actions CI/CD workflow
├── model.py                 # Main ML training script
├── requirements.txt         # Python dependencies
├── README.md                # Project documentation
├── metrics.txt              # Generated model performance metrics
└── model_results.png        # Generated visualization plot
- Python 3.8+
- GitHub repository with Actions enabled
- Basic understanding of TensorFlow and MLOps
- Clone the repository:
  git clone https://github.com/dev-opsss/MLOps-CI.git
  cd MLOps-CI
- Install dependencies:
  pip install -r requirements.txt
- Run locally (optional):
  python model.py
- Enable GitHub Actions:
  - Push to your repository to trigger the automated pipeline
  - Check the Actions tab for workflow execution
  - View CML reports in PR comments
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=(1,))  # Single layer for linear regression
])
- Framework: TensorFlow 2.20+
- Model Type: Sequential Neural Network
- Architecture: Single Dense Layer (Linear Regression)
- Optimizer: Adam (learning_rate=0.1)
- Loss Function: Mean Squared Error
- Training Epochs: 200
- Data Normalization: StandardScaler applied
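To make the configuration above concrete, here is a minimal sketch of what a single Dense(1) unit learns: a line y_hat = w * x + b fitted by minimizing mean squared error. It uses only the standard library, and plain full-batch gradient descent stands in for Adam, so it illustrates the idea rather than reproducing model.py. The function name fit_linear_unit and the choice to standardize only the feature (not the target) are assumptions made for this example.

```python
# Minimal sketch: one linear unit (y_hat = w * x + b) fitted by full-batch
# gradient descent on MSE. Plain gradient descent is used here instead of
# Adam, and this is not the code from model.py.

def fit_linear_unit(xs, ys, learning_rate=0.1, epochs=200):
    """Return (w, b) that approximately minimize MSE on (xs, ys)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic data described in this README: y = x + 10, x = -100, -96, ..., 96.
raw_x = list(range(-100, 100, 4))
raw_y = [x + 10 for x in raw_x]

# Standardize the feature (assumption: the target is left unscaled).
mean_x = sum(raw_x) / len(raw_x)
std_x = (sum((x - mean_x) ** 2 for x in raw_x) / len(raw_x)) ** 0.5
scaled_x = [(x - mean_x) / std_x for x in raw_x]

w, b = fit_linear_unit(scaled_x, raw_y)

# Map the learned parameters back to the original feature scale.
slope = w / std_x
intercept = b - w * mean_x / std_x
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

With a standardized feature, each gradient step shrinks the parameter error by a constant factor, which is why 200 epochs at a learning rate of 0.1 are more than enough for this problem.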
- Relationship: y = x + 10
- Total Samples: 50
- Feature Range: X ∈ [-100, 96] (step=4)
- Target Range: y ∈ [-90, 106] (step=4)
- Train/Test Split: 70/30 (shuffled)
- Validation Split: 20% of training data
- Random shuffling to prevent extrapolation issues
- Feature standardization for stable training
- Proper tensor reshaping for TensorFlow compatibility
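The dataset and preprocessing steps listed above can be sketched in a few lines. This version uses only the standard library so it runs without extra dependencies; model.py may well use NumPy and a StandardScaler instead. The helper names make_dataset, shuffled_split, and fit_scaler are hypothetical, and fitting the scaler on the training features only is an assumption, not something this README states.

```python
import random

RANDOM_SEED = 42      # value taken from the configuration in this README
TRAIN_FRACTION = 0.7  # 70/30 train/test split


def make_dataset():
    """Return (x, y) pairs for y = x + 10 with x = -100, -96, ..., 96."""
    return [(float(x), float(x + 10)) for x in range(-100, 100, 4)]


def shuffled_split(pairs, train_fraction, seed):
    """Shuffle a copy of the pairs, then cut it into train and test parts."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(values):
    """Return the mean and population standard deviation of the values."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return mean, std


pairs = make_dataset()
train, test = shuffled_split(pairs, TRAIN_FRACTION, RANDOM_SEED)

# Assumption: the scaler is fitted on training features and reused for test.
mean, std = fit_scaler([x for x, _ in train])

# Each sample becomes a one-element row, mirroring the (n_samples, 1) shape
# that a Dense layer with input_shape=(1,) expects.
x_train = [[(x - mean) / std] for x, _ in train]
x_test = [[(x - mean) / std] for x, _ in test]

print(len(pairs), len(train), len(test))
```

Shuffling before the split matters here because the raw data is sorted: an unshuffled 70/30 cut would put every test point outside the range seen during training.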
Mean Absolute Error = 0.000709
Mean Squared Error = 0.000001
R² Score = 1.000000
Final Training Loss = 4.62e-10
Final Validation Loss = 1.89e-10
- Near-perfect accuracy (MAE < 0.001)
- Near-perfect fit (R² = 1.0 to six decimal places)
- No sign of overfitting (validation loss ≈ training loss)
- Production-ready performance levels
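For reference, the three reported metrics can be computed as follows. This is a standard-library sketch of the usual definitions, not the evaluation code from model.py; the function name regression_metrics and the sample values are hypothetical and chosen only for illustration.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 = 1 - SS_res / SS_tot; undefined when all targets are identical.
    r2 = 1.0 - ss_res / ss_tot
    return mae, mse, r2


# Hypothetical targets and predictions, not results from this project.
y_true = [-90.0, 10.0, 106.0]
y_pred = [-89.5, 10.0, 105.5]
mae, mse, r2 = regression_metrics(y_true, y_pred)
print(f"MAE={mae:.6f} MSE={mse:.6f} R2={r2:.6f}")
```

Because the targets span roughly 200 units, even noticeable absolute errors leave R² very close to 1, so MAE and MSE are the more informative numbers for this dataset.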
The automated pipeline (.github/workflows/cml.yml) performs:
- Environment Setup
  - Ubuntu latest runner
  - Python dependencies installation
  - CML tools configuration
- Model Training
  - Execute model.py script
  - Generate performance metrics
  - Create visualization plots
- Report Generation
  - Publish model results visualization
  - Create comprehensive performance report
  - Post automated comments on PRs
- Push events: Any commit to main branch
- Pull requests: Automatic model evaluation on PRs
- Manual dispatch: On-demand workflow execution
The pipeline generates publication-ready visualizations showing:
- Training data points (blue scatter)
- Test data points (green scatter)
- Model predictions (red scatter)
- True relationship line (black dashed)
- Performance metrics overlay
Automated GitHub comments include:
- Model Performance Metrics
- Training Result Visualizations
- Model Configuration Details
- Training Process Summary
- Results Analysis & Status
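As an illustration of how such a comment body might be assembled, the sketch below turns metrics.txt-style lines into a small markdown report. It is not the report step from cml.yml, which may build the file differently (for example with shell commands); the functions parse_metrics and build_report and the report layout are hypothetical.

```python
def parse_metrics(text):
    """Parse 'Name = value' lines into an ordered list of (name, value)."""
    rows = []
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        rows.append((name.strip(), value.strip()))
    return rows


def build_report(metrics_text, image_path):
    """Return markdown with a metrics table and a link to the plot image."""
    lines = [
        "## Model Performance Metrics",
        "",
        "| Metric | Value |",
        "| --- | --- |",
    ]
    for name, value in parse_metrics(metrics_text):
        lines.append(f"| {name} | {value} |")
    lines += [
        "",
        "## Training Result Visualizations",
        "",
        f"![model results]({image_path})",
    ]
    return "\n".join(lines) + "\n"


# Sample values copied from the results reported in this README.
sample = "Mean Absolute Error = 0.000709\nMean Squared Error = 0.000001\n"
print(build_report(sample, "model_results.png"))
```

Keeping the report as a pure function of metrics.txt and the image path makes it easy to preview locally before the workflow posts it.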
tensorflow>=2.20.0
numpy>=2.3.0
matplotlib>=3.10.0
# Training Configuration
EPOCHS = 200
LEARNING_RATE = 0.1
BATCH_SIZE = 35 # Full batch training
VALIDATION_SPLIT = 0.2
TRAIN_TEST_SPLIT = 0.7
# Data Configuration
RANDOM_SEED = 42
FEATURE_RANGE = (-100, 96)
STEP_SIZE = 4
# Install dependencies
pip install -r requirements.txt
# Execute training script
python model.py
# View generated files
ls -la *.png *.txt
To experiment with different architectures:
# Example: Multi-layer network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])
To use your own data:
# Replace synthetic data generation
X = your_features.reshape(-1, 1)
y = your_targets.reshape(-1, 1)
1. GitHub Actions Permission Errors
# Add to workflow permissions
permissions:
  contents: read
  pull-requests: write
  issues: write
2. TensorFlow Version Compatibility
# Ensure compatible versions (quotes keep the shell from treating >= as a redirect)
pip install "tensorflow>=2.20.0"
3. CML Report Generation Issues
# Use heredoc syntax for complex reports
cat << 'EOF' >> report.md
# Your markdown content
EOF
- Perfect Model Performance: R² = 1.000000
- Automated MLOps Pipeline: End-to-end automation
- Comprehensive Testing: Training & validation monitoring
- Production Readiness: MAE below 0.001
- Reproducible Workflows: Version-controlled ML pipeline
- Data normalization for training stability
- Proper train/test splitting with shuffling
- Comprehensive metrics tracking (MAE, MSE, R²)
- Automated visualization generation
- CI/CD integration with GitHub Actions
- Version control for ML experiments
- Fork the repository
- Create a feature branch (git checkout -b feature/improvement)
- Commit your changes (git commit -am 'Add improvement')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- TensorFlow Team for the excellent ML framework
- Iterative.ai for CML (Continuous Machine Learning)
- GitHub for Actions CI/CD platform
- Open Source Community for inspiration and best practices
Last Updated: Automatically updated by CML workflow
For the most recent model performance and visualizations, check the latest GitHub Actions run or PR comments.
This project demonstrates production-ready MLOps practices with automated model training, evaluation, and reporting. Perfect for learning CI/CD for machine learning workflows!