🚀 Nitro Project - Industrial Predictive Monitoring System

📋 Table of Contents

  1. Features
  2. System Architecture
  3. Services and Ports
  4. Quick Start
  5. Project Structure
  6. Analysis Notebooks
  7. Data Flow
  8. Implemented Technologies
  9. Maintenance
  10. Troubleshooting

✨ Features

  • 📊 Real-time data ingestion with Kafka Producer
  • ⚡ Distributed processing with Spark Streaming
  • 🔧 Intelligent orchestration with Apache Airflow
  • 💾 Scalable storage in PostgreSQL and MinIO
  • 🤖 Advanced MLOps with MLflow and SHAP
  • 📈 Professional visualization with Streamlit and Grafana
  • 📓 Comprehensive analysis with specialized notebooks
  • 🔍 Real-time monitoring with interactive dashboards

🏗️ System Architecture

```mermaid
graph LR
    A[Kafka Producer] --> B[Kafka Cluster]
    B --> C[Spark Processor]
    C --> D[(PostgreSQL)]
    C --> E[MinIO Storage]
    D --> F[FastAPI ML Service]
    E --> F
    F --> G[Streamlit Dashboard]
    F --> H[Grafana Monitoring]
    D --> H
```

🌐 Services and Ports

| Service | URL | Port | Credentials | Status |
|---------|-----|------|-------------|--------|
| 🔧 Airflow | http://localhost:8080 | 8080 | admin/admin | ✅ Operational |
| Spark Master | http://localhost:8081 | 8081 | - | ✅ Operational |
| 📊 Streamlit Dashboard | http://localhost:8501 | 8501 | - | ✅ Operational |
| 📈 Grafana | http://localhost:3000 | 3000 | admin/admin123 | ✅ Operational |
| 💾 MinIO Console | http://localhost:9001 | 9001 | admin/admin12345 | ✅ Operational |
| 🗄️ PostgreSQL | localhost:5432 | 5432 | nitro_user/nitro_pass | ✅ Operational |
| 🚀 FastAPI | http://localhost:8000 | 8000 | - | ✅ Operational |
| 📡 Kafka | localhost:9092 | 9092/29092 | - | ✅ Operational |
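Once the stack is up, a quick way to verify the HTTP services in the table is to poll each URL. A minimal sketch using only the Python standard library (the URLs mirror the table above; PostgreSQL and Kafka speak their own wire protocols and are not polled here):

```python
import urllib.error
import urllib.request

# HTTP-reachable services from the table above
SERVICES = {
    "airflow": "http://localhost:8080",
    "spark-master": "http://localhost:8081",
    "streamlit": "http://localhost:8501",
    "grafana": "http://localhost:3000",
    "minio-console": "http://localhost:9001",
    "fastapi": "http://localhost:8000",
}

def check_service(url: str, timeout: float = 3.0) -> bool:
    """True if the URL answers any HTTP response at all, False otherwise."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # the server answered, even if with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or DNS failure

# Example: once the stack is up, check_service(SERVICES["grafana"]) returns True
```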

🚀 Quick Start

```bash
# Clone the repository
git clone <your-repository>
cd proyecto-nitro

# Start all services
docker-compose up -d

# Check service status
docker-compose ps

# Start the data producer
./start-producer.sh

# Access dashboards (wait 2-3 minutes for complete initialization)
echo "Access URLs:"
echo "Airflow:   http://localhost:8080"
echo "Grafana:   http://localhost:3000"
echo "Streamlit: http://localhost:8501"
```

📁 Project Structure

```
proyecto-nitro/
├── 📊 airflow/                 # Airflow DAGs and configuration
├── 🚀 api-dashboard/           # FastAPI and Streamlit
│   ├── fastapi/               # ML prediction API
│   └── dashboards/            # Interactive dashboards
├── 📡 kafka-producer/          # Kafka data producer
│   ├── Dockerfile
│   ├── kafka_producer.py
│   └── requirements.txt
├── 🗄️ minio-setup/             # MinIO bucket configuration
├── 📓 notebooks/              # Analysis and modeling
│   ├── 📊 EDA.ipynb
│   ├── ⚙️ feature_engineering.ipynb
│   ├── 🤖 model_training.ipynb
│   ├── 🔍 mlflow_tracking.ipynb
│   ├── 📈 SHAP_analysis.ipynb
│   ├── 📋 reports/
│   ├── 🧠 models/
│   └── 💾 data/
│       └── enhanced_predictions.csv
├── 🗃️ postgres-setup/          # PostgreSQL schemas and config
├── ⚡ python-processor/        # Spark data processor
├── 🔥 spark-processing/       # Spark jobs
├── 🐳 docker-compose.yml      # Container orchestration
├── 🚀 start-producer.sh       # Startup script
└── 📖 README.md              # This file
```

📓 Analysis Notebooks

| Notebook | Description | Technologies |
|----------|-------------|--------------|
| 📊 EDA.ipynb | Exploratory data analysis | Pandas, Matplotlib, Seaborn |
| ⚙️ feature_engineering.ipynb | Feature engineering | Scikit-learn, Featuretools |
| 🤖 model_training.ipynb | Predictive model training | Scikit-learn, XGBoost, MLflow |
| 🔍 mlflow_tracking.ipynb | ML experiment tracking | MLflow, Hyperopt |
| 📈 SHAP_analysis.ipynb | Model explainability | SHAP, Matplotlib |
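The feature_engineering notebook's contents are not reproduced here, but a typical transformation for sensor streams is a rolling statistic per sensor. A minimal stdlib sketch of a rolling mean (the window size and the temperature series are illustrative, not taken from the notebook):

```python
from collections import deque

def rolling_mean(values, window=3):
    """Rolling mean over the last `window` readings (shorter at the start)."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out

# Smooth a noisy temperature series before feeding it to a model
temps = [70.0, 71.0, 75.0, 74.0, 90.0, 74.0]
print(rolling_mean(temps, window=3))
```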

🔄 Data Flow

  1. Ingestion: Kafka Producer generates simulated industrial sensor data
  2. Streaming: Kafka publishes to sensor_topic with 1 partition
  3. Processing: Spark processes data in real-time with transformations
  4. Storage: Data persisted in PostgreSQL (structured) and MinIO (raw)
  5. Analysis: Specialized notebooks for EDA and predictive modeling
  6. Visualization: Real-time dashboards with Streamlit and Grafana
  7. Prediction: FastAPI serves ML models for predictive maintenance
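Step 1 of the flow can be sketched with the kafka-python client. The field names and value ranges below are illustrative assumptions; the real schema lives in `kafka-producer/kafka_producer.py`:

```python
import json
import random
import time

def make_reading(sensor_id: str) -> dict:
    """One simulated sensor reading (illustrative schema, not the real one)."""
    return {
        "sensor_id": sensor_id,
        "temperature": round(random.uniform(60.0, 95.0), 2),
        "vibration": round(random.uniform(0.0, 1.0), 3),
        "timestamp": time.time(),
    }

def run_producer(topic: str = "sensor_topic", broker: str = "localhost:9092") -> None:
    """Publish one reading per second. Needs kafka-python and a live broker."""
    from kafka import KafkaProducer  # imported here so the sketch loads without Kafka
    producer = KafkaProducer(
        bootstrap_servers=broker,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    while True:
        producer.send(topic, make_reading("pump-01"))
        time.sleep(1)

print(json.dumps(make_reading("pump-01")))
```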

🛠️ Implemented Technologies

🏗️ Data Engineering

Apache Airflow Apache Kafka Apache Spark PostgreSQL MinIO

🤖 Machine Learning

Python Scikit-learn MLflow SHAP

📊 Visualization & Monitoring

Grafana Streamlit FastAPI

🐳 Infrastructure

Docker Docker Compose

🛠️ Maintenance

Useful Commands

```bash
# View logs for all services
docker-compose logs

# View specific service logs
docker-compose logs kafka
docker-compose logs spark-master

# Restart a specific service
docker-compose restart kafka-producer

# Check container status
docker-compose ps

# Access PostgreSQL
docker-compose exec postgres psql -U nitro_user -d nitro_db

# List Kafka topics
docker-compose exec kafka kafka-topics --list --bootstrap-server localhost:9092

# Scale Spark workers
docker-compose up -d --scale spark-worker=3
```

🐛 Troubleshooting

Common Issues and Solutions

| Issue | Solution |
|-------|----------|
| Kafka won't start | Check Zookeeper health: `docker-compose logs zookeeper` |
| Producer can't connect | Wait 30-60 seconds for Kafka to finish initializing |
| PostgreSQL connection refused | Check logs: `docker-compose logs postgres` |
| Dashboards won't load | Wait 2-3 minutes and verify all services are UP |
| Airflow webserver error | Run: `docker-compose restart airflow` |

Cleanup and Reinstallation

```bash
# Stop and remove all containers and volumes
docker-compose down -v

# Rebuild and start all services
docker-compose up -d --build

# Force container recreation
docker-compose up -d --force-recreate
```

📊 Grafana Dashboard - Recommended Configuration

To configure Grafana with your PostgreSQL data:

  1. Access http://localhost:3000
  2. Configure PostgreSQL data source:
    • Host: postgres:5432
    • Database: nitro_db
    • User: nitro_user
    • Password: nitro_pass
  3. Import dashboards for:
    • Kafka monitoring (lag, throughput)
    • Spark performance metrics
    • ML model prediction analysis
    • Real-time sensor health status
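A time-series panel can then query PostgreSQL directly. The table and column names below are assumptions for illustration; substitute whatever the Spark job actually writes. `$__timeGroup` and `$__timeFilter` are Grafana's PostgreSQL macros:

```sql
-- Average temperature per sensor, bucketed to 1-minute intervals
SELECT
  $__timeGroup(created_at, '1m') AS "time",  -- assumed timestamp column
  sensor_id,
  AVG(temperature) AS avg_temp               -- assumed metric column
FROM sensor_readings                          -- assumed table name
WHERE $__timeFilter(created_at)
GROUP BY 1, 2
ORDER BY 1;
```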

📝 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions are welcome!

```bash
# 1. Fork the project
# 2. Create your feature branch
git checkout -b feature/AmazingFeature

# 3. Commit your changes
git commit -m 'Add some AmazingFeature'

# 4. Push to the branch
git push origin feature/AmazingFeature

# 5. Open a Pull Request
```

📞 Support

If you encounter issues:

  1. Check the Troubleshooting section
  2. Verify logs with `docker-compose logs [service]`
  3. Open an issue in the repository with:
    • Detailed description of the problem
    • Commands executed
    • Relevant logs
    • Screenshots (if applicable)

If you find this project useful, please give it a star on GitHub!

About

A real-time industrial predictive monitoring platform. Ingest, process, analyze, and visualize sensor data with Kafka, Spark, Airflow, and ML.
