ForestGuardAI is a production-oriented Python project for real-time illegal forest activity detection on modest laptop hardware. It combines a lightweight audio CNN and a single YOLOv8n object detector, then fuses both streams into alert events exposed through a Flask API and dashboard.
The system targets hardware similar to:
- NVIDIA GTX 1050 with 4 GB VRAM
- 4 to 8 CPU cores
- 16 GB RAM
Detected events:
- Audio: chainsaw, gunshot, vehicle, background
- Vision: person, vehicle
Pipeline:
Microphone -> Audio model -> Sound classification
Camera -> Vision model -> Object detection
Audio + Vision -> Fusion engine -> Alert
Alert -> API -> Dashboard
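The fusion step above can be sketched as follows. The names, signature, and alert sets here are illustrative assumptions, not the actual fusion_engine.py API:

```python
# Illustrative sketch of the fusion step; names and structure are
# assumptions, not the actual fusion_engine.py interface.
from dataclasses import dataclass

ALERT_SOUNDS = {"chainsaw", "gunshot"}   # audio labels that raise alerts
ALERT_OBJECTS = {"person"}               # vision labels that raise alerts

@dataclass
class Alert:
    source: str       # "audio" or "vision"
    label: str        # e.g. "chainsaw", "person"
    confidence: float

def fuse(audio_label, audio_conf, detections):
    """Combine one audio classification with one frame's detections."""
    alerts = []
    if audio_label in ALERT_SOUNDS:
        alerts.append(Alert("audio", audio_label, audio_conf))
    for label, conf in detections:
        if label in ALERT_OBJECTS:
            alerts.append(Alert("vision", label, conf))
    return alerts
```

Background audio and non-person detections pass through silently, which matches the fallback behavior described later.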
Core runtime components:
- audio/audio_stream.py: non-blocking microphone capture at 16000 Hz.
- audio/audio_model.py: lightweight Conv2D audio classifier using MelSpectrogram and AmplitudeToDB.
- vision/camera_stream.py: low-latency camera capture with buffer size set to 1.
- vision/vision_model.py: a single YOLOv8n detector resized to 416x416 and executed every fifth frame.
- fusion/fusion_engine.py: alert logic for chainsaw, gunshot, and person detections.
- pipeline/detection_pipeline.py: threaded inference loop.
- api/alert_api.py: JSON API and dashboard host.
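The non-blocking capture pattern in audio/audio_stream.py can be sketched like this, assuming a callback-driven audio source feeding a bounded buffer; the class and method names are illustrative, not the module's real API:

```python
# Sketch of non-blocking audio capture: a capture callback appends
# samples to a bounded deque while the consumer reads fixed 2 s windows.
# Names are illustrative; the real audio_stream.py API may differ.
from collections import deque
from threading import Lock

SAMPLE_RATE = 16000                 # Hz, the pipeline's capture rate
WINDOW_SAMPLES = SAMPLE_RATE * 2    # 2 second analysis window

class AudioBuffer:
    def __init__(self):
        # Keep only the most recent window; older samples are dropped.
        self._buf = deque(maxlen=WINDOW_SAMPLES)
        self._lock = Lock()

    def on_audio(self, samples):
        """Called from the capture thread with a chunk of samples."""
        with self._lock:
            self._buf.extend(samples)

    def latest_window(self):
        """Return the newest 2 s window, or None if not enough data yet."""
        with self._lock:
            if len(self._buf) < WINDOW_SAMPLES:
                return None
            return list(self._buf)
```

The bounded deque means the classifier always sees the freshest audio and capture never blocks on a slow consumer.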
Thread model:
- Thread 1: camera capture
- Thread 2: audio capture
- Thread 3: detection pipeline
- Thread 4: Flask API and dashboard
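The four-thread layout can be wired up as below; the worker bodies are stubs and the real main.py plumbing may differ:

```python
# Sketch of the four-thread layout; worker bodies are placeholders for
# the real capture, inference, and Flask loops.
import threading

stop_event = threading.Event()

def camera_capture():
    while not stop_event.is_set():
        stop_event.wait(0.01)   # placeholder for frame grabbing

def audio_capture():
    while not stop_event.is_set():
        stop_event.wait(0.01)   # placeholder for microphone reads

def detection_pipeline():
    while not stop_event.is_set():
        stop_event.wait(0.01)   # placeholder for the inference loop

def api_server():
    while not stop_event.is_set():
        stop_event.wait(0.01)   # placeholder for the Flask app

threads = [
    threading.Thread(target=camera_capture, daemon=True),
    threading.Thread(target=audio_capture, daemon=True),
    threading.Thread(target=detection_pipeline, daemon=True),
    threading.Thread(target=api_server, daemon=True),
]
for t in threads:
    t.start()
```

A shared Event gives all four threads one clean shutdown path.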
ForestGuardAI applies the following optimizations for GTX 1050 class hardware:
- Single YOLO model instance only.
- YOLOv8n nano model only.
- Frames resized to 416x416 before inference.
- Detection on every fifth frame.
- torch.no_grad() used for inference.
- GPU used automatically when CUDA is available.
- Camera buffer size reduced to limit lag.
- Audio sampled at 16000 Hz with 2 second windows.
- Batch size 1 for runtime inference.
- Models loaded once and reused.
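The frame-skipping and model-reuse optimizations can be sketched as follows; the detector here is a stub standing in for the single YOLOv8n instance, which in the real pipeline runs inside torch.no_grad() on a 416x416 resized frame:

```python
# Sketch of every-fifth-frame detection with one reused model instance.
# `detect_fn` is a stub for the single YOLOv8n call (batch size 1).
DETECT_EVERY = 5

class FrameSkipper:
    def __init__(self, detect_fn):
        self._detect = detect_fn     # model loaded once, reused forever
        self._frame_count = 0
        self._last_result = []       # carried over between detection runs

    def process(self, frame):
        if self._frame_count % DETECT_EVERY == 0:
            self._last_result = self._detect(frame)
        self._frame_count += 1
        return self._last_result
```

Skipped frames reuse the last detections, so the pipeline still emits a result every frame while the GPU only runs one inference in five.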
Project layout:

ForestGuardAI/
├── main.py
├── config.py
├── requirements.txt
├── README.md
├── api/
├── audio/
├── dashboard/
├── dataset/
├── fusion/
├── models/
├── pipeline/
├── scripts/
├── tests/
├── utils/
└── vision/
- Create and activate a Python 3.10+ virtual environment.
- Install dependencies:
  pip install -r requirements.txt
- Prepare the audio dataset:
  python scripts/prepare_dataset.py
- Train the audio model:
  python -m audio.train_audio_model --epochs 15 --batch-size 8 --learning-rate 0.001
- Run the system:
  python main.py

The dashboard is served by the Flask API once the system is running.
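A minimal sketch of the kind of JSON endpoint alert_api.py hosts; the route name and payload shape are assumptions, not the actual interface:

```python
# Minimal sketch of a Flask alert endpoint; the /alerts route and the
# payload shape are assumptions, not the real alert_api.py interface.
from flask import Flask, jsonify

app = Flask(__name__)
alerts = []   # in the real system this list is fed by the fusion engine

@app.route("/alerts")
def get_alerts():
    # The dashboard polls this endpoint for new alert events.
    return jsonify(alerts)

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

The dashboard front end only needs to poll one JSON route, which keeps the API thread cheap.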
The preparation script downloads ESC-50 and extracts relevant classes:
- chainsaw -> dataset/audio/chainsaw
- engine -> dataset/audio/vehicle
- gun_shot -> dataset/audio/gunshot
- ambient classes such as rain, wind, birds, and insects ->
dataset/audio/background
Every exported file is converted to:
- mono
- 16000 Hz
- normalized amplitude
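The mono downmix and amplitude normalization can be sketched with NumPy; resampling to 16000 Hz is omitted here, since the real script presumably delegates that to an audio library, and the function names are illustrative:

```python
# Sketch of the mono downmix and peak normalization applied to each
# exported clip. Resampling to 16000 Hz is omitted; names are illustrative.
import numpy as np

def to_mono(samples: np.ndarray) -> np.ndarray:
    """Average the channels of a (num_samples, channels) array to mono."""
    if samples.ndim == 2:
        return samples.mean(axis=1)
    return samples

def peak_normalize(samples: np.ndarray) -> np.ndarray:
    """Scale so the loudest sample has magnitude 1.0; silence passes through."""
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples
    return samples / peak
```

Peak normalization keeps loud and quiet recordings on a comparable scale before the mel spectrogram is computed.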
- models/audio_model.pth is a placeholder until training completes.
- vision/vision_model.py will use models/yolov8n.pt when a valid local weight file exists; otherwise Ultralytics resolves yolov8n.pt automatically.
- If the audio model has not been trained yet, audio inference falls back to background to avoid false positives.
Run unit tests with:
python -m unittest discover tests
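A test in that suite might look like the following; the helper under test is defined inline as a stand-in, since the real fusion module's names are not shown here:

```python
# Illustrative unit test in the style the tests/ directory would hold.
# `should_alert` is a stand-in for fusion logic, not the real function.
import unittest

def should_alert(audio_label: str) -> bool:
    """Only threat sounds (chainsaw, gunshot) raise audio alerts."""
    return audio_label in {"chainsaw", "gunshot"}

class TestFusionRules(unittest.TestCase):
    def test_threat_sounds_alert(self):
        self.assertTrue(should_alert("chainsaw"))
        self.assertTrue(should_alert("gunshot"))

    def test_background_is_silent(self):
        self.assertFalse(should_alert("background"))
        self.assertFalse(should_alert("vehicle"))
```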