AeroGuard-CV is a high-fidelity computer vision engine designed to bridge the gap between static surveillance and autonomous robotics. By integrating Open-Vocabulary YOLO-World detection with a time-synchronized PID Control Loop, the system transforms a standard camera feed into a dynamic Virtual Gimbal Simulation.
This project is specifically engineered for fixed-camera systems that require autonomous tracking capabilities. Instead of physical movement, the system calculates the necessary vector adjustments to keep a target centered.
- Simulated Kinematics: Even without physical motors, the system computes real-time Pan, Tilt, and Zoom (Z) coordinates as if it were a mobile robotic unit.
- Coordinate Mapping: It utilizes a Dynamic Virtual Canvas (via global padding) to allow the "virtual lens" to move beyond the physical boundaries of the raw frame.
- Hardware Pre-Visualization: This allows engineers to validate tracking stability, PID tuning, and UI responsiveness before deploying to real Pan-Tilt hardware.
- Open-Vocabulary Inference: Powered by YOLO-World, enabling the detection of any object (e.g., "black balloon", "drone", "sphere") using natural-language prompts without retraining.
- Precision PID Control: A custom Proportional-Integral-Derivative loop ensures smooth, fluid tracking that mimics high-end robotic gimbals.
- Sci-Fi HUD (Head-Up Display): Real-time Iron-Man style telemetry overlay including a sliding Compass, Tilt-Inclinometer, and Z-axis depth indicators.
- Environmental Robustness: Features dynamic brightness analysis and CLAHE-based texture scoring to prevent false-positive locks in complex lighting.
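The texture-validation idea can be sketched in a few lines. The following is a simplified illustration using only local standard deviation; the actual engine also applies CLAHE, and the `texture_score` function name and the sample patches here are hypothetical:

```python
import numpy as np

def texture_score(patch: np.ndarray) -> float:
    """Score a grayscale patch by its local standard deviation.

    Flat regions (sky, plain walls) score near zero; textured
    real-world objects score noticeably higher.
    """
    return float(np.std(patch.astype(np.float32)))

# Illustrative check: a noisy "object" patch vs. a flat background patch.
rng = np.random.default_rng(0)
textured = rng.integers(0, 255, size=(32, 32)).astype(np.uint8)
flat = np.full((32, 32), 128, dtype=np.uint8)

assert texture_score(textured) > texture_score(flat)
```

A lock candidate whose score falls below some tuned threshold would be rejected as background, which is what prevents false-positive locks on featureless regions.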
The system is highly modular. You can fine-tune its behavior via the parameters and constants below:
| Parameter | Technical Description | Default Value |
|---|---|---|
| `SEARCH_PROMPT` | Natural-language classes for the Open-Vocabulary model. | `["balloon", "sphere"]` |
| `CONF_THRESHOLD` | Minimum confidence score for a valid target detection. | `0.15` |
| `INFERENCE_SIZE` | Input resolution for the CNN (higher = more accuracy, lower = higher FPS). | `320` |
| `MIN_AREA` | Minimum pixel area used to filter out background noise/artifacts. | `400` |
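A minimal sketch of how these thresholds might gate raw detections. The `Detection` tuple and `keep_detection` helper are illustrative names, not the project's actual API:

```python
from typing import NamedTuple

CONF_THRESHOLD = 0.15  # default from the table above
MIN_AREA = 400         # minimum pixel area, filters noise/artifacts

class Detection(NamedTuple):
    label: str
    conf: float
    w: int   # bounding-box width in pixels
    h: int   # bounding-box height in pixels

def keep_detection(d: Detection) -> bool:
    """Accept only confident detections large enough to be a real target."""
    return d.conf >= CONF_THRESHOLD and (d.w * d.h) >= MIN_AREA

dets = [
    Detection("balloon", 0.62, 40, 30),  # confident, 1200 px -> kept
    Detection("balloon", 0.10, 40, 30),  # below confidence threshold
    Detection("sphere", 0.50, 10, 10),   # only 100 px -> treated as noise
]
kept = [d for d in dets if keep_detection(d)]  # -> only the first detection
```

Raising `CONF_THRESHOLD` or `MIN_AREA` trades recall for fewer false positives; lowering them helps acquire small or distant targets at the cost of more noise.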
| Constant | Engineering Effect |
|---|---|
| `KP_PAN` / `KP_TILT` | Proportional Gain: controls the immediate reaction speed to target displacement. |
| `KI_PAN` / `KI_TILT` | Integral Gain: eliminates steady-state error (ensures the target is perfectly centered). |
| `MAX_INTEGRAL` | Anti-Windup: prevents the integral term from accumulating excessive error during fast moves. |
| `DEAD_ZONE_Z` | Stability Buffer: prevents jittering when the target is at the requested depth/distance. |
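To illustrate how these constants interact, here is a minimal sketch of a single PID axis with integral clamping and a dead zone. The class name, gain values, and dead-zone width are illustrative assumptions, not the project's actual code:

```python
KP = 0.8            # illustrative proportional gain
KI = 0.1            # illustrative integral gain
MAX_INTEGRAL = 50.0 # anti-windup clamp
DEAD_ZONE = 5.0     # error magnitude tolerated without correction

class AxisPID:
    def __init__(self, kp: float, ki: float, kd: float = 0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float, dt: float) -> float:
        if abs(error) < DEAD_ZONE:      # stability buffer: ignore tiny errors
            return 0.0
        self.integral += error * dt
        # Anti-windup: clamp the accumulated error so a long chase
        # cannot build up a huge correction that overshoots later.
        self.integral = max(-MAX_INTEGRAL, min(MAX_INTEGRAL, self.integral))
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pan = AxisPID(KP, KI)
assert pan.update(2.0, 0.033) == 0.0   # inside the dead zone: no movement
correction = pan.update(100.0, 0.033)  # large displacement -> positive output
```

The same structure is instantiated once per axis (pan, tilt, zoom), each with its own gains.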
```
AeroGuard/
├── src/
│   ├── main.py              # Entry Point: Vision loop & Object Analysis
│   ├── tracker.py           # Logic: PID Control & State Management
│   ├── utils/
│   │   ├── visualization.py # Graphics: HUD Rendering & Sci-Fi UI
│   │   └── __init__.py
│   └── __init__.py
├── requirements.txt         # Core dependencies (OpenCV, Ultralytics, NumPy)
├── README.md                # Documentation
└── yolov8s-world.pt         # Pre-trained Open-Vocabulary weights
```
First, ensure you have Python 3.8+ installed. Then, clone the repository and install the required libraries:
```bash
git clone https://github.com/cankayafaruk/AeroGuard-CV.git
cd AeroGuard-CV
pip install -r requirements.txt
```

Download the `yolov8s-world.pt` model weights into the project root to enable offline inference.
To start the simulation environment, run the following command:
```bash
python src/main.py
```

- Tracking: The system automatically scans for the objects defined in `SEARCH_PROMPT`.
- Controls: Press `q` to safely terminate the process and close all windows.
This project is developed with a focus on high-level software engineering principles and robust control theory:
- Time-Synchronized Integration: The PID controller utilizes a calculated `delta_time (dt)` factor. This ensures that tracking speed and responsiveness remain consistent and predictable, regardless of hardware performance or fluctuations in the frames-per-second (FPS) rate.
- Texture-Based Validation: To minimize false positives, the system performs real-time texture analysis using local standard deviation and CLAHE (Contrast Limited Adaptive Histogram Equalization). This allows the engine to distinguish between background noise and textured real-world objects.
- Intelligent Memory Management: The integral term in the PID controller is capped (Anti-Windup) and programmed to decay over time when a target is lost. This prevents erratic "snap-back" movements and ensures smooth re-acquisition when the target reappears.
- Modular Software Architecture: The project follows a clean deployment structure using a full Python package hierarchy with `__init__.py` files, ensuring scalability and maintainability.
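The dt-synchronized integration and the decay-on-loss behavior described above could look roughly like this. This is a simplified sketch; the `DECAY` constant, `TrackerState` class, and loop structure are illustrative assumptions:

```python
import time
from typing import Optional

DECAY = 0.9  # per-frame integral decay while the target is lost (illustrative)

class TrackerState:
    def __init__(self):
        self.integral = 0.0
        self.last_time = time.monotonic()

    def step(self, error: Optional[float]) -> float:
        """Process one frame; error is None when no target was detected."""
        now = time.monotonic()
        dt = now - self.last_time   # real elapsed time, not an assumed frame period
        self.last_time = now
        if error is None:
            # Target lost: let the integral decay instead of freezing it,
            # so re-acquisition does not start with a stale "snap-back" impulse.
            self.integral *= DECAY
        else:
            self.integral += error * dt
        return self.integral
```

Scaling by measured `dt` is what decouples control behavior from frame rate: at 60 FPS each frame contributes roughly half the integral increment that it would at 30 FPS, so the accumulated correction per second stays the same.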
Developed as a high-fidelity simulation environment for autonomous robotic systems.