Unified pipeline for the Virtual Fence project—detect, track, and count people entering a protected zone using YOLO, OmniVLM, and RT-DETR. This README consolidates the specifications under specs/ and the task briefs in task/ into a single reference, including hands-on guidance for dataset sourcing, CVAT annotation, Nexa SDK setup, and model benchmarking.
- Project Specs & Task Overview
- Environment & Tooling
- Dataset Preparation
- Annotation Workflow (CVAT)
- Model Workflows
- Results Summary
- Commands Cheat Sheet
- Raspberry Pi / Edge Notes
- Troubleshooting
- References
## Project Specs & Task Overview

- 📄 Specifications: `specs/virtual_fence_task_spec.md` defines objectives, datasets, metrics, and success criteria.
- 📥 Task briefs: `task/VirtualFence.md` and `task/Person_Counting_Task.pdf` describe the zone-counting deliverable.
- ✅ Key requirements:
  - Combine CrowdHuman, MOT17, and custom annotated clips.
  - Implement YOLO, OmniVLM, and a custom detector (RT-DETR).
  - Produce annotated output video with zone overlay and entry counter.
  - Benchmark all methods on the same clips: mAP, IDF1/MOTA, counting MAE, FPS.
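Counting MAE is the least standard of these metrics; a minimal sketch of how it can be computed is shown below (clip names and counts are placeholders, not project numbers):

```python
# Counting MAE: mean absolute error between predicted and ground-truth
# zone-entry counts per evaluation clip. Values below are illustrative only.
def counting_mae(pred_counts: dict[str, int], gt_counts: dict[str, int]) -> float:
    return sum(abs(pred_counts.get(clip, 0) - gt) for clip, gt in gt_counts.items()) / len(gt_counts)

print(counting_mae({"clip_001": 4, "clip_002": 7}, {"clip_001": 5, "clip_002": 7}))  # -> 0.5
```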
## Environment & Tooling

Prerequisites:
- Python 3.13 (conda recommended).
- NVIDIA GPU + CUDA toolkit for accelerated training (optional but preferred).
- Nexa SDK for OmniVLM workflows.
- Docker (for CVAT) and FFmpeg for video processing.
Set up the base environment:

```bash
conda create -n fence python=3.13 -y
conda activate fence
pip install -r requirements.txt

# Install CUDA-enabled torch if you have an RTX GPU
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Windows FFmpeg install
choco install ffmpeg
ffmpeg -version
```
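A quick way to confirm that the CUDA-enabled build actually sees the GPU (run inside the `fence` environment):

```python
# Verify the PyTorch install; prints CPU-only information if no GPU is visible.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```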
Additional tooling:

- Nexa SDK installer: https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-1-executable-installer. Add the install folder to `PATH`.
- CVAT setup:

  ```bash
  git clone https://github.com/cvat-ai/cvat.git
  cd cvat
  docker compose up -d
  ```

- Colab workflow is available in `notebooks/virtual_fence.py` (downloads datasets, prepares YOLO splits, trains, and exports models back to Drive).
## Dataset Preparation

- Custom video ingestion (optional but recommended):
  - Source 50–100 permissively licensed clips (5–10 s) from YouTube or Pexels. Suggested keywords: `crowded street`, `pedestrian crosswalk`, `festival crowd`.
  - Record URLs in `data/custom/custom_data.csv`; include `source`, `url`, `license`, and `notes` columns. For YouTube, use yt-dlp; for Pexels, download MP4 assets directly.
  - Extract frames with FFmpeg (2 FPS baseline) to stabilize annotation quality:

    ```bash
    ffmpeg -i input.mp4 -vf "fps=2,scale=w=min(1280\,iw):h=-2" frames/frame_%05d.jpg
    ```

  - Consolidate downloads via helper scripts:

    ```powershell
    python scripts/download_custom_videos.py `
      --csv data/custom/custom_data.csv `
      --output-dir data/custom/videos `
      --zip-path data/custom/custom_clips.zip
    python scripts/prep_custom_clips.py `
      --input-dir data/custom/videos `
      --frames-dir data/custom/frames `
      --zip-dir data/custom/zips `
      --fps 2 --max-dim 1280
    python scripts/rename_custom_zips.py --zip-dir data/custom/zips
    ```

- Mirror public datasets (CrowdHuman, MOT17) using the automated setup:

  ```powershell
  python scripts/data_setup.py `
    --crowdhuman-dir d:\datasets\crowdhuman `
    --mot17-dir d:\datasets\mot17 `
    --custom-zip-root d:\Projects\VirtualFence\data\annotations `
    --custom-output d:\datasets\custom_data
  ```

- Unify into YOLO layout for consistent training/validation splits across YOLOv8n, RT-DETR-L, and OmniVLM prompt conditioning:

  ```powershell
  python YOLO/prepare_yolo_dataset.py `
    --output-dir data/yolo_data `
    --custom-dir data/custom/custom_data `
    --custom-ratio 0.8 0.1 0.1 `
    --crowdhuman-root data/crowdhuman `
    --crowdhuman-ratio 0.8 0.1 0.1 `
    --mot-root data/MOT17 `
    --mot-ratio 0.9 0.1 0.0 `
    --seed 42
  ```

  Colab note: run the same commands in `/content/VirtualFence`, train on GPU (`--device 0`), then zip `YOLO/runs/detect/...` back to Drive for local evaluation.
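Before training, it is worth a quick sanity check that the unified `data/yolo_data` layout is complete. The sketch below assumes the standard Ultralytics structure (`images/<split>` plus `labels/<split>` with one `.txt` per image); adjust the paths if `prepare_yolo_dataset.py` writes something different.

```python
# Count images and label files per split and flag images without labels.
from pathlib import Path

root = Path("data/yolo_data")
for split in ("train", "val", "test"):
    images = sorted((root / "images" / split).glob("*.jpg")) + sorted((root / "images" / split).glob("*.png"))
    labels = {p.stem for p in (root / "labels" / split).glob("*.txt")}
    unlabeled = [p.name for p in images if p.stem not in labels]
    print(f"{split}: {len(images)} images, {len(labels)} label files, {len(unlabeled)} without labels")
```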
## Annotation Workflow (CVAT)

- Spin up CVAT locally (Docker Compose) or connect to an existing instance with RBAC enabled.
- Import media: upload FFmpeg-extracted frames (`*.jpg`) or the raw videos for interpolation-assisted labeling.
- Label configuration:
  - Create a single `person` class with attributes `occluded` and `truncated`.
  - Use the Rectangle tool with automatic interpolation between keyframes.
- Annotation formats:
  - Export detection labels as COCO 1.0 for YOLO/RT-DETR fine-tuning (a conversion sketch follows this list).
  - Export tracking labels as MOT 1.1 to evaluate IDF1/MOTA and to bootstrap ByteTrack/Hungarian matchers.
- Quality checks: leverage CVAT's review mode to flag low-confidence annotations; sync reviewed exports to `data/annotations/cvat_exports/`.
- Dataset sync: run `python scripts/data_setup.py --custom-zip-root data/annotations/cvat_exports` to fold reviewed annotations into the unified dataset.
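The repository's scripts presumably handle format conversion, but for reference, turning a CVAT COCO 1.0 export into YOLO-style labels boils down to normalizing each box to center/width/height. A minimal sketch (single `person` class, class index 0; paths are examples):

```python
import json
from collections import defaultdict
from pathlib import Path

def coco_to_yolo(coco_json: str, out_dir: str) -> None:
    """Write one YOLO .txt label file per image from a COCO 1.0 export."""
    data = json.loads(Path(coco_json).read_text())
    images = {im["id"]: im for im in data["images"]}
    per_image = defaultdict(list)
    for ann in data["annotations"]:
        per_image[ann["image_id"]].append(ann["bbox"])  # COCO bbox: [x, y, w, h] in pixels

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for img_id, boxes in per_image.items():
        im = images[img_id]
        w, h = im["width"], im["height"]
        lines = []
        for x, y, bw, bh in boxes:
            cx, cy = (x + bw / 2) / w, (y + bh / 2) / h  # normalized box center
            lines.append(f"0 {cx:.6f} {cy:.6f} {bw / w:.6f} {bh / h:.6f}")
        (out / f"{Path(im['file_name']).stem}.txt").write_text("\n".join(lines))

# Example invocation; point it at your actual CVAT export.
coco_to_yolo("data/annotations/cvat_exports/instances_default.json", "data/custom/labels")
```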
## Model Workflows

Training (GPU recommended):

```powershell
python YOLO/train_yolo.py `
  --model yolov8n.pt `
  --data YOLO/virtual_fence.yaml `
  --epochs 100 `
  --imgsz 640 `
  --device 0 `
  --name virtfence_yolov8n
```
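`YOLO/train_yolo.py` presumably wraps the Ultralytics API; the same run can be expressed directly in Python, which is handy inside the Colab notebook:

```python
# Programmatic equivalent of the training command above (Ultralytics API).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.train(
    data="YOLO/virtual_fence.yaml",
    epochs=100,
    imgsz=640,
    device=0,
    name="virtfence_yolov8n",
)
metrics = model.val(split="val")           # quick check on the validation split
print(metrics.box.map50, metrics.box.map)  # mAP@0.5 and mAP@0.5:0.95
```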
Evaluation, metrics, and visualizations:

```powershell
python YOLO/evaluate_and_visualize.py `
  --model YOLO/runs/virtfence_yolov8n/weights/best.pt `
  --data YOLO/virtual_fence.yaml `
  --split val `
  --output-dir YOLO/reports/yolov8n `
  --device 0 `
  --n-samples 20
```
Zone-count inference (outputs annotated MP4 with live counter):

```powershell
python YOLO/zone_counter.py `
  --model YOLO/runs/virtfence_yolov8n/weights/best.pt `
  --source data/input.mp4 `
  --zone-config config/fence_zone.yaml `
  --output results/yolo/output_annotated.mp4 `
  --device 0
```
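The zone definition and counting rule live in `config/fence_zone.yaml` and `YOLO/zone_counter.py`; as an illustration of the underlying idea (count a track ID once, when its bottom-center point is first found inside the zone polygon), here is a minimal sketch:

```python
import cv2
import numpy as np

class ZoneCounter:
    """Count each tracked person once when they first appear inside the zone polygon."""

    def __init__(self, polygon_xy):
        # polygon_xy: list of (x, y) zone vertices in pixel coordinates
        self.polygon = np.array(polygon_xy, dtype=np.int32).reshape(-1, 1, 2)
        self.counted_ids = set()
        self.entries = 0

    def update(self, tracks):
        """tracks: iterable of (track_id, x1, y1, x2, y2) boxes from the tracker."""
        for tid, x1, y1, x2, y2 in tracks:
            foot = ((x1 + x2) / 2.0, float(y2))  # bottom-center approximates the ground-contact point
            inside = cv2.pointPolygonTest(self.polygon, foot, False) >= 0
            if inside and tid not in self.counted_ids:
                self.counted_ids.add(tid)
                self.entries += 1
        return self.entries

# Example with a rectangular zone and two fake tracks: only track 1 is inside.
counter = ZoneCounter([(100, 200), (500, 200), (500, 600), (100, 600)])
print(counter.update([(1, 120, 300, 180, 550), (2, 700, 300, 760, 550)]))  # -> 1
```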
Export ONNX for edge deployment:

```powershell
python YOLO/export_yolo_to_onnx.py `
  --weights YOLO/runs/virtfence_yolov8n/weights/best.pt `
  --output YOLO/exports/virtfence_yolov8n.onnx `
  --imgsz 640 `
  --dynamic
```
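A quick smoke test of the exported graph with ONNX Runtime (`pip install onnxruntime`); decoding the raw output into boxes is model-specific and out of scope here:

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "YOLO/exports/virtfence_yolov8n.onnx",
    providers=["CPUExecutionProvider"],
)
inp = sess.get_inputs()[0]
print("input:", inp.name, inp.shape)

# Dummy NCHW frame in [0, 1]; replace with a real preprocessed frame for end-to-end checks.
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = sess.run(None, {inp.name: dummy})
print("output shapes:", [o.shape for o in outputs])
```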
RT-DETR-L evaluation (Paddle weights):

```powershell
python scripts/rtdetr_eval.py `
  --config specs/rtdetr/rtdetr_config.yml `
  --weights d:/models/rtdetr_weights.pdparams `
  --overrides EvalReader.dataset.dataset_dir=data/yolo_data `
  --device gpu `
  --output results/rtdetr/metrics.json
```

Install the appropriate `paddlepaddle-gpu` or `paddlepaddle` wheel per https://www.paddlepaddle.org.cn/. Use `--device cpu` if only CPU wheels are available.
## Results Summary

| Method | Dataset Split | mAP@0.5 | mAP@0.5:0.95 | Precision | Recall | FPS | Notes |
|---|---|---|---|---|---|---|---|
| YOLOv8n | data/yolo_data/val | 0.7728 | 0.4792 | 0.8144 | 0.7004 | 95.0 | ByteTrack tracking; MP4: results/yolo/output_annotated.mp4 |
| RT-DETR-L | data/yolo_data/val | 0.5630 | 0.3034 | 0.6248 | 0.5692 | 30.9 | Transformer detector; MP4: results/rtdetr/output_annotated.mp4 |
| OmniVLM (WIP) | omnivlm_multiset | n/a | n/a | n/a | n/a | — | JSON predictions via Nexa SDK; integrate with tracker for zone analytics |

Reproduce the table with:

```bash
python results/compare_models.py --yolo-metrics YOLO/reports/yolov8n/metrics.json --rtdetr-metrics RT_DETR/reports/rtdetrl/metrics.json --output results/comparison
```
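For a quick look at the numbers without the comparison script, something like the following works, assuming each `metrics.json` exposes flat keys such as `map50`, `map`, `precision`, `recall`, and `fps` (check the actual schema the eval scripts emit):

```python
import json

runs = {
    "YOLOv8n": "YOLO/reports/yolov8n/metrics.json",
    "RT-DETR-L": "RT_DETR/reports/rtdetrl/metrics.json",
}
for name, path in runs.items():
    with open(path) as f:
        m = json.load(f)
    print(f"{name:10s} mAP@0.5={m.get('map50')}  mAP@0.5:0.95={m.get('map')}  "
          f"P={m.get('precision')}  R={m.get('recall')}  FPS={m.get('fps')}")
```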
## Commands Cheat Sheet

- Environment

  ```bash
  conda create -n fence python=3.13 -y
  conda activate fence
  pip install -r requirements.txt
  ```

- Torch GPU setup

  ```bash
  pip uninstall -y torch torchvision torchaudio
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
  ```

- Data consolidation

  ```bash
  python scripts/data_setup.py --crowdhuman-dir d:/datasets/crowdhuman --mot17-dir d:/datasets/mot17 --custom-zip-root data/custom/annotations --custom-output d:/datasets/custom_data
  python YOLO/prepare_yolo_dataset.py --output-dir data/yolo_data --custom-dir data/custom/custom_data --crowdhuman-root data/crowdhuman --mot-root data/MOT17
  ```

- Evaluation helpers

  ```bash
  python scripts/yolo_eval.py --model yolov8n.pt --data d:/datasets/virtual_fence/virtual_fence.yaml --split val --output results/yolo/yolov8n_metrics.json
  python scripts/rtdetr_eval.py --config specs/rtdetr/rtdetr_config.yml --weights d:/models/rtdetr_weights.pdparams --output results/rtdetr/metrics.json --overrides EvalReader.dataset.dataset_dir=data/yolo_data
  python OmniVLM/generate_manifest.py --custom-root data/custom/custom_data --crowdhuman-root data/crowdhuman --mot17-root data/MOT17 --custom-annotations data/custom/annotations --output OmniVLM/omnivlm_multiset.jsonl
  ```

- Nexa CLI basics

  ```bash
  nexa pull NexaAI/OmniVLM-968M --model-type vlm
  nexa serve --host 127.0.0.1:18181
  nexa run NexaAI/OmniVLM-968M
  ```
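Once `nexa serve` is running, OmniVLM can be queried over HTTP for per-frame JSON predictions (the third results-table row). The endpoint path and payload below follow an OpenAI-style chat API; treat both as assumptions and confirm them against the Nexa SDK documentation for your installed version.

```python
import base64
import requests

# Encode one extracted frame as a data URL.
with open("data/custom/frames/frame_00001.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "NexaAI/OmniVLM-968M",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Count the people in this frame and reply as JSON: {\"person_count\": <int>}."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
}

# Assumed OpenAI-compatible endpoint for the server started with `nexa serve --host 127.0.0.1:18181`.
resp = requests.post("http://127.0.0.1:18181/v1/chat/completions", json=payload, timeout=120)
print(resp.json())
```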
## References

- https://huggingface.co/NexaAI/OmniVLM-968M
- https://colab.research.google.com/#fileId=https://huggingface.co/NexaAI/OmniVLM-968M.ipynb
- https://www.crowdhuman.org/download.html
- https://huggingface.co/datasets/sshao0516/CrowdHuman
- https://github.com/sunsmarterjie/yolov12
- https://yolov12.com/
- https://docs.ultralytics.com/models/yolo12/
- https://github.com/cvat-ai/cvat
- https://www.cvat.ai/
- https://docs.cvat.ai/docs/administration/basics/installation/
- https://docs.ultralytics.com/models/rtdetr/
- https://github.com/lyuwenyu/RT-DETR
- https://huggingface.co/NexaAI/OmniVLM-968M/discussions/4
- https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-1-executable-installer
- https://motchallenge.net/data/MOT17.zip