Dev -> Main #68

Open
brohoya wants to merge 71 commits into main from dev

Conversation


@brohoya brohoya commented Mar 29, 2026

No description provided.

EHxuban11 and others added 30 commits March 21, 2026 00:25
Fixes MPS seg training (grid_sample border padding), migrates from the
removed rfdetr.main.Model to rfdetr.detr._build_model_context, and imports
PostProcess from rfdetr.models instead of rfdetr.models.lwdetr.
When label files use segmentation format (polygon vertices instead of
cx/cy/w/h), compute the bounding box from the polygon hull. This lets
model.val() produce correct box mAP on segmentation datasets.
Same fix as yolo_coco_api.py: when label lines have more than 5 columns
(segmentation polygon format), compute the bbox from the polygon hull
instead of misreading polygon vertices as cx/cy/w/h.
- Export: auto-detect seg models (3 outputs) and name them properly
  (boxes, scores, masks) with correct dynamic axes
- Backend: parse mask output from ONNX, resize to original resolution,
  threshold, and wrap in Masks object
- Backend: draw masks when saving annotated images
- Metadata: embed segmentation=true flag in exported ONNX models
- Detection models unchanged (2 outputs, single "output" name)

Verified: PT→ONNX roundtrip preserves masks (same shape, same count).
All existing ONNX detection tests still pass (5/5).
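The output-naming decision described above can be sketched as a small helper (only the names come from this PR; the dynamic-axes dicts and function name are assumptions):

```python
def export_output_names(is_segmentation):
    """Pick ONNX output names for export.

    Per the change above: segmentation models (3 outputs) get named
    outputs boxes/scores/masks; detection models keep the single
    "output" name. The batch dynamic axis shown here is illustrative.
    """
    if is_segmentation:
        names = ["boxes", "scores", "masks"]
    else:
        names = ["output"]
    dynamic_axes = {n: {0: "batch"} for n in names}
    return names, dynamic_axes

print(export_output_names(True)[0])   # ['boxes', 'scores', 'masks']
print(export_output_names(False)[0])  # ['output']
```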
- Warn when tiled inference is used with a segmentation model (masks
  will be None since tile merging doesn't support masks yet)
- Detect segmentation from filename suffix first (-seg) before falling
  back to loading the checkpoint, avoiding a redundant 135MB torch.load
- TestPolygonLabelParsing: verify both parsers (yolo_coco_api and
  YOLODataset) correctly derive bounding boxes from polygon vertices
- TestDetectSegmentation: verify _detect_segmentation from checkpoint
  keys and filename-based detection from -seg suffix
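The cheap-first detection order can be sketched like this (the function name and the loader callback are hypothetical; the real fallback inspects checkpoint keys via torch.load):

```python
from pathlib import Path

def is_segmentation_checkpoint(path, load_checkpoint=None):
    """Filename-first segmentation detection, as described above.

    A "-seg" suffix on the filename stem answers the question without
    touching the 135MB checkpoint; only when the suffix is absent do we
    fall back to the expensive load-based inspection.
    """
    if Path(path).stem.endswith("-seg"):
        return True
    if load_checkpoint is not None:
        return load_checkpoint(path)  # e.g. looks for seg-head keys
    return False

print(is_segmentation_checkpoint("yolo9-t-seg.pt"))  # True
```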
Trains 3 epochs, stops, resumes from checkpoint to 5 epochs.
Verifies checkpoint has seg keys after both phases and model
still produces masks after resume.
…-rf-detr-and-yolo9

39 add instance segmentation to rf detr and yolo9
Implement detector-agnostic ByteTrack (Zhang et al., ECCV 2022) for
multi-object tracking across video frames.

New files:
- libreyolo/tracking/config.py: TrackConfig dataclass
- libreyolo/tracking/kalman_filter.py: 8D Kalman filter (XYAH state)
- libreyolo/tracking/matching.py: IoU distance, score fusion, linear assignment
- libreyolo/tracking/strack.py: Single track with lifecycle management
- libreyolo/tracking/tracker.py: ByteTracker with 3-stage association
- tests/unit/test_tracking.py: 30 unit tests

Modified files:
- Results: added optional track_id tensor (backward compatible)
- BaseModel: added track() generator for video tracking
- __init__.py: lazy imports for ByteTracker, TrackConfig
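The IoU cost at the heart of matching.py's association can be sketched standalone (this is the standard 1 - IoU cost ByteTrack uses, not the library code itself):

```python
def iou_distance(box_a, box_b):
    """IoU-based association cost between a track and a detection.

    Boxes are (x1, y1, x2, y2). Cost is 1 - IoU, so identical boxes
    cost 0 and disjoint boxes cost 1; linear assignment then minimizes
    the total cost over all track/detection pairs.
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return 1.0 - (inter / union if union > 0 else 0.0)

print(iou_distance((0, 0, 10, 10), (0, 0, 10, 10)))   # 0.0
print(iou_distance((0, 0, 10, 10), (20, 20, 30, 30)))  # 1.0
```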
- draw_boxes() now accepts optional track_ids: labels show "#N" in yellow
  with confidence in gray, colored per track ID instead of per class
- model.track() gains save and save_dir params: saves annotated frames
  with bounding boxes and track IDs to runs/track/<video_stem>/
- e2e test downloads people-walking.mp4 from Roboflow CDN (cached at
  ~/.cache/libreyolo/tracking/), validates YOLOX, YOLO9, and RF-DETR
  tracking: ID assignment, consistency, uniqueness, positivity
- add scipy to pyproject.toml as optional dependency [tracking]
- add TrackConfig.__post_init__ validation (frame_rate>0, thresholds
  in [0,1], track_buffer>=0, minimum_consecutive_frames>=1)
- validate video file exists before opening in model.track()
- simplify minimum_consecutive_frames logic: single activate() call,
  conditionally set is_activated=False (was redundantly calling
  activate then immediately contradicting it)
- rename conf → track_conf in model.track() to avoid confusion with
  detection confidence (it controls track_high_thresh, not detector)
- rename save_dir → output_path to match predict() naming convention
- add track_high_thresh >= track_low_thresh validation to TrackConfig
- protect scipy imports with try/except and actionable error message
  guiding users to pip install libreyolo[tracking]
- add e2e test for save=True (validates frames are created as JPEGs)
- add 6 unit tests for TrackConfig validation: rejects zero frame_rate,
  negative thresholds, thresholds > 1, high < low, negative buffer,
  zero minimum_consecutive_frames
- add e2e test for FileNotFoundError on missing video
- cache ImageFont via lru_cache (was re-probing 3 font paths per frame)
- print save directory path when save=True so user knows where frames go
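The __post_init__ checks listed above can be sketched as a dataclass (field names follow the bullets; defaults are illustrative, not the library's):

```python
from dataclasses import dataclass

@dataclass
class TrackConfigSketch:
    """Minimal sketch of TrackConfig's __post_init__ validation."""
    frame_rate: int = 30
    track_high_thresh: float = 0.5
    track_low_thresh: float = 0.1
    track_buffer: int = 30
    minimum_consecutive_frames: int = 1

    def __post_init__(self):
        if self.frame_rate <= 0:
            raise ValueError("frame_rate must be > 0")
        for name in ("track_high_thresh", "track_low_thresh"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1]")
        if self.track_high_thresh < self.track_low_thresh:
            raise ValueError("track_high_thresh must be >= track_low_thresh")
        if self.track_buffer < 0:
            raise ValueError("track_buffer must be >= 0")
        if self.minimum_consecutive_frames < 1:
            raise ValueError("minimum_consecutive_frames must be >= 1")

TrackConfigSketch()  # defaults pass
try:
    TrackConfigSketch(frame_rate=0)
except ValueError as e:
    print(e)  # frame_rate must be > 0
```

Validating in __post_init__ means a bad config fails loudly at construction time, before the tracker ever sees a frame.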
Users can now pass a video file to model() or model.predict():
- model("video.mp4", stream=True) yields per-frame Results
- model("video.mp4") collects all Results into a list (with RAM warning)
- vid_stride=N processes every Nth frame
- save=True writes annotated output video
- show=True displays frames in a cv2 window
- Results.frame_idx tracks the source frame index

Supported in both PyTorch models (InferenceRunner) and export
backends (BaseBackend). All existing kwargs (conf, iou, classes,
max_det, imgsz) pass through unchanged.
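The vid_stride behaviour can be sketched as a standalone helper (hypothetical name; the real loop lives in the inference runner):

```python
def frames_to_process(total_frames, vid_stride=1):
    """Which source frame indices run inference for a given vid_stride.

    vid_stride=N keeps every Nth frame starting from frame 0; the kept
    index is what ends up on Results.frame_idx.
    """
    if vid_stride < 1:
        raise ValueError("vid_stride must be >= 1")
    return list(range(0, total_frames, vid_stride))

print(frames_to_process(10, vid_stride=3))  # [0, 3, 6, 9]
```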

Includes 18 unit tests covering VideoSource, VideoWriter,
is_video_file detection, and Results.frame_idx.
17 tests covering: stream/list modes, vid_stride, save output,
conf/classes filters, frame_idx, orig_shape, detection counts.
Uses the same test video as the tracking branch.
Video uploaded to huggingface.co/datasets/LibreYOLO/test-assets so
we control the asset and don't depend on third-party CDN availability.
Tests now run against YOLOX (n), YOLO9 (t), and RF-DETR (n).
YOLO9 gets the most thorough coverage (stride, save, filters, list mode).
YOLOX and RF-DETR get core smoke tests (stream, detections, stride, save).
RF-DETR tests are gated behind requires_rfdetr and skip gracefully.
- Extract shared video inference loop into run_video_inference() in
  utils/video.py, eliminating ~120 lines of duplication between
  InferenceRunner and BaseBackend
- Move resolve_video_save_path() to utils/video.py as shared utility
- Add context manager protocol (__enter__/__exit__) to VideoSource
  and VideoWriter for reliable resource cleanup
- Add re-iteration guard on VideoSource (raises RuntimeError)
- Wrap release() internals in try/finally for safety
- Log warning on frame decode failures instead of silent skip
- Release VideoCapture on __init__ failure path
- Fix type hints to use Union[] style matching codebase convention
- Remove unused imports (Optional, Tuple)
- Fix test_save_auto_path to use tmp_path instead of global filesystem
- Replace hardcoded frame counts with dynamic query in e2e tests
- Add unit tests: vid_stride > total, re-iteration, context managers,
  double release safety
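The VideoSource lifecycle changes above can be sketched like this (frames are faked with a list; the real class wraps cv2.VideoCapture):

```python
class VideoSourceSketch:
    """Sketch of the context-manager protocol, re-iteration guard, and
    try/finally release safety described above."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._consumed = False
        self._released = False

    def __iter__(self):
        if self._consumed:
            raise RuntimeError("VideoSource cannot be iterated twice")
        self._consumed = True
        yield from self._frames

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False  # never swallow exceptions from the with-block

    def release(self):
        try:
            self._frames = []  # stand-in for VideoCapture.release()
        finally:
            self._released = True  # safe to call twice

with VideoSourceSketch(["f0", "f1"]) as src:
    frames = list(src)
print(frames)  # ['f0', 'f1']
```

The guard turns a silent empty second pass (cv2 streams cannot rewind) into an immediate RuntimeError, and the with-statement guarantees release even when inference raises mid-video.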
BaseBackend.__call__ was passing color_format to _predict_video which
does not accept it, causing TypeError on every backend video call.
Video frames are always BGR->RGB converted inside the shared loop,
so color_format is not needed.
brohoya added 30 commits March 29, 2026 15:49