Fixes MPS seg training (grid_sample border padding), migrates from the removed rfdetr.main.Model to rfdetr.detr._build_model_context, and imports PostProcess from rfdetr.models instead of rfdetr.models.lwdetr.
When label files use segmentation format (polygon vertices instead of cx/cy/w/h), compute the bounding box from the polygon hull. This lets model.val() produce correct box mAP on segmentation datasets.
Same fix as in yolo_coco_api.py: when label lines have more than 5 columns (segmentation polygon format), compute the bbox from the polygon hull instead of misreading polygon vertices as cx/cy/w/h.
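A minimal sketch of the bbox-from-hull derivation described above. The function name and the exact label-line layout are illustrative, not the library's actual parser API; it assumes YOLO-style normalized coordinates.

```python
# Hypothetical sketch: derive a YOLO-format bbox (cx, cy, w, h, normalized)
# from a segmentation label line's polygon vertices instead of misreading
# the first four vertex values as cx/cy/w/h.

def polygon_to_yolo_bbox(parts: list) -> tuple:
    """parts = [class_id, x1, y1, x2, y2, ...] with normalized coords."""
    cls = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    xs, ys = coords[0::2], coords[1::2]       # de-interleave x and y
    x_min, x_max = min(xs), max(xs)           # axis-aligned hull
    y_min, y_max = min(ys), max(ys)
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    return cls, cx, cy, x_max - x_min, y_max - y_min

# A triangle covering the left half of the image:
print(polygon_to_yolo_bbox(["0", "0.0", "0.0", "0.5", "0.0", "0.0", "1.0"]))
# → (0, 0.25, 0.5, 0.5, 1.0)
```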
- Export: auto-detect seg models (3 outputs) and name them properly (boxes, scores, masks) with correct dynamic axes
- Backend: parse mask output from ONNX, resize to original resolution, threshold, and wrap in a Masks object
- Backend: draw masks when saving annotated images
- Metadata: embed a segmentation=true flag in exported ONNX models
- Detection models unchanged (2 outputs, single "output" name)

Verified: PT→ONNX roundtrip preserves masks (same shape, same count). All existing ONNX detection tests still pass (5/5).
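The auto-detection rule above could be sketched as follows. The output names come from the commit message; the dynamic-axes layout and function name are assumptions, not the library's actual export code.

```python
# Illustrative sketch: a model with three outputs is treated as
# segmentation and its ONNX outputs named accordingly; detection
# models keep the single combined "output". The batch-only dynamic
# axis shown here is an assumption.

def output_spec(num_outputs: int):
    if num_outputs == 3:  # segmentation: boxes, scores, masks
        names = ["boxes", "scores", "masks"]
        dynamic_axes = {n: {0: "batch"} for n in names}
        return names, dynamic_axes, True   # segmentation=True metadata flag
    names = ["output"]                     # detection: unchanged
    return names, {"output": {0: "batch"}}, False
```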
- Warn when tiled inference is used with a segmentation model (masks will be None since tile merging doesn't support masks yet)
- Detect segmentation from the filename suffix (-seg) first, before falling back to loading the checkpoint, avoiding a redundant 135 MB torch.load
- TestPolygonLabelParsing: verify both parsers (yolo_coco_api and YOLODataset) correctly derive bounding boxes from polygon vertices
- TestDetectSegmentation: verify _detect_segmentation from checkpoint keys and filename-based detection from the -seg suffix
Trains 3 epochs, stops, then resumes from the checkpoint to 5 epochs. Verifies that the checkpoint has seg keys after both phases and that the model still produces masks after resume.
…-rf-detr-and-yolo9 39 add instance segmentation to rf detr and yolo9
Implement detector-agnostic ByteTrack (Zhang et al., ECCV 2022) for multi-object tracking across video frames.

New files:
- libreyolo/tracking/config.py: TrackConfig dataclass
- libreyolo/tracking/kalman_filter.py: 8D Kalman filter (XYAH state)
- libreyolo/tracking/matching.py: IoU distance, score fusion, linear assignment
- libreyolo/tracking/strack.py: single track with lifecycle management
- libreyolo/tracking/tracker.py: ByteTracker with 3-stage association
- tests/unit/test_tracking.py: 30 unit tests

Modified files:
- Results: added optional track_id tensor (backward compatible)
- BaseModel: added track() generator for video tracking
- __init__.py: lazy imports for ByteTracker, TrackConfig
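ByteTrack's core association idea can be sketched independently of the library. This toy version uses greedy IoU matching on raw boxes for brevity; the real ByteTracker matches against Kalman-predicted boxes and uses linear assignment, and all names below are illustrative.

```python
# Toy sketch of ByteTrack's two-stage association: match high-confidence
# detections to existing tracks first, then try to recover tracks with
# the remaining low-confidence detections.

def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def associate(tracks, dets, high=0.6, low=0.1, iou_thresh=0.3):
    high_dets = [d for d in dets if d["score"] >= high]
    low_dets = [d for d in dets if low <= d["score"] < high]
    matches, unmatched = [], list(tracks)
    for pool in (high_dets, low_dets):        # stage 1, then stage 2
        for det in pool:
            best = max(unmatched, key=lambda t: iou(t["box"], det["box"]),
                       default=None)
            if best and iou(best["box"], det["box"]) >= iou_thresh:
                matches.append((best["id"], det))
                unmatched.remove(best)
    return matches, unmatched
```

The point of the second stage is that a briefly occluded object often still produces a low-score detection, which is enough to keep its track ID alive.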
- draw_boxes() now accepts optional track_ids: labels show "#N" in yellow with confidence in gray, colored per track ID instead of per class
- model.track() gains save and save_dir params: saves annotated frames with bounding boxes and track IDs to runs/track/<video_stem>/
- e2e test downloads people-walking.mp4 from the Roboflow CDN (cached at ~/.cache/libreyolo/tracking/) and validates YOLOX, YOLO9, and RF-DETR tracking: ID assignment, consistency, uniqueness, positivity
- add scipy to pyproject.toml as an optional [tracking] dependency
- add TrackConfig.__post_init__ validation (frame_rate > 0, thresholds in [0, 1], track_buffer >= 0, minimum_consecutive_frames >= 1)
- validate that the video file exists before opening it in model.track()
- simplify the minimum_consecutive_frames logic: a single activate() call that conditionally sets is_activated=False (previously activate() was called and then immediately contradicted)
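A hedged sketch of what the __post_init__ validation could look like. Field names follow the commit messages; the defaults are illustrative, and the high >= low check from a later commit is folded in here as well.

```python
# Sketch of dataclass validation via __post_init__, mirroring the checks
# listed above. Defaults are assumptions, not the library's actual values.
from dataclasses import dataclass

@dataclass
class TrackConfig:
    frame_rate: int = 30
    track_high_thresh: float = 0.6
    track_low_thresh: float = 0.1
    track_buffer: int = 30
    minimum_consecutive_frames: int = 1

    def __post_init__(self):
        if self.frame_rate <= 0:
            raise ValueError("frame_rate must be > 0")
        for name in ("track_high_thresh", "track_low_thresh"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1]")
        if self.track_high_thresh < self.track_low_thresh:
            raise ValueError("track_high_thresh must be >= track_low_thresh")
        if self.track_buffer < 0:
            raise ValueError("track_buffer must be >= 0")
        if self.minimum_consecutive_frames < 1:
            raise ValueError("minimum_consecutive_frames must be >= 1")
```

Putting the checks in __post_init__ means every construction path (defaults, kwargs, copies) goes through the same validation with no extra call site.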
- rename conf → track_conf in model.track() to avoid confusion with detection confidence (it controls track_high_thresh, not the detector)
- rename save_dir → output_path to match the predict() naming convention
- add track_high_thresh >= track_low_thresh validation to TrackConfig
- protect scipy imports with try/except and an actionable error message pointing users to pip install libreyolo[tracking]
- add an e2e test for save=True (validates that frames are created as JPEGs)
- add 6 unit tests for TrackConfig validation: rejects zero frame_rate, negative thresholds, thresholds > 1, high < low, negative buffer, and zero minimum_consecutive_frames
- add an e2e test for FileNotFoundError on a missing video
- cache ImageFont via lru_cache (previously three font paths were re-probed on every frame)
- print the save directory path when save=True so the user knows where frames go
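The font-caching fix can be illustrated with a standalone stand-in. The candidate paths below are hypothetical, and the real code presumably returns a PIL ImageFont rather than a path; only the lru_cache pattern is the point.

```python
# Sketch: cache the font-path probe with functools.lru_cache so the
# filesystem is searched once per process instead of once per frame.
from functools import lru_cache
from pathlib import Path
from typing import Optional

FONT_CANDIDATES = [  # hypothetical probe order
    "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
    "/System/Library/Fonts/Helvetica.ttc",
]

@lru_cache(maxsize=1)
def find_font_path() -> Optional[str]:
    for candidate in FONT_CANDIDATES:
        if Path(candidate).exists():
            return candidate
    return None  # caller falls back to a default bitmap font
```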
Users can now pass a video file to model() or model.predict():
- model("video.mp4", stream=True) yields per-frame Results
- model("video.mp4") collects all Results into a list (with RAM warning)
- vid_stride=N processes every Nth frame
- save=True writes annotated output video
- show=True displays frames in a cv2 window
- Results.frame_idx tracks the source frame index
Supported in both PyTorch models (InferenceRunner) and export
backends (BaseBackend). All existing kwargs (conf, iou, classes,
max_det, imgsz) pass through unchanged.
Includes 18 unit tests covering VideoSource, VideoWriter,
is_video_file detection, and Results.frame_idx.
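The vid_stride behaviour above can be sketched as a standalone frame-index generator. This is a stand-in for the real VideoSource (which wraps cv2.VideoCapture and yields decoded frames); the function name is illustrative.

```python
# Sketch: yield every Nth frame index, the index that would become
# Results.frame_idx on each yielded per-frame result.

def strided_frames(total_frames: int, vid_stride: int = 1):
    if vid_stride < 1:
        raise ValueError("vid_stride must be >= 1")
    for idx in range(0, total_frames, vid_stride):
        yield idx

print(list(strided_frames(10, vid_stride=3)))  # → [0, 3, 6, 9]
```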
17 tests covering: stream/list modes, vid_stride, save output, conf/classes filters, frame_idx, orig_shape, detection counts. Uses the same test video as the tracking branch.
Video uploaded to huggingface.co/datasets/LibreYOLO/test-assets so we control the asset and don't depend on third-party CDN availability.
Tests now run against YOLOX (n), YOLO9 (t), and RF-DETR (n). YOLO9 gets the most thorough coverage (stride, save, filters, list mode). YOLOX and RF-DETR get core smoke tests (stream, detections, stride, save). RF-DETR tests are gated behind requires_rfdetr and skip gracefully.
- Extract the shared video inference loop into run_video_inference() in utils/video.py, eliminating ~120 lines of duplication between InferenceRunner and BaseBackend
- Move resolve_video_save_path() to utils/video.py as a shared utility
- Add the context manager protocol (__enter__/__exit__) to VideoSource and VideoWriter for reliable resource cleanup
- Add a re-iteration guard to VideoSource (raises RuntimeError)
- Wrap release() internals in try/finally for safety
- Log a warning on frame decode failures instead of silently skipping
- Release the VideoCapture on the __init__ failure path
- Fix type hints to use the Union[] style matching codebase convention
- Remove unused imports (Optional, Tuple)
- Fix test_save_auto_path to use tmp_path instead of the global filesystem
- Replace hardcoded frame counts with a dynamic query in e2e tests
- Add unit tests: vid_stride > total, re-iteration, context managers, double-release safety
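A standalone sketch of the VideoSource hardening listed above: context-manager protocol, re-iteration guard, and a double-release-safe release(). The real class wraps cv2.VideoCapture; this stand-in iterates an in-memory frame list so it stays runnable.

```python
# Sketch of the hardened VideoSource lifecycle. Names mirror the commit
# message; the body is a simplified stand-in for the cv2-backed class.

class VideoSource:
    def __init__(self, frames):
        self._frames = list(frames)
        self._consumed = False

    def __iter__(self):
        if self._consumed:  # re-iteration guard
            raise RuntimeError("VideoSource can only be iterated once")
        self._consumed = True
        return iter(self._frames)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()  # cleanup runs even if the body raised

    def release(self):
        try:
            self._frames = []  # real impl: cap.release()
        finally:
            self._consumed = True  # safe to call release() twice

with VideoSource([1, 2, 3]) as src:
    print(list(src))  # → [1, 2, 3]
```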
BaseBackend.__call__ was passing color_format to _predict_video, which does not accept it, causing a TypeError on every backend video call. Video frames are always converted BGR→RGB inside the shared loop, so color_format is not needed.
Fix/segmentation tracking fixes