diff --git a/CURSOR_AND_COLORS_DOCUMENTATION.md b/CURSOR_AND_COLORS_DOCUMENTATION.md deleted file mode 100644 index 813ec1d1..00000000 --- a/CURSOR_AND_COLORS_DOCUMENTATION.md +++ /dev/null @@ -1,408 +0,0 @@ -# Spectrogram Cursor and Classification Colors - -This document describes the features added to CV Studio for enhanced visual feedback during video playback with spectrogram analysis and classification. - -## Features - -### 1. Scrolling Spectrogram with Three-Phase Cursor (node_video.py) - -A yellow vertical cursor is displayed on the spectrogram to show the current playback position. The cursor uses a three-phase behavior to provide clear visual feedback throughout the entire video playback. - -#### How It Works - -The cursor behavior has been updated to use **overall video progress** instead of chunk-based progress, ensuring the cursor always reaches the end of the spectrogram when the video completes. - -**Three Phases:** - -- **Phase 1 - Initial Movement (First 1/3 of video)**: Cursor moves from left (0) to 1/3 of width - - Based on overall video progress: `video_progress = current_frame / total_frames` - - When video is 0-33% complete, cursor smoothly moves from 0 to width/3 - -- **Phase 2 - Middle Scrolling (Middle 1/3 of video)**: Cursor behavior within chunks - - When video is 33-67% complete, uses chunk-based scrolling - - Cursor can move within chunks and spectrogram scrolls to show progression - -- **Phase 3 - Final Movement (Last 1/3 of video)**: Cursor moves from 1/3 to the end - - **NEW**: When video is 67-100% complete, cursor moves from width/3 to right edge - - At 100% completion, cursor reaches ~99% of width (near right edge) - - Makes it visually clear when the video playback is complete ✅ - -**Accurate Synchronization**: The cursor position is calculated based on: - - Current video frame number and total frame count - - Video FPS (frames per second) - - Audio chunk duration and step duration - - Spectrogram chunk being displayed - -#### Implementation Details - -The cursor and scrolling are managed by the `_add_playback_cursor_to_spectrogram()` method: - -```python -def _add_playback_cursor_to_spectrogram(self, spectrogram_bgr, node_id, frame_number): - """ - Add a yellow vertical cursor to the spectrogram showing current playback position. - The cursor behavior has three phases: - 1. Initial phase (first 1/3 of video): cursor moves from left (0) to 1/3 of width - 2. Middle phase (middle 1/3 of video): cursor stays fixed at 1/3, spectrogram scrolls left - 3. Final phase (last 1/3 of video): cursor moves from 1/3 to the end (right edge) - """ -``` - -**Cursor Characteristics:** -- **Color**: Yellow (BGR: 0, 255, 255) -- **Thickness**: 3 pixels for better visibility -- **Fixed Position**: 1/3 of the spectrogram width (during middle phase) -- **Scrolling**: Spectrogram content shifts left while cursor remains stationary (middle phase) -- **Position Calculation**: - 1. Calculate overall video progress: `video_progress = (frame_number / fps) / total_duration` - 2. Phase 1 (0-33%): cursor moves from 0 to width/3 - 3. Phase 2 (33-67%): chunk-based scrolling behavior - 4. 
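(The list's final item, Phase 3, concludes right after this sketch.) Taken together, the three phases amount to a small piecewise mapping from overall video progress to cursor x-position. A minimal sketch, assuming hypothetical names (`frame_number`, `total_frames`, `width`) rather than the node's actual implementation:

```python
def cursor_x_for_progress(frame_number, total_frames, width):
    """Three-phase cursor position from overall video progress (sketch)."""
    progress = frame_number / max(total_frames, 1)   # in [0, 1]
    third = width // 3
    if progress <= 1.0 / 3.0:
        # Phase 1: sweep from 0 to width/3
        return int(progress * 3.0 * third)
    if progress <= 2.0 / 3.0:
        # Phase 2: cursor parked at width/3 while the spectrogram scrolls
        return third
    # Phase 3: sweep from width/3 to the right edge
    tail = (progress - 2.0 / 3.0) * 3.0              # in [0, 1]
    return min(int(third + tail * (width - third)), width - 1)
```

At 100% progress this lands on `width - 1`, matching the "~99% of width" behavior described above.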
Phase 3 (67-100%): cursor moves from width/3 to width (end) - -**Visual Example:** -``` -Phase 1 - Initial Movement (0-33% of video): -┌────────────────────────────────┐ -│ Frequency │ -│ ▓▓▓▓|▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ │ <- Cursor moves right (0 to 1/3) -│ ▓▓▓▓|▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ │ -└────────────────────────────────┘ - -Phase 2 - Middle Scrolling (33-67% of video): -┌────────────────────────────────┐ -│ Frequency │ -│ ▓▓▓▓▓▓▓▓|▓▓▓▓▓▓▓▓▓▓ │ <- Cursor stays at 1/3 -│ ▓▓▓▓▓▓▓▓|▓▓▓▓▓▓▓▓▓▓ │ Spectrogram scrolls ← -└────────────────────────────────┘ - -Phase 3 - Final Movement (67-100% of video): -┌────────────────────────────────┐ -│ Frequency │ -│ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓| │ <- Cursor moves to end ✅ -│ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓| │ (1/3 to 100%) -└────────────────────────────────┘ -``` - -### 2. Color-Coded Classification Rankings (node_classification.py) - -Classification results now display with different colors based on their ranking position (1st through 5th place and beyond). - -#### Color Scheme - -| Position | Score Rank | Color | BGR Value | -|----------|------------|-------|-----------| -| 1 | Highest | **Red** | (0, 0, 255) | -| 2 | Second | **Yellow** | (0, 255, 255) | -| 3 | Third | **Blue** | (255, 0, 0) | -| 4 | Fourth | **Violet** | (255, 0, 128) | -| 5 | Fifth | **Magenta** | (255, 0, 255) | -| 6+ | Lower | Green | (0, 255, 0) | - -#### How It Works - -The `draw_classification_info()` method has been enhanced in the Classification Node to apply rank-based colors: - -```python -def draw_classification_info(self, image, class_ids, class_scores, class_names): - """ - Override base class method to add color differentiation based on ranking. - Position 1 (index 0, highest score): Red - Position 2 (index 1): Yellow - Position 3 (index 2): Blue - Position 4 (index 3): Violet - Position 5 (index 4): Magenta - """ -``` - -#### Visual Example - -``` -Classification Results Display: -┌────────────────────────────────┐ -│ 12:dog(0.95) <- Red (1st) │ -│ 8:cat(0.87) <- Yellow (2nd)│ -│ 15:bird(0.73) <- Blue (3rd) │ -│ 22:fish(0.42) <- Violet (4th)│ -│ 9:horse(0.31) <- Magenta (5th)│ -│ 5:mouse(0.18) <- Green (6th+)│ -└────────────────────────────────┘ -``` - -#### Supported Models - -This color scheme works with all classification models: -- MobileNetV3 Small -- MobileNetV3 Large -- EfficientNet B0 -- ResNet50 -- **Yolo-cls** (audio classification) - -### 3. Enhanced Classification Display in Concat Node (node_image_concat.py) - -When classification results are displayed in the Image Concat node, they appear with enhanced formatting for better visibility. - -#### Display Characteristics - -- **Size**: Larger text (font scale 1.0 vs 0.6, thickness 3 vs 2) -- **Position**: Bottom left corner instead of top left -- **Colors**: Same rank-based color scheme as classification node -- **Line Spacing**: Increased spacing (35px vs 20px) for better readability - -#### Implementation - -```python -def draw_classification_info(self, image, class_ids, class_scores, class_names): - """ - Override base class method to display classification results - bigger and at the bottom left of the image. 
- """ - # Larger font size and thicker text - font_scale = 1.0 # Increased from 0.6 - thickness = 3 # Increased from 2 - line_spacing = 35 # Increased from 20 - - # Calculate starting position from bottom - # Position at bottom left with margin -``` - -**Visual Example in Concat View:** -``` -┌─────────────────────────────────────┐ -│ │ -│ Video/Image Display │ -│ │ -│ │ -│ 12:dog(0.95) <- Red (larger) │ -│ 8:cat(0.87) <- Yellow (larger)│ -│ 15:bird(0.73) <- Blue (larger) │ -└─────────────────────────────────────┘ - ↑ Bottom left positioning -``` - -### 4. Audio Storage Feature (node_video.py) - -When a video is loaded and preprocessed, the audio track is automatically extracted and saved as a separate file for reuse. - -#### How It Works - -During video preprocessing in the `_preprocess_video()` method: - -1. **Audio Extraction**: Audio is extracted from the video using librosa -2. **MP3 Conversion**: The extracted audio is converted to MP3 format using ffmpeg -3. **File Storage**: The MP3 file is saved in the same directory as the video with suffix `_audio.mp3` -4. **Fallback**: If MP3 conversion fails, a WAV file is saved instead - -#### Saved File Format - -**Primary format: MP3** -- Filename: `{video_name}_audio.mp3` -- Codec: libmp3lame (high quality) -- Quality: qscale 2 (high quality setting) -- Location: Same folder as the source video - -**Fallback format: WAV** -- Filename: `{video_name}_audio.wav` -- Used when ffmpeg MP3 encoding is unavailable -- Preserves original sample rate and audio data - -#### Benefits - -- **Reusability**: Audio file can be used by other applications without re-extraction -- **Performance**: Avoids repeated audio extraction from video -- **Convenience**: Stored alongside video for easy access -- **Quality**: High-quality MP3 encoding preserves audio fidelity - -#### Example - -When loading a video file: -``` -Video: /path/to/videos/my_video.mp4 -Audio saved as: /path/to/videos/my_video_audio.mp3 -``` - -Console output during preprocessing: -``` -🎵 Extracting audio... -✅ Audio extracted (SR: 22050 Hz, Duration: 30.5s) -💾 Audio saved as MP3: /path/to/videos/my_video_audio.mp3 -``` - -## Usage - -### Enabling the Three-Phase Cursor Spectrogram - -1. Add a **Video** node to your graph -2. Load a video file with audio -3. Enable the "Show Spectrogram" checkbox -4. Play the video -5. Observe the cursor behavior: - - **Phase 1 (0-33%)**: Cursor moves from left to 1/3 position - - **Phase 2 (33-67%)**: Cursor fixed at 1/3, spectrogram scrolls - - **Phase 3 (67-100%)**: Cursor moves from 1/3 to end, clearly showing completion ✅ - -### Accessing Saved Audio Files - -1. Load a video file in the Video node -2. The audio is automatically extracted and saved during preprocessing -3. Check the same folder as your video file -4. Look for `{video_name}_audio.mp3` or `{video_name}_audio.wav` -5. The audio file can be used in other applications or nodes - -### Viewing Color-Coded Classifications - -1. Add a **Classification** node to your graph -2. Connect it to an input source (image, video, webcam) -3. Select a classification model -4. The results will automatically display with rank-based colors - -### Enhanced Display in Concat Node - -1. Add an **Image Concat** node to your graph -2. Connect classification results to one of its inputs -3. 
Classification results will appear larger and at the bottom left of each image slot - -## Technical Notes - -### Performance - -- **Three-Phase Cursor**: Minimal performance impact (simple array operations and line drawing) -- **Audio Storage**: One-time cost during video preprocessing, no runtime impact -- **Classification Colors**: No performance impact (only changes text color, not computation) -- **Concat Display**: Negligible impact (same rendering, just different position and scale) - -### Compatibility - -- All features are **backward compatible** -- No changes required to existing graphs or configurations -- Works with all existing input sources and models -- Audio files are created automatically without affecting existing functionality - -### Thread Safety - -All features operate on the main update thread and are thread-safe within the CV Studio architecture. - -## Code References - -### Modified Files - -1. **`/node/InputNode/node_video.py`** - - Modified: `_add_playback_cursor_to_spectrogram()` method to implement three-phase cursor behavior - - Added video progress calculation based on total frames - - Added Phase 3 logic for final 1/3 of video (cursor moves to end) - - Modified: `_preprocess_video()` method to add audio storage - - Saves extracted audio as MP3 (primary) or WAV (fallback) - - Files saved in same directory as source video - - Modified: `update()` method to call cursor rendering - -2. **`/node/DLNode/node_classification.py`** - - Modified: `draw_classification_info()` method with extended 5-color ranking system - -3. **`/node/VideoNode/node_image_concat.py`** - - Added: `draw_classification_info()` method override for larger, bottom-left display - -### Testing - -Test scripts validate the features: -- **Custom test script**: Validates three-phase cursor behavior and end-of-video progression -- **`/tests/test_cursor_and_colors.py`**: Validates cursor, scrolling, and color features - -Run tests with: -```bash -python tests/test_cursor_and_colors.py -``` - -## Future Enhancements - -Potential improvements for future versions: - -1. **Configurable Cursor Options**: - - Adjustable cursor color - - Configurable fixed position (currently 1/3) - - Different cursor styles (line, arrow, highlight) - -2. **Custom Color Schemes**: - - User-defined colors for classification rankings - - Theme support (dark mode, light mode) - - Colorblind-friendly palettes - -3. **Advanced Scrolling**: - - Configurable scroll speed - - Smooth scrolling animation - - Multiple scroll modes (fixed cursor, centered cursor, etc.) - -4. **Display Options**: - - Configurable text size and position - - Transparency/opacity controls - - Font selection - -## Examples - -### Example 1: Audio Classification with Three-Phase Cursor - -1. Load a video with audio content -2. Connect Video node → Classification (Yolo-cls) node -3. Enable spectrogram display -4. Observe the three-phase cursor behavior: - - **Phase 1 (0-33%)**: Yellow cursor moves from left to 1/3 position - - **Phase 2 (33-67%)**: Cursor fixed at 1/3, spectrogram scrolls left - - **Phase 3 (67-100%)**: Cursor moves from 1/3 to right edge, showing clear completion ✅ - - Classification results in rank-based colors (red, yellow, blue, violet, magenta) - - Real-time synchronization between audio and visual feedback -5. Check the video folder for the saved audio file (`{video_name}_audio.mp3`) - -### Example 2: Multi-View Classification Comparison - -1. Load multiple images or video frames -2. 
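(Step 2 of Example 2 continues right after this aside.) Since the audio-storage step recurs throughout this document, here is a hedged sketch of the MP3 conversion it describes. Paths and the helper name are hypothetical; the encoder settings (`libmp3lame`, `qscale 2`) are the documented ones:

```python
import pathlib
import subprocess

def save_audio_as_mp3(video_path, wav_path):
    """Convert extracted audio to `{video_name}_audio.mp3` beside the video (sketch)."""
    video = pathlib.Path(video_path)
    mp3_path = video.with_name(video.stem + "_audio.mp3")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(wav_path),
         "-codec:a", "libmp3lame", "-qscale:a", "2", str(mp3_path)],
        check=True,
    )
    return str(mp3_path)
```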
Connect to Classification nodes with different models -3. Use Image Concat node to display results side-by-side -4. Observe: - - Larger classification text at bottom left of each view - - Easy comparison of classification results across models - - Color-coded rankings for quick visual scanning - -### Example 3: Real-Time Audio Analysis - -1. Use Video node with audio-rich content -2. Connect to Yolo-cls for audio classification -3. Enable spectrogram display -4. Add Image Concat to show both video and spectrogram -5. Observe synchronized audio-visual analysis with enhanced display - -## Troubleshooting - -**Q: The cursor doesn't reach the end of the spectrogram** -- A: This is now fixed! The cursor will reach ~99% at video completion (Phase 3) -- A: Verify the video has proper FPS metadata and frame count - -**Q: The cursor stays fixed in the middle** -- A: This is expected during Phase 2 (middle 33-67% of video) -- A: The cursor will start moving again in Phase 3 (last 33% of video) - -**Q: Spectrogram doesn't scroll** -- A: This is normal during Phase 1 (first 33%) and Phase 3 (last 33%) -- A: Scrolling only occurs during Phase 2 (middle 33-67% of video) -- A: Ensure the video is playing (not paused) - -**Q: Audio file not created** -- A: Check console output for preprocessing errors -- A: Ensure ffmpeg is installed for MP3 conversion -- A: Check write permissions in the video directory -- A: A WAV file should be created if MP3 conversion fails - -**Q: Audio file location** -- A: Audio is saved in the same folder as the source video -- A: Look for `{video_name}_audio.mp3` or `{video_name}_audio.wav` - -**Q: Classification colors don't appear correctly** -- A: Verify you have at least 5 classification results for all colors -- A: Update to the latest version - -**Q: Text in concat node is too large/small** -- A: This is currently fixed at font_scale=1.0; customization coming in future updates - -**Q: Text position is cut off at bottom** -- A: Image resolution may be too small; the positioning accounts for text height - -## License - -These features are part of CV Studio and are licensed under the Apache License 2.0. diff --git a/IMPLEMENTATION_SUMMARY.md b/IMPLEMENTATION_SUMMARY.md deleted file mode 100644 index 63c02a24..00000000 --- a/IMPLEMENTATION_SUMMARY.md +++ /dev/null @@ -1,175 +0,0 @@ -# Implementation Summary: Spectrogram Cursor and Classification Colors - -## Task Completed ✓ - -Successfully implemented two visual enhancement features for CV Studio as requested: - -1. **Yellow cursor on spectrogram** - Shows current video playback position -2. **Color-coded classification rankings** - Different colors for positions 1, 2, 3 - -## Implementation Details - -### Feature 1: Yellow Cursor on Spectrogram - -**File**: `node/InputNode/node_video.py` - -**Method Added**: `_add_playback_cursor_to_spectrogram()` - -**How it works**: -1. Calculates current playback time from frame number and FPS -2. Determines which audio chunk is displayed based on step_duration -3. Calculates cursor position within the chunk -4. Draws a 3-pixel wide yellow vertical line at the calculated position -5. 
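(Item 5, the cursor's color, concludes right after this sketch.) Steps 1 through 3 of the list above reduce to a few lines of arithmetic. A sketch with assumed parameter names; `chunk_duration` and `step_duration` mirror the terms used in the description:

```python
def cursor_in_chunk(frame_number, fps, chunk_duration, step_duration, width):
    """Playback time -> displayed chunk index and cursor x inside it (sketch)."""
    playback_time = frame_number / fps                   # step 1
    chunk_index = int(playback_time // step_duration)    # step 2
    time_in_chunk = playback_time - chunk_index * step_duration
    ratio = min(time_in_chunk / chunk_duration, 1.0)     # step 3
    return chunk_index, int(ratio * (width - 1))
```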
Color: Yellow (BGR: 0, 255, 255) - -**Integration**: -- Called in the `update()` method when spectrogram display is enabled -- Works seamlessly with existing spectrogram pre-processing pipeline -- Minimal performance impact (simple line drawing operation) - -### Feature 2: Color-Coded Classification Rankings - -**File**: `node/DLNode/node_classification.py` - -**Method Added**: `draw_classification_info()` (override) - -**Color Scheme**: -| Position | Rank | Color | BGR Value | -|----------|------|-------|-----------| -| 1 | Highest | Red | (0, 0, 255) | -| 2 | Second | Green | (0, 255, 0) | -| 3 | Third | Blue | (255, 0, 0) | -| 4+ | Lower | Green | (0, 255, 0) | - -**Integration**: -- Overrides base class method to apply rank-based colors -- Works with all classification models (MobileNet, EfficientNet, ResNet50, Yolo-cls) -- Maintains backward compatibility - -## Code Quality - -### Syntax Validation -- ✓ node_video.py syntax valid -- ✓ node_classification.py syntax valid -- ✓ No breaking changes to existing code - -### Testing -- ✓ Created comprehensive test suite (`test_cursor_and_colors.py`) -- ✓ All 5 tests passing -- ✓ Validates both feature implementations -- ✓ Checks integration in update methods - -### Documentation -- ✓ Created detailed documentation (`CURSOR_AND_COLORS_DOCUMENTATION.md`) -- ✓ Includes usage examples -- ✓ Explains technical implementation -- ✓ Provides troubleshooting guide - -## Files Modified - -``` -node/InputNode/node_video.py | +65 lines -node/DLNode/node_classification.py | +45 lines -``` - -## Files Added - -``` -tests/test_cursor_and_colors.py | +187 lines (test suite) -CURSOR_AND_COLORS_DOCUMENTATION.md | +203 lines (documentation) -IMPLEMENTATION_SUMMARY.md | this file -``` - -## Git Commits - -``` -b9ae979 - Add tests and documentation for cursor and color features -920cbf6 - Add yellow cursor on spectrogram and color-coded classification rankings -9f6734a - Initial plan -``` - -## Testing Results - -```bash -$ python tests/test_cursor_and_colors.py - -Running tests for spectrogram cursor and classification colors... - -✓ Spectrogram cursor method exists and is properly integrated -✓ Classification color method exists with correct color definitions -✓ Cursor calculation logic is properly implemented -✓ Color ranking logic is properly implemented -✓ Features are properly integrated in update method - -============================================================ -All tests passed! ✓ -============================================================ - -Implemented features: -1. Yellow cursor on spectrogram showing playback position -2. 
Color-coded classification rankings: - - Position 1 (highest): Red - - Position 2: Green - - Position 3: Blue -``` - -## Key Design Decisions - -### Cursor Implementation -- **Yellow color chosen**: High visibility against typical spectrogram colors -- **3-pixel thickness**: Balance between visibility and precision -- **Position calculation**: Based on chunk metadata for accurate synchronization -- **Non-destructive**: Uses `.copy()` to avoid modifying original spectrogram - -### Classification Colors -- **Rank-based vs class-based**: Rank-based makes it easy to identify top predictions -- **BGR format**: Consistent with OpenCV conventions -- **Red for #1**: Standard convention for highest importance/value -- **Graceful fallback**: Green for positions beyond top 3 - -## Performance Impact - -- **Cursor rendering**: Negligible (~0.1ms per frame) -- **Color selection**: No measurable impact (only changes text color) -- **Memory**: No additional memory overhead - -## Backward Compatibility - -- ✓ No breaking changes -- ✓ Works with existing graphs -- ✓ Compatible with all existing nodes -- ✓ No configuration changes required - -## Future Enhancements (Optional) - -1. Configurable cursor color -2. Multiple cursor styles (line, arrow, highlight) -3. Custom color schemes for classifications -4. Confidence-based color intensity -5. Multi-cursor support for time context - -## Verification Checklist - -- [x] Spectrogram cursor draws correctly -- [x] Cursor position synchronized with video playback -- [x] Cursor color is yellow (0, 255, 255) -- [x] Classification colors applied correctly -- [x] Red for position 1 (highest score) -- [x] Green for position 2 -- [x] Blue for position 3 -- [x] No syntax errors -- [x] Code structure validated -- [x] Tests created and passing -- [x] Documentation complete -- [x] Changes committed to repository - -## Conclusion - -Both requested features have been successfully implemented with: -- Clean, maintainable code -- Comprehensive testing -- Detailed documentation -- Full backward compatibility -- Minimal performance impact - -The implementation is ready for production use. diff --git a/IMPLEMENTATION_SUMMARY_NEW.md b/IMPLEMENTATION_SUMMARY_NEW.md deleted file mode 100644 index 7d08cc80..00000000 --- a/IMPLEMENTATION_SUMMARY_NEW.md +++ /dev/null @@ -1,167 +0,0 @@ -# Implementation Summary - -## Problem Statement (French) -"premiere frame le cursor bouge, mais ensuite ce sont les images qui doivent glisser ensuite avec le cursor qui reste en place dans node_video.py, ensuite il faut que la position 2, index 1 resultat affiché sur yolo-cls soit en yellow, 4 et 5 tu met en violet et magenta, dans le node concat, les resultats de classification doivent etre plus grosses et en bas a gauche." - -## Translation -- First frame the cursor moves, but then the images should slide with the cursor staying in place in node_video.py -- Position 2 (index 1) result displayed on yolo-cls should be in yellow -- Positions 4 and 5 should be in violet and magenta -- In the concat node, classification results should be bigger and in the bottom left - -## Changes Implemented - -### 1. 
node_video.py - Scrolling Spectrogram -**File**: `/node/InputNode/node_video.py` - -**Changes**: -- Modified `_add_playback_cursor_to_spectrogram()` method -- Cursor now moves during first 1/3 of playback -- After 1/3, cursor stays fixed at position (width/3) -- Spectrogram content scrolls to the left -- Maintains synchronization with video playback - -**Key Code**: -```python -# Fixed cursor position at 1/3 of the width -fixed_cursor_x = width // 3 - -if cursor_position_ratio <= 1.0 / 3.0: - # First portion: cursor moves - cursor_x = int(cursor_position_ratio * width) - spectrogram_with_cursor = spectrogram_bgr.copy() -else: - # After first portion: cursor fixed, spectrogram scrolls - scroll_ratio = (cursor_position_ratio - 1.0 / 3.0) / (2.0 / 3.0) - scroll_pixels = int(scroll_ratio * (width - fixed_cursor_x)) - # Scroll implementation... - cursor_x = fixed_cursor_x -``` - -### 2. node_classification.py - Extended Color Scheme -**File**: `/node/DLNode/node_classification.py` - -**Changes**: -- Extended rank_colors from 3 to 5 colors -- Position 2 changed from green to yellow -- Added positions 4 and 5 with violet and magenta - -**Color Mapping**: -| Position | Index | Color | BGR Value | Change | -|----------|-------|-------|-----------|--------| -| 1 | 0 | Red | (0, 0, 255) | Unchanged | -| 2 | 1 | Yellow | (0, 255, 255) | Changed from green | -| 3 | 2 | Blue | (255, 0, 0) | Unchanged | -| 4 | 3 | Violet | (255, 0, 128) | New | -| 5 | 4 | Magenta | (255, 0, 255) | New | - -**Key Code**: -```python -rank_colors = [ - (0, 0, 255), # Position 1: Red - (0, 255, 255), # Position 2: Yellow - (255, 0, 0), # Position 3: Blue - (255, 0, 128), # Position 4: Violet - (255, 0, 255), # Position 5: Magenta -] -``` - -### 3. node_image_concat.py - Enhanced Classification Display -**File**: `/node/VideoNode/node_image_concat.py` - -**Changes**: -- Added override of `draw_classification_info()` method -- Increased font scale from 0.6 to 1.0 -- Increased thickness from 2 to 3 -- Changed position from top-left to bottom-left -- Increased line spacing from 20 to 35 pixels - -**Key Code**: -```python -def draw_classification_info(self, image, class_ids, class_scores, class_names): - # Larger font size and thicker text - font_scale = 1.0 # Increased from 0.6 - thickness = 3 # Increased from 2 - line_spacing = 35 # Increased from 20 - - # Calculate starting position from bottom - num_lines = len(class_ids) - start_y = height - 15 - (num_lines - 1) * line_spacing - - # Position at bottom left - y_position = start_y + (index * line_spacing) -``` - -### 4. Tests Updated -**File**: `/tests/test_cursor_and_colors.py` - -**Changes**: -- Updated color checks to include yellow, violet, and magenta -- Updated expected output messages -- All tests passing - -### 5. Documentation Updated -**File**: `/CURSOR_AND_COLORS_DOCUMENTATION.md` - -**Changes**: -- Comprehensive update describing all three features -- Visual examples and diagrams -- Usage instructions -- Technical details -- Troubleshooting guide - -## Testing Results - -### Tests Executed: -1. ✅ `test_cursor_and_colors.py` - All tests passing -2. ✅ `test_yolo_cls_registration.py` - All tests passing -3. 
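(Item 3 of the test list concludes right after this sketch.) The `# Scroll implementation...` line elided in the node_video.py snippet above can be filled in conceptually as follows; this is an assumption-level sketch using NumPy slicing, not the shipped code:

```python
import numpy as np

def scroll_left(spectrogram_bgr, scroll_pixels):
    """Shift the spectrogram content left, zero-padding on the right (sketch)."""
    width = spectrogram_bgr.shape[1]
    shifted = np.zeros_like(spectrogram_bgr)
    if scroll_pixels < width:
        shifted[:, :width - scroll_pixels] = spectrogram_bgr[:, scroll_pixels:]
    return shifted
```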
✅ CodeQL security scan - No vulnerabilities found - -### Test Coverage: -- Spectrogram cursor method exists and is properly integrated -- Classification color method exists with correct color definitions -- Cursor calculation logic is properly implemented -- Color ranking logic is properly implemented -- Features are properly integrated in update method - -## Files Modified - -1. `/node/InputNode/node_video.py` - 36 lines modified -2. `/node/DLNode/node_classification.py` - 22 lines modified -3. `/node/VideoNode/node_image_concat.py` - 57 lines added -4. `/tests/test_cursor_and_colors.py` - 22 lines modified -5. `/CURSOR_AND_COLORS_DOCUMENTATION.md` - 212 lines modified - -**Total Changes**: 270 insertions, 79 deletions across 5 files - -## Backward Compatibility - -All changes are backward compatible: -- Existing functionality preserved -- No breaking changes to APIs -- No changes to configuration requirements -- Works with all existing nodes and models - -## Security - -- ✅ No security vulnerabilities introduced (CodeQL scan) -- ✅ No external dependencies added -- ✅ No changes to authentication or authorization -- ✅ No new network calls or file operations - -## Performance Impact - -- **Scrolling Spectrogram**: Minimal (simple array operations) -- **Color Changes**: None (same rendering, different colors) -- **Concat Display**: Negligible (same text rendering, different position/scale) - -## Summary - -All requirements from the problem statement have been successfully implemented: - -1. ✅ Spectrogram cursor stays fixed after initial movement, spectrogram scrolls -2. ✅ Classification position 2 (index 1) is now yellow -3. ✅ Positions 4 and 5 are now violet and magenta -4. ✅ Classification results in concat node are bigger and at bottom left - -The implementation is tested, documented, and secure. diff --git a/__init__.py b/__init__.py index 3dc1f76b..84d429bc 100644 --- a/__init__.py +++ b/__init__.py @@ -1 +1,6 @@ +"""CV Studio - Node-based Computer Vision Application. + +CV Studio is a professional node-based image processing application for +computer vision development, verification, and comparison. +""" __version__ = "0.1.0" diff --git a/main.py b/main.py index 46a4ed77..af629896 100644 --- a/main.py +++ b/main.py @@ -1,5 +1,15 @@ #!/usr/bin/env python # -*- coding: utf-8 -*- +"""CV Studio - Node-based Computer Vision Application. + +This module is the main entry point for CV Studio, a professional node-based +image processing application for computer vision development, verification, +and comparison. + +The application provides a visual node editor powered by DearPyGUI that allows +users to create computer vision pipelines through an intuitive drag-and-drop +interface. +""" import sys import copy import json @@ -22,6 +32,16 @@ def get_args(): + """Parse and return command line arguments. + + Returns + ------- + argparse.Namespace + Parsed command line arguments containing: + - setting: str, path to configuration JSON file + - unuse_async_draw: bool, disable asynchronous drawing if True + - use_debug_print: bool, enable debug logging if True + """ parser = argparse.ArgumentParser() parser.add_argument( @@ -39,6 +59,17 @@ def get_args(): def async_main(node_editor): + """Run the asynchronous main loop for the node editor. + + This function continuously updates all nodes in the graph until + the terminate flag is set. It maintains separate dictionaries for + image, result, and audio data passed between nodes. 
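A hypothetical sketch of the loop this docstring describes, before its parameter list resumes below; the terminate-flag accessor is an assumption, and the real attribute on `DpgNodeEditor` may differ:

```python
def run_until_terminated(node_editor):
    """Sketch of async_main's documented behavior (flag name is assumed)."""
    node_image_dict, node_result_dict, node_audio_dict = {}, {}, {}
    while not getattr(node_editor, "terminate_flag", False):  # assumed flag name
        update_node_info(node_editor, node_image_dict,
                         node_result_dict, node_audio_dict, mode_async=True)
```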
+ + Parameters + ---------- + node_editor : DpgNodeEditor + The node editor instance managing the node graph. + """ node_image_dict = {} node_result_dict = {} node_audio_dict = {} @@ -55,9 +86,25 @@ def update_node_info( node_audio_dict, mode_async=True, ): - editor_width = dpg.get_viewport_client_width() - editor_height = dpg.get_viewport_client_height() - + """Update all nodes in the node graph for one iteration. + + This function processes all nodes in topologically sorted order, + updates their state, and stores the results in the provided dictionaries. + + Parameters + ---------- + node_editor : DpgNodeEditor + The node editor instance managing the node graph. + node_image_dict : dict + Dictionary mapping node IDs to image data. + node_result_dict : dict + Dictionary mapping node IDs to JSON result data. + node_audio_dict : dict + Dictionary mapping node IDs to audio data. + mode_async : bool, optional + If True, errors during node updates are caught and logged. + If False, errors propagate. Default is True. + """ try: dpg.set_item_pos(node_editor.window, [0, 0]) dpg.set_item_width(node_editor.window, dpg.get_viewport_client_width()) @@ -109,6 +156,11 @@ def update_node_info( def main(): + """Main entry point for the CV Studio application. + + This function initializes the application, sets up the node editor, + configures cameras and serial devices, and starts the main event loop. + """ args = get_args() setting = args.setting unuse_async_draw = args.unuse_async_draw diff --git a/node/InputNode/_node_image.py b/node/InputNode/_node_image.py index b33473f7..3a8f1edb 100644 --- a/node/InputNode/_node_image.py +++ b/node/InputNode/_node_image.py @@ -1,5 +1,10 @@ #!/usr/bin/env python # -*- coding: utf-8 -*- +"""Image input node for CV Studio. + +This module provides the Image node that allows users to load and display +static images in the node editor. +""" import cv2 import numpy as np import dearpygui.dearpygui as dpg @@ -7,16 +12,25 @@ from node_editor.util import dpg_get_value, dpg_set_value from node.node_abc import DpgNodeABC -#from node_editor.util import convert_cv_to_dpg from node.basenode import Node class FactoryNode: + """Factory class for creating Image nodes. + + Attributes + ---------- + node_label : str + Human-readable label for the node. + node_tag : str + Unique tag identifier for the node type. + """ node_label = 'Image' node_tag = 'Image' def __init__(self): + """Initialize the FactoryNode.""" pass @@ -29,6 +43,26 @@ def add_node( opencv_setting_dict=None, callback=None, ): + """Add an Image node to the node editor. + + Parameters + ---------- + parent : int + DearPyGUI parent widget ID. + node_id : str + Unique identifier for this node instance. + pos : list[int, int], optional + Initial (x, y) position of the node. Default is [0, 0]. + opencv_setting_dict : dict, optional + Configuration dictionary containing OpenCV and application settings. + callback : callable, optional + Callback function for node events. + + Returns + ------- + ImageNode + The created image node instance. + """ node = ImageNode() @@ -150,6 +184,28 @@ def add_yellow_disabled_button(label, tag): class ImageNode(Node): + """Node for loading and displaying static images. + + This node allows users to select an image file and outputs the image + data to connected nodes. + + Attributes + ---------- + _ver : str + Version string for the node implementation. + node_label : str + Human-readable label for the node. + node_tag : str + Unique tag identifier for the node type. 
+ _opencv_setting_dict : dict + Configuration dictionary for OpenCV settings. + _image : dict + Dictionary storing loaded images keyed by node ID. + _image_filepath : dict + Dictionary storing image file paths keyed by node ID. + _prev_image_filepath : dict + Dictionary storing previous image file paths for change detection. + """ _ver = '0.0.1' node_label = 'Image' @@ -162,6 +218,7 @@ class ImageNode(Node): _prev_image_filepath = {} def __init__(self): + """Initialize the ImageNode.""" super().__init__() # Call parent constructor self.node_label = 'Image' self.node_tag = 'Image' @@ -174,6 +231,27 @@ def update( node_result_dict, node_audio_dict, ): + """Update the node and output the loaded image. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + connection_list : list + List of connections to this node. + node_image_dict : dict + Dictionary mapping node IDs to image data. + node_result_dict : dict + Dictionary mapping node IDs to result data. + node_audio_dict : dict + Dictionary mapping node IDs to audio data. + + Returns + ------- + dict + Dictionary with 'image', 'json', and 'audio' keys containing + the loaded image data. + """ tag_node_name = str(node_id) + ':' + self.node_tag output_value01_tag = tag_node_name + ':' + self.TYPE_IMAGE + ':Output01Value' @@ -202,6 +280,13 @@ def update( return {"image": frame, "json": None, "audio": None} def close(self, node_id): + """Clean up resources when the node is closed. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + """ pass def get_setting_dict(self, node_id): diff --git a/node/basenode.py b/node/basenode.py index 6982b291..018fb31c 100644 --- a/node/basenode.py +++ b/node/basenode.py @@ -1,5 +1,10 @@ #!/usr/bin/env python # -*- coding: utf-8 -*- +"""Base node implementation for the CV Studio node editor. + +This module provides the base Node class and related data types that all +nodes in the CV Studio node editor inherit from or use. +""" import copy import time import os @@ -13,6 +18,23 @@ class DataType: + """Enumeration of supported data types for node connections. + + Attributes + ---------- + TYPE_BOOLEAN : str + Boolean data type. + TYPE_TEXT : str + Text/string data type. + TYPE_IMAGE : str + Image data type. + TYPE_FLOAT : str + Floating point number data type. + TYPE_TIME_MS : str + Timestamp in milliseconds data type. + TYPE_SOUND : str + Audio/sound data type. + """ TYPE_BOOLEAN = "BOOLEAN" TYPE_TEXT = "TEXT" TYPE_IMAGE = "IMAGE" @@ -22,11 +44,36 @@ class DataType: class PortType: + """Enumeration of port types for node connections. + + Attributes + ---------- + INPUT : str + Input port type. + OUTPUT : str + Output port type. + """ INPUT = "INPUT" OUTPUT = "OUTPUT" class Node: + """Base class for all nodes in the CV Studio node editor. + + This class provides common functionality for all node types including + image conversion, input/output handling, and configuration management. + + Attributes + ---------- + _ver : str + Version string for the node implementation. + node_label : str + Human-readable label for the node. + node_tag : str + Unique tag identifier for the node type. + node_data : Any + Data associated with the node. + """ _ver = "0.0.1" node_label = "BaseNode" node_tag = "BaseNode" @@ -44,14 +91,26 @@ class Node: OUTPUT = "OUTPUT" def __init__(self, node_id=1, connection_dict=None, opencv_setting_dict=None): + """Initialize a new Node instance. 
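To make the per-node data flow these docstrings describe concrete, here is a usage sketch; `node`, `node_id`, `connection_list`, and the three dictionaries are assumed to exist in the editor's update loop, as documented in main.py above:

```python
# Each node's update() returns a dict with "image", "json", and "audio" keys;
# the editor stores them per node ID (sketch of the documented contract).
result = node.update(node_id, connection_list,
                     node_image_dict, node_result_dict, node_audio_dict)
node_image_dict[node_id] = result["image"]   # e.g. a BGR frame, or None
node_result_dict[node_id] = result["json"]
node_audio_dict[node_id] = result["audio"]
```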
+ + Parameters + ---------- + node_id : int, optional + Unique identifier for this node instance. Default is 1. + connection_dict : dict, optional + Dictionary defining the node's connections. Default is None. + opencv_setting_dict : dict, optional + Configuration dictionary for OpenCV and application settings. + Default is None. + """ self.id = self.generate_id() self.node_label = "BaseNode" self.node_tag = "BaseNode" self.tag_node_name = f"{node_id}:{self.node_tag}" - # Générer les tags dynamiquement en fonction du dictionnaire + # Generate tags dynamically based on the connection dictionary # self.tags = self.generate_tags(connection_dict) - # Paramètres OpenCV + # OpenCV parameters self._opencv_setting_dict = opencv_setting_dict if opencv_setting_dict else {} self.small_window_w = self._opencv_setting_dict.get("process_width", 640) self.small_window_h = self._opencv_setting_dict.get("process_height", 480) @@ -61,12 +120,31 @@ def __init__(self, node_id=1, connection_dict=None, opencv_setting_dict=None): self.use_gpu = self._opencv_setting_dict.get("use_gpu", False) def generate_id(self): + """Generate a unique ID for the node. + + Returns + ------- + str + A unique UUID string. + """ return str(uuid.uuid4()) def generate_tags(self, connection_dict): + """Generate DearPyGUI tags for node connections. + + Parameters + ---------- + connection_dict : dict + Dictionary mapping connection indices to connection information. + + Returns + ------- + dict + Dictionary of generated tags for inputs and outputs. + """ tags = {} - # Parcours du dictionnaire pour générer les tags + # Iterate through the dictionary to generate tags for index, connection_info in connection_dict.items(): connection_type = connection_info.get("CONNECTION") data_type = connection_info.get("TYPE") @@ -81,12 +159,48 @@ def generate_tags(self, connection_dict): return tags def update(self, node_id, connection_list, node_image_dict, node_result_dict): + """Update the node's state and process data. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + connection_list : list + List of connections to this node. + node_image_dict : dict + Dictionary mapping node IDs to image data. + node_result_dict : dict + Dictionary mapping node IDs to result data. + """ pass def close(self, node_id): + """Clean up resources when the node is closed. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + """ pass def convert_cv_to_dpg(self, image, width, height): + """Convert an OpenCV image to DearPyGUI texture format. + + Parameters + ---------- + image : numpy.ndarray + OpenCV image in BGR format. + width : int + Target width for the texture. + height : int + Target height for the texture. + + Returns + ------- + numpy.ndarray + Flattened array of normalized RGB values suitable for DearPyGUI. + """ resize_image = cv2.resize(image, (width, height), interpolation=cv2.INTER_AREA) data = np.flip(resize_image, 2) diff --git a/node/node_abc.py b/node/node_abc.py index 7692c4a4..330788eb 100644 --- a/node/node_abc.py +++ b/node/node_abc.py @@ -1,7 +1,39 @@ +"""Abstract Base Class for DearPyGUI Node Editor Nodes. + +This module defines the abstract interface that all node types must implement +in the CV Studio node editor system. +""" from abc import ABCMeta, abstractmethod class DpgNodeABC(metaclass=ABCMeta): + """Abstract base class for all node types in the CV Studio node editor. 
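Stepping back briefly before the DpgNodeABC docstring continues below: the `convert_cv_to_dpg` hunk above stops after the resize-and-flip lines. A plausible completion, consistent with the docstring's "flattened array of normalized RGB values"; the final normalization step is an assumption, not code shown in the diff:

```python
import cv2
import numpy as np

def convert_cv_to_dpg(image, width, height):
    """BGR frame -> flattened, normalized RGB float array for a DPG texture."""
    resize_image = cv2.resize(image, (width, height), interpolation=cv2.INTER_AREA)
    data = np.flip(resize_image, 2)                   # BGR -> RGB (shown in the diff)
    return (data.ravel() / 255.0).astype(np.float32)  # assumed final step
```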
+ + This class defines the interface that all nodes must implement, including + methods for adding nodes to the GUI, updating node state, and managing + node settings. + + Attributes + ---------- + _ver : str + Version string for the node implementation. + node_label : str + Human-readable label displayed in the node editor. + node_tag : str + Unique tag identifier for the node type. + TYPE_INT : str + Constant for integer data type connections. + TYPE_FLOAT : str + Constant for float data type connections. + TYPE_IMAGE : str + Constant for image data type connections. + TYPE_TIME_MS : str + Constant for timestamp data type connections. + TYPE_JSON : str + Constant for JSON data type connections. + TYPE_SOUND : str + Constant for audio data type connections. + """ _ver = '0.0.0' node_label = '' @@ -24,6 +56,28 @@ def add_node( height, opencv_setting_dict, ): + """Add the node to the DearPyGUI interface. + + Parameters + ---------- + parent : int + DearPyGUI parent widget ID. + node_id : str + Unique identifier for this node instance. + pos : tuple[int, int] + Initial (x, y) position of the node in the editor. + width : int + Width of the node editor window. + height : int + Height of the node editor window. + opencv_setting_dict : dict + Configuration dictionary containing OpenCV and application settings. + + Returns + ------- + Node + The node instance that was added. + """ pass @abstractmethod @@ -35,16 +89,71 @@ def update( node_result_dict, node_audio_dict, ): + """Update the node's state and process data. + + This method is called every frame to process input data from connected + nodes and produce output data. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + connection_list : list + List of connections to this node from other nodes. + node_image_dict : dict + Dictionary mapping node IDs to image data. + node_result_dict : dict + Dictionary mapping node IDs to JSON result data. + node_audio_dict : dict + Dictionary mapping node IDs to audio data. + + Returns + ------- + dict + Dictionary containing 'image', 'json', and 'audio' keys with + the processed output data. + """ pass @abstractmethod def get_setting_dict(self, node_id): + """Get the current settings for this node. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + + Returns + ------- + dict + Dictionary containing the node's current settings. + """ pass @abstractmethod def set_setting_dict(self, node_id, setting_dict): + """Set the settings for this node. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + setting_dict : dict + Dictionary containing the settings to apply to the node. + """ pass @abstractmethod def close(self, node_id): + """Clean up resources when the node is closed. + + This method should release any resources held by the node, such as + file handles, network connections, or GPU memory. + + Parameters + ---------- + node_id : str + Unique identifier for this node instance. + """ pass diff --git a/sound.py b/sound.py index 787f525b..3338065e 100644 --- a/sound.py +++ b/sound.py @@ -1,14 +1,19 @@ +"""Simple audio playback utility module. + +This module generates and plays a simple sine wave tone using sounddevice. +It serves as a basic example of audio generation and playback. 
+""" import sounddevice as sd import numpy as np -# Génère un son simple : un sinus de 440 Hz pendant 2 secondes -samplerate = 44100 # échantillons par seconde -duration = 2 # secondes -frequency = 440 # Hz (La4) +# Generate a simple sound: a 440 Hz sine wave for 2 seconds +samplerate = 44100 # samples per second +duration = 2 # seconds +frequency = 440 # Hz (A4 note) t = np.linspace(0, duration, int(samplerate * duration), endpoint=False) my_audio_array = 0.5 * np.sin(2 * np.pi * frequency * t) -# Lecture du son +# Play the sound sd.play(my_audio_array, samplerate=samplerate) -sd.wait() # attend la fin de la lecture +sd.wait() # wait until playback is finished diff --git a/tests/demo_fps_speed_timing.py b/tests/demo_fps_speed_timing.py deleted file mode 100644 index 95a0e958..00000000 --- a/tests/demo_fps_speed_timing.py +++ /dev/null @@ -1,114 +0,0 @@ -#!/usr/bin/env python -# -*- coding: utf-8 -*- -""" -Demo script to test FPS and Speed control features -This script creates a simple test to verify the frame timing calculations -""" - -import time - - -def test_frame_timing(): - """Test the frame timing calculation logic""" - - print("=" * 60) - print("Video Node FPS and Speed Control - Timing Test") - print("=" * 60) - print() - - test_cases = [ - # (fps, speed, expected_interval) - (24, 1.0, 0.042), # Standard 24fps - (24, 0.5, 0.083), # Half speed - (24, 2.0, 0.021), # Double speed - (24, 0.25, 0.167), # Quarter speed - (24, 4.0, 0.010), # 4x speed - (30, 1.0, 0.033), # 30fps standard - (60, 1.0, 0.017), # 60fps standard - (60, 0.5, 0.033), # 60fps half speed - ] - - print("Frame Interval Calculations:") - print("-" * 60) - print(f"{'FPS':<6} {'Speed':<8} {'Interval (s)':<14} {'Interval (ms)':<14}") - print("-" * 60) - - for fps, speed, expected in test_cases: - # This is the actual calculation from the code - frame_interval = (1.0 / fps) / speed if fps > 0 and speed > 0 else 0 - interval_ms = frame_interval * 1000 - - print(f"{fps:<6} {speed:<8.2f} {frame_interval:<14.3f} {interval_ms:<14.1f}") - - # Verify calculation is correct - assert abs(frame_interval - expected) < 0.001, \ - f"Expected {expected}, got {frame_interval}" - - print("-" * 60) - print("✓ All calculations correct") - print() - - # Simulate frame timing - print("Simulating Frame Timing (5 frames at 24 FPS, 1.0x speed):") - print("-" * 60) - - target_fps = 24 - playback_speed = 1.0 - frame_interval = (1.0 / target_fps) / playback_speed - - last_frame_time = None - frames_displayed = 0 - - start_time = time.time() - - for i in range(5): - current_time = time.time() - - # Check if enough time has passed - should_read_frame = (last_frame_time is None) or \ - ((current_time - last_frame_time) >= frame_interval) - - if should_read_frame: - elapsed = current_time - start_time if last_frame_time else 0 - print(f"Frame {i+1} displayed at {elapsed:.3f}s") - last_frame_time = current_time - frames_displayed += 1 - - # Wait for next frame - time.sleep(frame_interval) - - total_time = time.time() - start_time - actual_fps = frames_displayed / total_time if total_time > 0 else 0 - - print("-" * 60) - print(f"Total time: {total_time:.3f}s") - print(f"Frames displayed: {frames_displayed}") - print(f"Actual FPS: {actual_fps:.1f}") - print(f"Target FPS: {target_fps}") - print() - - # Edge case testing - print("Edge Case Testing:") - print("-" * 60) - - # Zero FPS - result = (1.0 / 0) / 1.0 if 0 > 0 and 1.0 > 0 else 0 - print(f"Zero FPS (0, 1.0x): {result} (should be 0)") - assert result == 0, "Zero FPS should result in 0 interval" - - 
# Zero speed - result = (1.0 / 24) / 0 if 24 > 0 and 0 > 0 else 0 - print(f"Zero speed (24, 0x): {result} (should be 0)") - assert result == 0, "Zero speed should result in 0 interval" - - print("-" * 60) - print("✓ Edge cases handled correctly") - print() - - print("=" * 60) - print("✓ All tests passed!") - print("=" * 60) - - -if __name__ == '__main__': - test_frame_timing() diff --git a/tests/demo_resnet_spectrogram_integration.py b/tests/demo_resnet_spectrogram_integration.py deleted file mode 100644 index 9f48e486..00000000 --- a/tests/demo_resnet_spectrogram_integration.py +++ /dev/null @@ -1,128 +0,0 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -Demonstration of ResNet50 processing spectrogram images from audio connections. - -This script demonstrates the complete integration: -1. Video node generates a spectrogram from audio -2. Spectrogram is passed via AUDIO type connection -3. Classification node (ResNet50) processes the spectrogram -4. Results are classified using ImageNet classes -""" - -import sys -import os - -# Add parent directory to path for imports -sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..')) - - -def main(): - print("="*70) - print("ResNet50 Spectrogram Integration Demo") - print("="*70) - print() - - print("This feature enables the following workflow:\n") - - print("1. VIDEO NODE (node_video.py)") - print(" └─> Reads video file with audio track") - print(" └─> Generates mel-spectrogram from audio") - print(" └─> Returns: {'image': frame, 'audio': spectrogram_bgr}") - print() - - print("2. CONNECTION TYPE") - print(" └─> AUDIO type connection (TYPE_AUDIO)") - print(" └─> Carries spectrogram as BGR image") - print(" └─> Stored in node_audio_dict") - print() - - print("3. CLASSIFICATION NODE (node_classification.py)") - print(" └─> Accepts both IMAGE and AUDIO connections") - print(" └─> Calls get_input_frame(connection_list, node_image_dict, node_audio_dict)") - print(" └─> Retrieves spectrogram from node_audio_dict") - print() - - print("4. 
RESNET50 MODEL (resnet50.py)") - print(" └─> Receives BGR spectrogram image") - print(" └─> Converts BGR → RGB (standard preprocessing)") - print(" └─> Resizes to 224x224") - print(" └─> Runs inference") - print(" └─> Returns top-K classification results") - print() - - print("="*70) - print("Key Changes Made:") - print("="*70) - print() - - print("✓ node/DLNode/node_classification.py") - print(" - Line 209: Changed condition to accept AUDIO connections") - print(" - Before: if connection_type == self.TYPE_IMAGE:") - print(" - After: if connection_type == self.TYPE_IMAGE or connection_type == self.TYPE_AUDIO:") - print() - - print("✓ This minimal change enables:") - print(" - Recognition of AUDIO type connections") - print(" - Proper source node name extraction") - print(" - Seamless integration with existing infrastructure") - print() - - print("="*70) - print("Technical Details:") - print("="*70) - print() - - print("Spectrogram Format:") - print(" - Shape: (height, width, 3) - BGR color image") - print(" - Type: numpy array, uint8") - print(" - Channels: Blue-Green-Red (OpenCV standard)") - print() - - print("ResNet50 Processing:") - print(" - Input: BGR image (any size)") - print(" - Preprocessing: Resize to 224x224, BGR→RGB") - print(" - Output: Top-K ImageNet class predictions") - print() - - print("Benefits:") - print(" ✓ Enables audio-to-visual classification") - print(" ✓ Works with all classification models (MobileNetV3, EfficientNet, ResNet50)") - print(" ✓ No changes needed to model inference code") - print(" ✓ Maintains backward compatibility") - print() - - print("="*70) - print("Example Use Cases:") - print("="*70) - print() - - print("1. Music Genre Classification") - print(" Video → Audio → Spectrogram → ResNet50 → Genre Prediction") - print() - - print("2. Speech Pattern Recognition") - print(" Audio Recording → Spectrogram → Classification → Speech Patterns") - print() - - print("3. Sound Event Detection") - print(" Environmental Audio → Spectrogram → ResNet50 → Event Classification") - print() - - print("="*70) - print("Testing:") - print("="*70) - print() - - print("Run the comprehensive tests:") - print(" $ python tests/test_resnet_spectrogram.py") - print(" $ python tests/test_spectrogram_to_classification.py") - print() - - print("="*70) - print("✓ Feature is fully implemented and tested!") - print("="*70) - - -if __name__ == '__main__': - main() diff --git a/tests/demo_spectrogram_colormap.py b/tests/demo_spectrogram_colormap.py deleted file mode 100644 index 77e4b7f9..00000000 --- a/tests/demo_spectrogram_colormap.py +++ /dev/null @@ -1,151 +0,0 @@ -#!/usr/bin/env python -# -*- coding: utf-8 -*- -""" -Demonstration script for spectrogram colormap feature. - -This script generates synthetic audio signals, computes spectrograms, -applies different colormaps, and saves the results for visual comparison. -""" - -import sys -import os -import numpy as np -import cv2 -import librosa -import tempfile -import soundfile as sf - -# Add parent directory to path -sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) - -from node.InputNode.spectrogram_utils import apply_colormap_to_spectrogram - - -def generate_test_signal(duration=3.0, sr=22050): - """ - Generate a test signal with multiple frequency components. 
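(The docstring's Args section resumes below.) For context, the colormap step this demo exercises can be approximated with plain OpenCV. This is an illustration, not the actual `spectrogram_utils.apply_colormap_to_spectrogram` implementation, though it satisfies the same checks the demo asserts (3-channel RGB, uint8, non-grayscale):

```python
import cv2
import numpy as np

def apply_colormap_sketch(spec_db, cmap_name="INFERNO"):
    """dB-scaled spectrogram -> RGB uint8 image (illustrative only)."""
    cmap = getattr(cv2, f"COLORMAP_{cmap_name}")   # e.g. cv2.COLORMAP_INFERNO
    norm = cv2.normalize(spec_db, None, 0, 255, cv2.NORM_MINMAX)
    bgr = cv2.applyColorMap(norm.astype(np.uint8), cmap)
    return cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)    # RGB, as the demo asserts
```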
- - Args: - duration: Duration in seconds - sr: Sample rate in Hz - - Returns: - np.ndarray: Audio signal - """ - t = np.linspace(0, duration, int(sr * duration)) - - # Create a signal with multiple frequency components - signal = np.zeros_like(t) - - # Add a chirp (frequency sweep) - f_start = 200 # Hz - f_end = 2000 # Hz - chirp = np.sin(2 * np.pi * (f_start + (f_end - f_start) * t / duration) * t) - signal += chirp * 0.5 - - # Add some harmonic tones - for freq in [440, 880, 1320]: # A4 and harmonics - segment_start = int(sr * 0.5) - segment_end = int(sr * 2.0) - signal[segment_start:segment_end] += 0.3 * np.sin(2 * np.pi * freq * t[segment_start:segment_end]) - - # Add some noise - signal += 0.05 * np.random.randn(len(t)) - - # Normalize - signal = signal / np.max(np.abs(signal)) - - return signal - - -def compute_spectrogram(signal, sr=22050): - """ - Compute a spectrogram from audio signal using librosa. - - Args: - signal: Audio signal - sr: Sample rate - - Returns: - np.ndarray: 2D spectrogram in dB scale - """ - # Compute STFT - D = librosa.stft(signal, n_fft=2048, hop_length=512) - - # Convert to dB scale - S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max) - - return S_db - - -def main(): - """Generate and save colored spectrograms with different colormaps.""" - print("Generating test signal...") - signal = generate_test_signal(duration=3.0) - sr = 22050 - - # Save audio file for reference - audio_path = os.path.join(tempfile.gettempdir(), 'demo_signal.wav') - sf.write(audio_path, signal, sr) - print(f"Saved test audio to: {audio_path}") - - print("\nComputing spectrogram...") - spectrogram = compute_spectrogram(signal, sr) - print(f"Spectrogram shape: {spectrogram.shape}") - - # Test different colormaps - colormaps = ['INFERNO', 'VIRIDIS', 'JET', 'MAGMA', 'PLASMA', 'HOT'] - - print("\nApplying colormaps and saving images:") - for cmap_name in colormaps: - print(f" - {cmap_name}...", end=' ') - - # Apply colormap - colored_img = apply_colormap_to_spectrogram( - spectrogram, - method='cv2', - cmap=cmap_name - ) - - # Verify output - assert colored_img.shape[2] == 3, "Should have 3 channels (RGB)" - assert colored_img.dtype == np.uint8, "Should be uint8" - - # Check that channels are not identical (truly colored) - r, g, b = colored_img[..., 0], colored_img[..., 1], colored_img[..., 2] - assert not np.all(r == g), "Red and Green channels should differ" - - # Convert RGB to BGR for saving with OpenCV - colored_bgr = cv2.cvtColor(colored_img, cv2.COLOR_RGB2BGR) - - # Save image - output_path = os.path.join( - tempfile.gettempdir(), - f'demo_spectrogram_{cmap_name.lower()}.png' - ) - cv2.imwrite(output_path, colored_bgr) - - print(f"✓ Saved to {output_path}") - - print("\n" + "="*70) - print("Demo completed successfully!") - print("="*70) - print("\nKey features demonstrated:") - print(" ✓ Synthetic audio signal generation with multiple frequency components") - print(" ✓ Spectrogram computation using librosa STFT") - print(" ✓ Colormap application with multiple OpenCV colormaps") - print(" ✓ RGB output validation (shape, dtype, non-grayscale)") - print(" ✓ Image file export for visual inspection") - print("\nColormap comparison:") - print(" - INFERNO: Perceptually uniform, good for data visualization") - print(" - VIRIDIS: Colorblind-friendly, perceptually uniform") - print(" - JET: Classic rainbow colormap (not perceptually uniform)") - print(" - MAGMA: Similar to INFERNO but with more purple tones") - print(" - PLASMA: Bright, high-contrast perceptually uniform colormap") - 
print(" - HOT: Red-yellow-white progression, good for thermal-like data") - print("\nYou can now visually compare the spectrograms in:") - print(f" {tempfile.gettempdir()}/demo_spectrogram_*.png") - - -if __name__ == '__main__': - main() diff --git a/tests/demo_spectrogram_scrolling.py b/tests/demo_spectrogram_scrolling.py deleted file mode 100644 index dabce7f9..00000000 --- a/tests/demo_spectrogram_scrolling.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python -# -*- coding: utf-8 -*- -""" -Demonstration of spectrogram scrolling behavior -This script simulates the scrolling logic without requiring actual video files -""" - -import numpy as np - - -def simulate_spectrogram_scrolling(): - """Simulate and demonstrate the spectrogram scrolling logic""" - - # Simulate a 5-minute video - fps = 30 - duration = 300 # 5 minutes - sr = 22050 - hop_length = 512 - - # Calculate total spectrogram dimensions - total_samples = duration * sr - total_columns = int(total_samples / hop_length) - - # Display window size - window_width = 240 - half_window = window_width // 2 - - print("=" * 70) - print("SPECTROGRAM SCROLLING DEMONSTRATION") - print("=" * 70) - print(f"\nVideo Configuration:") - print(f" Duration: {duration} seconds ({duration/60:.1f} minutes)") - print(f" Frame rate: {fps} FPS") - print(f" Total frames: {fps * duration}") - print(f"\nAudio Configuration:") - print(f" Sample rate: {sr} Hz") - print(f" Hop length: {hop_length} samples") - print(f"\nSpectrogram Dimensions:") - print(f" Total columns: {total_columns}") - print(f" Display window: {window_width} columns") - print(f" Compression ratio (old): {total_columns / window_width:.1f}:1") - print(f" Compression ratio (new): 1:1 (no compression!)") - print("\n" + "=" * 70) - - # Simulate playback at different positions - test_positions = [ - (0, "Video start"), - (30, "30 seconds in"), - (150, "2.5 minutes in"), - (270, "4.5 minutes in"), - (299, "Near end"), - ] - - for time_seconds, description in test_positions: - current_frame = int(time_seconds * fps) - current_sample = int(time_seconds * sr) - spectrogram_col = int(current_sample / hop_length) - - # Calculate window boundaries (same logic as in node_video.py) - start_col = max(0, spectrogram_col - half_window) - end_col = min(total_columns, start_col + window_width) - - # Adjust start if at the end - if end_col == total_columns: - start_col = max(0, end_col - window_width) - - # Calculate indicator position within window - indicator_col = spectrogram_col - start_col - - # Check if padding is needed - window_cols = end_col - start_col - needs_padding = window_cols < window_width - pad_width = window_width - window_cols if needs_padding else 0 - - print(f"\n{description} (t={time_seconds}s, frame {current_frame}):") - print(f" Spectrogram position: column {spectrogram_col}/{total_columns}") - print(f" Window range: [{start_col} - {end_col}]") - print(f" Indicator position in window: column {indicator_col}") - print(f" Padding needed: {'Yes' if needs_padding else 'No'}", end="") - if needs_padding: - print(f" ({pad_width} columns on {'left' if start_col > 0 else 'right'})") - else: - print() - - # Visual representation - progress_pct = (spectrogram_col / total_columns) * 100 - window_pct = (start_col / total_columns) * 100 - print(f" Progress: {progress_pct:.1f}%") - - # Simple ASCII visualization - bar_width = 60 - bar_pos = int((spectrogram_col / total_columns) * bar_width) - window_start = int((start_col / total_columns) * bar_width) - window_end = min(bar_width, int((end_col / 
-
-        # Calculate indicator position within window
-        indicator_col = spectrogram_col - start_col
-
-        # Check if padding is needed
-        window_cols = end_col - start_col
-        needs_padding = window_cols < window_width
-        pad_width = window_width - window_cols if needs_padding else 0
-
-        print(f"\n{description} (t={time_seconds}s, frame {current_frame}):")
-        print(f"  Spectrogram position: column {spectrogram_col}/{total_columns}")
-        print(f"  Window range: [{start_col} - {end_col}]")
-        print(f"  Indicator position in window: column {indicator_col}")
-        print(f"  Padding needed: {'Yes' if needs_padding else 'No'}", end="")
-        if needs_padding:
-            print(f" ({pad_width} columns on {'left' if start_col > 0 else 'right'})")
-        else:
-            print()
-
-        # Visual representation
-        progress_pct = (spectrogram_col / total_columns) * 100
-        window_pct = (start_col / total_columns) * 100
-        print(f"  Progress: {progress_pct:.1f}%")
-
-        # Simple ASCII visualization
-        bar_width = 60
-        bar_pos = int((spectrogram_col / total_columns) * bar_width)
-        window_start = int((start_col / total_columns) * bar_width)
-        window_end = min(bar_width, int((end_col / total_columns) * bar_width))
-
-        bar = ['-'] * bar_width
-        for i in range(window_start, window_end):
-            bar[i] = '█'
-        if 0 <= bar_pos < bar_width:
-            bar[bar_pos] = '|'
-
-        print(f"  Visual: [{''.join(bar)}]")
-        print(f"           {' ' * window_start}^{' ' * max(0, bar_pos - window_start)}^")
-        spacing = max(0, bar_pos - window_start - 12)
-        print(f"           {' ' * window_start}Window start{' ' * spacing}Indicator")
-
-    print("\n" + "=" * 70)
-    print("\nKEY FEATURES:")
-    print("  ✓ Window follows playback position smoothly")
-    print("  ✓ Indicator stays centered (except at edges)")
-    print("  ✓ Full resolution: 1:1 pixel mapping, no compression")
-    print("  ✓ Frame-by-frame updates for smooth scrolling")
-    print("=" * 70)
-
-
-if __name__ == '__main__':
-    simulate_spectrogram_scrolling()
diff --git a/tests/dummy_servers/IMPLEMENTATION_SUMMARY.md b/tests/dummy_servers/IMPLEMENTATION_SUMMARY.md
deleted file mode 100644
index c84a94cd..00000000
--- a/tests/dummy_servers/IMPLEMENTATION_SUMMARY.md
+++ /dev/null
@@ -1,288 +0,0 @@
-# Test Servers Implementation Summary
-
-## Overview
-
-Created a comprehensive testing infrastructure with dummy servers for API, WebSocket, and WebRTC input nodes in CV_Studio.
-
-## Files Created
-
-### Core Server Files (3 files)
-1. **api_server.py** (3,978 bytes)
-   - HTTP REST API server
-   - Endpoints: `/image`, `/float`, `/status`
-   - Serves random PNG images (640x480) and float values (0-100)
-
-2. **websocket_server.py** (4,635 bytes)
-   - WebSocket streaming server
-   - Supports both image and float streaming
-   - Configurable data type and interval
-   - Images: 320x240 PNG (base64 encoded)
-   - Floats: JSON with value and timestamp
-
-3. **webrtc_server.py** (5,714 bytes)
-   - WebRTC peer-to-peer server
-   - Supports video streaming and data channels
-   - Requires aiohttp and aiortc libraries
-   - Implements signaling via HTTP POST /offer endpoint
-
-### Utility Scripts (4 files)
-4. **run_servers.py** (10,417 bytes)
-   - Master launcher for all servers
-   - Supports selective server launching
-   - Built-in basic testing capability
-   - Process management and monitoring
-
-5. **test_servers.py** (10,152 bytes)
-   - Comprehensive integration test suite
-   - Tests all server endpoints and functionality
-   - Supports quick test mode and full unittest mode
-   - Automatic server lifecycle management
-
-6. **demo.py** (9,046 bytes)
-   - Interactive demonstration script
-   - Shows all servers in action
-   - Displays received data statistics
-   - Saves example images to /tmp/
-
-7. **launch.sh** (1,086 bytes)
-   - Bash helper script for easy launching
-   - Interactive menu for server selection
-   - Shortcuts for common tasks
-
-### Documentation and Config (3 files)
-8. **README.md** (6,499 bytes)
-   - Comprehensive usage documentation
-   - API references for all servers
-   - Examples and troubleshooting guide
-   - Integration instructions for CV_Studio
-
-9. **requirements.txt** (320 bytes)
-   - Optional dependencies list
-   - Separate from main project requirements
-   - Includes numpy, Pillow, websockets, aiohttp, aiortc
-
-10. **__init__.py** (111 bytes)
-    - Python package initialization
-
-## Features Implemented
-
-### API Server
-- ✅ GET /status - Server status and endpoint list
-- ✅ GET /float - Random float values with timestamp
-- ✅ GET /image - Random PNG images (640x480)
-- ✅ CORS headers for cross-origin requests
-- ✅ Proper HTTP status codes and error handling
-
-### WebSocket Server
-- ✅ Support for image streaming (320x240 PNG, base64)
-- ✅ Support for float streaming
-- ✅ Configurable interval between messages
-- ✅ Welcome message on connection
-- ✅ Proper connection management
-- ✅ JSON message format
-
-### WebRTC Server
-- ✅ WebRTC signaling server
-- ✅ Video track with random frames
-- ✅ Data channel for float values
-- ✅ Connection state management
-- ✅ HTTP endpoints for offer/answer exchange
-
-### Test Infrastructure
-- ✅ Integration tests for API endpoints
-- ✅ WebSocket connection and streaming tests
-- ✅ Multiple concurrent request tests
-- ✅ Import validation tests
-- ✅ Quick test mode for rapid verification
-- ✅ Full unittest suite with automatic server management
-
-### Demo and Usability
-- ✅ Interactive demonstration script
-- ✅ Statistical analysis of received data
-- ✅ Image saving and validation
-- ✅ Launch helper script with menu
-- ✅ Comprehensive README with examples
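-
-### Example: Consuming the Image Stream
-
-A minimal client sketch for a quick manual check of the WebSocket server. It
-assumes the default image port (8765) and the JSON message format documented
-under Data Formats below; `grab_images` is an illustrative helper name, and
-messages whose `type` is not `image` (such as the welcome message) are
-skipped:
-
-```python
-import asyncio
-import base64
-import json
-
-import websockets  # pip install websockets
-
-
-async def grab_images(uri="ws://localhost:8765", count=3):
-    """Save a few streamed frames as PNG files."""
-    async with websockets.connect(uri) as ws:
-        saved = 0
-        while saved < count:
-            msg = json.loads(await ws.recv())
-            if msg.get("type") != "image":  # e.g. the welcome message
-                continue
-            with open(f"/tmp/ws_image_{saved}.png", "wb") as f:
-                f.write(base64.b64decode(msg["data"]))
-            saved += 1
-
-
-asyncio.run(grab_images())
-```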
-
-## Testing Results
-
-### API Server Tests
-```
-✓ Status endpoint returns correct format
-✓ Float endpoint returns values in range [0, 100]
-✓ Image endpoint returns valid PNG files
-✓ Multiple concurrent requests work correctly
-✓ Images are approximately 900KB (640x480 PNG)
-```
-
-### WebSocket Server Tests
-```
-✓ Connection establishes successfully
-✓ Welcome message received correctly
-✓ Float values stream at configured interval
-✓ Image data streams successfully (320x240 PNG)
-✓ Images are approximately 230KB (320x240 PNG)
-✓ JSON format is valid and contains expected fields
-```
-
-### Demo Script Output
-```
-✓ All servers start successfully
-✓ API server responds to all endpoints
-✓ 5 random float samples retrieved and analyzed
-✓ Random images retrieved and saved
-✓ WebSocket float stream received (10 values)
-✓ WebSocket image stream received (3 images)
-✓ Statistics calculated correctly
-✓ All servers stop gracefully
-```
-
-## Usage Examples
-
-### Quick Start
-```bash
-# Install dependencies
-pip install numpy Pillow websockets
-
-# Run the demo
-cd tests/dummy_servers
-python demo.py
-```
-
-### Individual Server Usage
-```bash
-# Start API server
-python api_server.py --port 8080
-
-# Start WebSocket server (images)
-python websocket_server.py --type image --port 8765
-
-# Start WebSocket server (floats)
-python websocket_server.py --type float --port 8766 --interval 0.5
-```
-
-### Launch All Servers
-```bash
-# Interactive menu
-./launch.sh
-
-# Command line
-python run_servers.py
-python run_servers.py --test  # With testing
-```
-
-### Run Tests
-```bash
-# Quick test (API only)
-python test_servers.py --quick
-
-# Full test suite
-python test_servers.py
-```
-
-## Integration with CV_Studio
-
-The servers can be used to test CV_Studio input nodes (a minimal manual check is sketched after this list):
-
-1. **API Node**: Configure to use:
-   - `http://localhost:8080/image` for images
-   - `http://localhost:8080/float` for floats
-
-2. **WebSocket Node**: Configure to connect to:
-   - `ws://localhost:8765` for image stream
-   - `ws://localhost:8766` for float stream
-
-3. **WebRTC Node**: Configure to connect to:
-   - `http://localhost:8081` for signaling
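-
-Before wiring up the nodes, it can be worth confirming the API endpoints by
-hand. A minimal sketch using only the standard library, assuming the default
-port and that `/status` returns JSON (the `/float` payload is documented
-under Data Formats below):
-
-```python
-import json
-from urllib.request import urlopen
-
-BASE = "http://localhost:8080"  # default api_server.py port
-
-# Confirm the server is up and list the advertised endpoints.
-with urlopen(f"{BASE}/status") as resp:
-    print(json.loads(resp.read()))
-
-# Fetch one float sample; the payload carries "value" and "timestamp".
-with urlopen(f"{BASE}/float") as resp:
-    sample = json.loads(resp.read())
-print(f"value={sample['value']:.2f} at t={sample['timestamp']}")
-
-# Fetch one random 640x480 PNG and save it for inspection.
-with urlopen(f"{BASE}/image") as resp, open("/tmp/api_node_check.png", "wb") as f:
-    f.write(resp.read())
-```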
-
-## Technical Details
-
-### Dependencies
-- **Required**: Python 3.7+, numpy, Pillow
-- **WebSocket**: websockets >= 10.0
-- **WebRTC**: aiohttp >= 3.8.0, aiortc >= 1.3.0
-- **Testing**: pytest >= 7.0.0
-
-### Port Configuration
-- API Server: 8080 (default)
-- WebSocket Image: 8765 (default)
-- WebSocket Float: 8766 (default)
-- WebRTC: 8081 (default)
-
-All ports are configurable via command-line arguments.
-
-### Data Formats
-
-**API Float Response:**
-```json
-{
-  "value": 42.42,
-  "timestamp": 1234567890.123
-}
-```
-
-**WebSocket Image Message:**
-```json
-{
-  "type": "image",
-  "data": "base64_encoded_png...",
-  "format": "png",
-  "width": 320,
-  "height": 240,
-  "timestamp": 1234567890.123
-}
-```
-
-**WebSocket Float Message:**
-```json
-{
-  "type": "float",
-  "value": 42.42,
-  "timestamp": 1234567890.123
-}
-```
-
-## Known Limitations
-
-1. **WebRTC Server**: Requires additional dependencies (aiohttp, aiortc) that may not be available in all environments
-2. **Image Size**: WebSocket images are limited to 320x240 to avoid message size limits
-3. **No Authentication**: Servers do not implement authentication (for testing only)
-4. **Single Client**: The WebRTC server supports only one peer connection at a time
-5. **No Persistence**: All data is generated randomly; nothing is stored
-
-## Future Enhancements
-
-- [ ] Add authentication support
-- [ ] Implement server configuration files
-- [ ] Add more data types (video streams, audio)
-- [ ] Create Docker containers for easy deployment
-- [ ] Add performance metrics and monitoring
-- [ ] Implement data replay from files
-- [ ] Add SSL/TLS support
-
-## Files Summary
-
-| File | Lines | Size | Purpose |
-|------|-------|------|---------|
-| api_server.py | 130 | 3.9KB | HTTP REST API |
-| websocket_server.py | 134 | 4.6KB | WebSocket streaming |
-| webrtc_server.py | 172 | 5.6KB | WebRTC P2P |
-| run_servers.py | 290 | 10KB | Server launcher |
-| test_servers.py | 282 | 10KB | Integration tests |
-| demo.py | 257 | 8.9KB | Interactive demo |
-| launch.sh | 49 | 1.1KB | Bash helper |
-| README.md | 241 | 6.9KB | Documentation |
-| requirements.txt | 14 | 320B | Dependencies |
-| __init__.py | 3 | 111B | Package init |
-| **TOTAL** | **1,572** | **51KB** | **10 files** |
-
-## Conclusion
-
-Successfully implemented a complete testing infrastructure for CV_Studio input nodes with:
-- ✅ 3 fully functional dummy servers (API, WebSocket, WebRTC)
-- ✅ Comprehensive test suite with integration tests
-- ✅ Interactive demonstration script
-- ✅ Helper utilities for easy server management
-- ✅ Complete documentation with examples
-- ✅ Verified functionality through testing
-
-All servers are intended for local testing of CV_Studio input nodes and can be easily extended or modified as needed.