[WIP] Fix tensor conversion issue in preprocessing method #4

Merged
NickScherbakov merged 1 commit into main from copilot/fix-preprocess-frame-error
Dec 20, 2025

Copilot AI commented Dec 20, 2025

Fix NumPy 2.x Compatibility in _preprocess_frame

Plan:

  • Replace self.transform() usage with manual tensor conversion in _preprocess_frame() method
  • Remove dependency on torchvision's ToTensor() transform
  • Implement manual conversion: numpy → float32 → normalize → CHW format → torch tensor
  • Apply ImageNet normalization manually (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
  • Remove unused self.transform initialization from __init__ method
  • Remove unused PIL Image import (no longer needed)
  • Run existing test to verify the fix works
  • Ensure no other methods are affected by the change
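
The conversion steps in the plan can be sanity-checked without torch or OpenCV. The following is a minimal NumPy-only sketch, not the code from this PR: the function name `preprocess_frame_np` is hypothetical, the BGR to RGB swap is done by reversing the channel axis instead of calling `cv2.cvtColor`, and the final `torch.from_numpy` step is left out so the sketch runs with NumPy alone.

```python
import numpy as np

# ImageNet statistics quoted in the plan (RGB order), shaped to broadcast over CHW.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)


def preprocess_frame_np(frame_bgr: np.ndarray) -> np.ndarray:
    """Hypothetical NumPy-only stand-in for _preprocess_frame (illustration only)."""
    # BGR -> RGB by reversing the last axis (assumed substitute for cv2.cvtColor).
    rgb = frame_bgr[..., ::-1]
    # uint8 values in [0, 255] -> float32 values in [0, 1].
    img = rgb.astype(np.float32) / 255.0
    # HWC -> CHW, copied into a contiguous buffer.
    chw = np.ascontiguousarray(img.transpose(2, 0, 1))
    # Per-channel ImageNet normalization.
    return (chw - IMAGENET_MEAN) / IMAGENET_STD


if __name__ == "__main__":
    # A small synthetic BGR frame that is pure red (index 2 in BGR order).
    frame = np.zeros((4, 6, 3), dtype=np.uint8)
    frame[..., 2] = 255
    out = preprocess_frame_np(frame)
    print(out.shape, out.dtype)
```

For a pure red input, the first output channel should equal (1 - 0.485) / 0.229 everywhere, which gives a quick check that the channel order and the normalization constants line up.
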
Original prompt

The problem is NOT SOLVED

The previous PR (#2) added a conversion to PIL Image, but the error persists:

TypeError: expected np.ndarray (got numpy.ndarray)

The problem is that torchvision.transforms.functional.to_tensor(), called inside ToTensor(), is incompatible with NumPy 2.x. Even a PIL Image is converted back to a numpy array inside torchvision, and that is where the error occurs.

Current code (DOES NOT WORK)

def _preprocess_frame(self, frame):
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    pil_image = Image.fromarray(frame_rgb)
    input_tensor = self.transform(pil_image)  # <-- ERROR HERE
    return input_tensor

Required solution

Completely replace the use of self.transform() with manual conversion in the _preprocess_frame() method in src/segmentation.py:

def _preprocess_frame(self, frame):
    """
    Preprocess a frame for model input.
    
    Args:
        frame: Frame as numpy array (BGR format from OpenCV)
        
    Returns:
        Preprocessed tensor
    """
    # Convert BGR to RGB
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    
    # Manual tensor conversion (bypasses torchvision ToTensor for NumPy 2.x compatibility)
    img = frame_rgb.astype('float32') / 255.0
    img = img.transpose((2, 0, 1))  # HWC -> CHW format
    input_tensor = torch.from_numpy(img.copy()).float()
    
    # Apply ImageNet normalization manually
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    input_tensor = (input_tensor - mean) / std

    return input_tensor

Important

  1. Do NOT use self.transform() at all
  2. Do NOT use ToTensor() from torchvision
  3. Do the entire conversion manually with numpy and torch
  4. Use img.copy() to avoid memory problems
  5. Keep the ImageNet normalization (mean/std) as in the original
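
Point 4 does not say which memory problems it has in mind. One plausible reading, offered here as an assumption, is that `transpose` returns a non-contiguous view that still shares its buffer with the source array, while `.copy()` produces an independent, C-contiguous array. A minimal NumPy sketch of that difference:

```python
import numpy as np

# A small HWC array standing in for a normalized frame (illustrative shape).
hwc = np.zeros((4, 6, 3), dtype=np.float32)

# transpose only rearranges strides: no data is copied.
chw_view = hwc.transpose(2, 0, 1)

# ndarray.copy() allocates a new C-ordered buffer by default.
chw_copy = chw_view.copy()

print(chw_view.flags["C_CONTIGUOUS"], np.shares_memory(chw_view, hwc))
print(chw_copy.flags["C_CONTIGUOUS"], np.shares_memory(chw_copy, hwc))
```

Since `torch.from_numpy` shares memory with the array it is given, passing the copy keeps the resulting tensor from aliasing the intermediate frame buffer.
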

This pull request was created from Copilot chat.


@NickScherbakov NickScherbakov marked this pull request as ready for review December 20, 2025 21:53
Copilot AI review requested due to automatic review settings December 20, 2025 21:53
Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.


@NickScherbakov NickScherbakov merged commit f9c0f7b into main Dec 20, 2025
1 check failed
Copilot AI requested a review from NickScherbakov December 20, 2025 21:53