@yondonfu
Contributor

Summary

  • Adds automatic binary patching of torch_python.dll at Python startup to fix OverflowError in torch.compile with reduce-overhead mode on Windows
  • Extracts shared find_package_path() utility to _utils.py for reuse between cudnn and static_cuda_launcher patches
  • Patches the argument format specifier from l (signed long, which is 32 bits on Windows) to K (unsigned long long, 64 bits) so that 64-bit CUDA stream handles parse without overflowing
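
The two pieces described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the body of find_package_path() is an assumption (only its name appears in the summary), patch_bytes() is a hypothetical helper, and the OLD/NEW byte patterns are placeholders, since the real format string inside torch_python.dll is not shown here.

```python
import importlib.util
from pathlib import Path
from typing import Optional, Tuple


def find_package_path(name: str) -> Optional[Path]:
    """Locate an installed package's directory without importing it.

    Assumed behavior; the real helper in _utils.py may differ.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a plain module rather than a package
    return Path(next(iter(spec.submodule_search_locations)))


def patch_bytes(data: bytes, old: bytes, new: bytes) -> Tuple[bytes, str]:
    """Swap one byte pattern for another of equal length (hypothetical helper).

    Returns the (possibly unchanged) data and a status string.
    """
    if len(old) != len(new):
        raise ValueError("patch must not change the file size")
    if data.count(old) == 1:
        return data.replace(old, new), "patched"
    if old not in data and new in data:
        return data, "already patched"
    return data, "pattern not found or ambiguous"


# Placeholder patterns: only the final specifier changes from l to K.
OLD = b"demo:l\x00"
NEW = b"demo:K\x00"

blob = b"\x00header\x00demo:l\x00trailer"
patched, first_status = patch_bytes(blob, OLD, NEW)
_, second_status = patch_bytes(patched, OLD, NEW)
json_dir = find_package_path("json")  # stdlib package used as a stand-in for torch
```

Keeping the replacement the same length as the original leaves every offset in the binary unchanged, and refusing to patch when the pattern is missing or appears more than once avoids touching an unexpected build of the DLL.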

Fixes: pytorch/pytorch#162430

Test plan

  • Fresh uv sync installs patches.pth to site-packages
  • First Python startup automatically patches torch_python.dll
  • Subsequent startups detect "already patched" and skip
  • Server starts without errors
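
For context on the startup behavior in the steps above: Python's site module executes any line in a .pth file that begins with "import", which is what lets a file in site-packages run code on every interpreter start. The sketch below demonstrates the mechanism in a temporary directory. The file name demo_patches.pth and the environment variable are made up for illustration, and the actual contents of patches.pth are not shown in this description.

```python
import os
import site
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A .pth line starting with "import" is executed, not treated as a path.
    Path(tmp, "demo_patches.pth").write_text(
        "import os; os.environ['DEMO_PTH_RAN'] = '1'\n"
    )
    # Process the .pth files in tmp the same way startup does for site-packages.
    site.addsitedir(tmp)

ran = os.environ.get("DEMO_PTH_RAN")
```

Because such a hook runs on every startup, the patch step it triggers needs to be cheap and idempotent, which is why the "already patched" check in the second and third steps matters.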

🤖 Generated with Claude Code

Fixes unsigned long issue on Windows when using torch.compile

Signed-off-by: Yondon Fu <yondon.fu@gmail.com>