feat: device registration #103
ee4e27b to 2e2b4c3
c60d854 to ab3274a
f0b0494 to 2b5ad33
…tics
- Split internal implementation headers into a separate include group
- Drop redundant explicit default initialization for Device
- Add `impl` suffix to CUDA guard implementation files
- Unify Arange initialization via DeviceGuardImpl
eaacac4 to 7a43321
1f1247f to e2c91ef
        break;
    }
}
impl->MemcpyAsync(tensor->DataPtr(), buffer.data(), num_elements * sizeof(float),
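I just realized a problem: `buffer.data()` here lives on the CPU side, so with `cudaMemcpyAsync` it could be freed before the actual memcpy takes place. Would it be more appropriate to synchronize explicitly for both H2D and D2H memcpy?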
A separate PR will be opened to fix the host buffer lifetime issue for D2H/H2D copies across the framework in one place.
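The lifetime concern raised above can be illustrated with a minimal sketch. It assumes a hypothetical `FakeStream` that defers copies until `Synchronize()` is called; `FakeStream`, `FillFromHost`, and `Pending` are illustrative names, not part of this PR, and the sketch does not model the real CUDA runtime's behavior for pageable host memory.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical stand-in for a device stream: copies are queued and only
// executed when Synchronize() is called, mimicking asynchronous semantics.
class FakeStream {
public:
    // Enqueue a copy; `src` must stay valid until Synchronize() has run it.
    void MemcpyAsync(void *dst, const void *src, std::size_t bytes) {
        pending_.push([=] { std::memcpy(dst, src, bytes); });
    }

    // Execute every queued copy, as a stream synchronize would wait for them.
    void Synchronize() {
        while (!pending_.empty()) {
            pending_.front()();
            pending_.pop();
        }
    }

    // Number of copies that have been enqueued but not yet executed.
    std::size_t Pending() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};

// Fill `dst` from a temporary host buffer. The explicit Synchronize() before
// returning keeps `buffer` alive until the queued copy has actually run.
inline void FillFromHost(FakeStream &stream, float *dst, std::size_t num_elements, float value) {
    std::vector<float> buffer(num_elements, value);
    stream.MemcpyAsync(dst, buffer.data(), num_elements * sizeof(float));
    stream.Synchronize();  // without this, `buffer` is destroyed before the copy runs
}
```

Synchronizing inside the helper trades some overlap for safety; an alternative would be to keep the host buffer owned by something that outlives the copy.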
976a497 to 6613c02
6613c02 to a6e41fb
- Drop legacy hardware-specific branching
- Convert DeviceGuardImpl base methods to fatal-only fallbacks
- Explicitly implement supported CPU runtime behaviors
- Validate CUDA device type and index bounds in CudaGuardImpl
- Widen DeviceCount return type to prevent truncation
a6e41fb to 4f0fa84
chen2021673 left a comment
LGTM

Device registration design doc: https://gxtctab8no8.feishu.cn/docx/F0CzdkVXCoaxRgxYc3AcFewtnOc?from=from_copylink
Device is now a simple type that only maintains type/index; all runtime methods are provided by DeviceGuard/DeviceGuardImpl.
Main changes in this PR:
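A rough sketch of the split described above, under stated assumptions: `Device` and `DeviceGuardImpl` are named in this PR, but the member layout, `DeviceType`, `CpuGuardImpl`, `GuardRegistry`, and every method signature below are hypothetical and may not match the actual code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <memory>
#include <utility>

// Hypothetical device kinds; kCount is only used to size the registry.
enum class DeviceType : std::int8_t { kCPU = 0, kCUDA = 1, kCount = 2 };

// Plain value type: only a type and an index, no runtime behavior.
struct Device {
    DeviceType type = DeviceType::kCPU;
    std::int8_t index = 0;
};

// Base interface: operations a backend does not override abort the process.
class DeviceGuardImpl {
public:
    virtual ~DeviceGuardImpl() = default;
    virtual DeviceType Type() const = 0;
    virtual int DeviceCount() const { Fatal("DeviceCount"); }
    virtual void SetDevice(Device) const { Fatal("SetDevice"); }

protected:
    [[noreturn]] static void Fatal(const char *op) {
        std::cerr << op << " is not supported by this backend\n";
        std::abort();
    }
};

// Hypothetical CPU backend that implements the supported behaviors explicitly.
class CpuGuardImpl : public DeviceGuardImpl {
public:
    DeviceType Type() const override { return DeviceType::kCPU; }
    int DeviceCount() const override { return 1; }
    void SetDevice(Device) const override {}  // single CPU device: nothing to switch
};

// Hypothetical registry mapping each device type to its implementation.
class GuardRegistry {
public:
    static GuardRegistry &Instance() {
        static GuardRegistry registry;
        return registry;
    }

    void Register(std::unique_ptr<DeviceGuardImpl> impl) {
        const auto slot = static_cast<std::size_t>(impl->Type());
        impls_[slot] = std::move(impl);
    }

    // Returns nullptr when no backend has been registered for `type`.
    DeviceGuardImpl *Get(DeviceType type) const {
        return impls_[static_cast<std::size_t>(type)].get();
    }

private:
    std::array<std::unique_ptr<DeviceGuardImpl>, static_cast<std::size_t>(DeviceType::kCount)> impls_;
};
```

Callers would look up the implementation from `device.type` and invoke runtime operations through it, so `Device` itself stays trivially copyable.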