Skip to content

Latest commit

 

History

History
33 lines (14 loc) · 921 Bytes

File metadata and controls

33 lines (14 loc) · 921 Bytes

https://www.youtube.com/watch?v=1FI135IJ4vw&list=PLAwxTw4SYaPnFKojVQrmyOGFCqHTxfdv2&index=21

TODO

presentation

什么是Block cluster

什么是cooperative group.

nvswitch 为啥快?

A100 有nvswitch吗? 速度是多少? CPU 到GPU速度是多少?

H100 CPU 到GPU 带宽是900GB/s吗? 每个GPU都有一个CPU吗? 还是只有DGX machine 有? 还是只有grace hopper架构有?

CPU和GPU是有同一个虚拟内存吗? host内存和显存可以放在空一个地址空间吗?

https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/

NVLink Switch System technology is not currently available with H100 systems, but systems and availability will be announced. 现在发布了吗?

新的FP8 data type好像不太行.

一个SM同一个时刻可以执行多个warp,这取决于warp scheduler的数量。

数据搬移: 900 GB/s of total bandwidth, 7x faster than PCIe Gen5