NHacker Next
- new
- past
- show
- ask
- show
- jobs
- submit
login
67% less kernel code is the more interesting number here — Hopper's async capabilities have been underutilized largely because the programming model is painful. Curious how it handles cases where compute and memory phases aren't cleanly separable.
This seems like a better version of CUDA, for Hopper GPUs?
[dead]
Rendered at 14:24:41 GMT+0000 (Coordinated Universal Time) with Vercel.