HN Reader
Implementing a Fast Tensor Core Matmul on the Ada Architecture
2
skidrow
1
7/18/2025, 8:26:22 AM
spatters.ca ↗
Comments (1)
jhlee525
· 8h ago
This is incredibly useful. Thanks for making the kernels public.
I'm curious if anyone has tried generalizing this to batched matmuls or to sparse inputs on Ada?