Implementing a Fast Tensor Core Matmul on the Ada Architecture

2 skidrow 1 7/18/2025, 8:26:22 AM spatters.ca

Comments (1)

jhlee525 · 8h ago
This is incredibly useful. Thanks for making the kernels public.

Has anyone tried generalizing this to batched matmuls or to sparse inputs on Ada?