Show HN: Built a 9MB GPU kernel achieving 43M ops/sec with deterministic replay

1 TacosInMyPocket 0 6/8/2025, 5:46:54 PM
I've developed a custom GPU kernel that handles 40+ million parallel agent operations per second while producing results that appear to be deterministic across runs, something usually considered very difficult with GPU parallel processing.
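For context on why run-to-run determinism is hard on a GPU: floating-point addition is not associative, so when threads accumulate into a shared value in whatever order the scheduler happens to run them, the result can change between runs. The sketch below is a CPU-side illustration in Python, not the kernel from this post; the function names and the (agent_id, value) layout are assumptions made up for the example. It contrasts an order-dependent accumulation with one common mitigation, writing each contribution to a slot keyed by agent ID and reducing the slots in a fixed pairwise order.

```python
import struct


def bits(x: float) -> bytes:
    """Exact float64 bit pattern, used for bitwise comparison."""
    return struct.pack("<d", x)


def unordered_sum(arrivals):
    """Mimics atomic adds: values are accumulated in arrival order."""
    total = 0.0
    for _agent_id, value in arrivals:
        total += value
    return total


def fixed_order_sum(arrivals):
    """Places each value in a slot keyed by agent ID, then reduces the
    slots with a fixed pairwise tree, so arrival order does not matter."""
    slots = [0.0] * len(arrivals)
    for agent_id, value in arrivals:
        slots[agent_id] = value
    while len(slots) > 1:
        if len(slots) % 2:
            slots.append(0.0)  # pad so every element has a partner
        slots = [slots[i] + slots[i + 1] for i in range(0, len(slots), 2)]
    return slots[0]


# Two hypothetical "runs" where the same four agents finish in different orders.
run_a = [(0, 1e16), (1, 1.0), (2, -1e16), (3, 1.0)]
run_b = [(0, 1e16), (2, -1e16), (1, 1.0), (3, 1.0)]

print("unordered:  ", unordered_sum(run_a), unordered_sum(run_b))
print("fixed order:", fixed_order_sum(run_a), fixed_order_sum(run_b))
```

The fixed-order version is reproducible bit for bit, but reproducible is not the same as exact: both variants still round, the second one just rounds the same way every time.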

Performance demo: https://youtu.be/Y3Jg8RCZ65c

Determinism proof: https://youtu.be/fk7NMNGcfSY

The entire runtime is under 10MB. Open to discussing potential applications!

autoscriptlabs@gmail.com
