Recurrence-Duplication: Deterministic Parallelisation of Non-Affine Scalar Loops

1 top256 1 6/4/2025, 6:14:30 PM deviantabstraction.com ↗

Comments (1)

top256 · 2d ago
TL;DR: A loop that carries any pure scalar state can be strip-mined across p threads by having each thread privately replay ≤ p(p-1)/2 “warm-up” updates before its first public iteration. It needs no closed-form skip-ahead and no speculation, and it costs only a few extra machine instructions in code-gen.
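
To make the idea concrete, here is a minimal sketch of one possible reading of the scheme, not necessarily the exact construction the post has in mind. It assumes a cyclic distribution (thread t owns iterations t, t+p, t+2p, ...), and the names `step`, `body`, `worker`, `sequential`, and `parallel` are hypothetical. Each thread keeps a private copy of the scalar, replays t warm-up updates to reach its first iteration, and then replays p updates between its public iterations. With this layout the warm-up is t ≤ p-1 updates per thread, which stays within the bound quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def step(s):
    # Hypothetical non-affine scalar update with no cheap closed-form skip-ahead.
    return (s * s + 12345) % 1_000_003


def body(i, s):
    # Hypothetical loop body: a pure function of the index and the carried scalar.
    return (i * 31 + s) % 97


def sequential(n, s0):
    # Reference loop: the scalar s is carried from one iteration to the next.
    out, s = [], s0
    for i in range(n):
        out.append(body(i, s))
        s = step(s)
    return out


def worker(t, p, n, s0, out):
    s = s0  # private copy of the carried scalar
    # Warm-up: privately replay the t updates that precede iteration t.
    for _ in range(t):
        s = step(s)
    for i in range(t, n, p):
        out[i] = body(i, s)  # public iteration, writes only its own slot
        # Replay this iteration's update plus the p-1 updates of skipped iterations.
        for _ in range(p):
            s = step(s)


def parallel(n, s0, p):
    out = [None] * n
    with ThreadPoolExecutor(max_workers=p) as pool:
        futures = [pool.submit(worker, t, p, n, s0, out) for t in range(p)]
        for f in futures:
            f.result()  # re-raise any worker exception
    return out
```

Every thread recomputes the full chain of scalar updates, so this only pays off when `body` is much more expensive than `step`. In CPython the global interpreter lock also prevents a real speed-up from threads here; the sketch is meant to show that the result is deterministic and matches the sequential loop, for example `parallel(50, 7, 4) == sequential(50, 7)`.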