Recurrence-Duplication: Deterministic Parallelisation of Non-Affine Scalar Loops
top256 · 6/4/2025, 6:14:30 PM · deviantabstraction.com
Comments (1)
top256 · 2d ago
TL;DR: A loop that carries any pure scalar state can be strip-mined across p
threads by having each thread privately replay ≤ p(p-1)/2 “warm-up” updates
before its first public iteration. It needs no closed-form skip-ahead and no
speculation, and it costs only a few extra machine instructions in code-gen.
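
A minimal sketch of how this might look, under assumptions that are mine and not from the post: iterations are dealt out cyclically (thread t takes t, t+p, t+2p, ...), the scalar update `step` is cheap and pure, and the expensive part is a separate `body`. `step`, `body`, `worker`, `sequential` and `parallel` are all hypothetical names. In this reading thread t replays t warm-up updates, so the warm-ups sum to p(p-1)/2 across all threads, which is one way to interpret the bound above.

```python
from concurrent.futures import ThreadPoolExecutor


def step(s):
    # Hypothetical non-affine scalar update; pure, depends only on s.
    return (s * s + 1) % 1000003


def body(i, s):
    # Hypothetical stand-in for the expensive per-iteration work.
    return (i + 1) * s


def sequential(n, s0):
    # Reference loop: the scalar s is carried from one iteration to the next.
    out, s = [], s0
    for i in range(n):
        out.append(body(i, s))
        s = step(s)
    return out


def worker(t, p, n, s0, out):
    s = s0
    # Warm-up: privately replay the first t updates (assumed cyclic schedule).
    for _ in range(min(t, n)):
        s = step(s)
    # Public iterations t, t+p, t+2p, ...
    for i in range(t, n, p):
        out[i] = body(i, s)
        # Privately replay p updates to reach the state of iteration i + p.
        for _ in range(p):
            s = step(s)


def parallel(n, s0, p):
    out = [None] * n  # each thread writes a disjoint set of slots
    with ThreadPoolExecutor(max_workers=p) as pool:
        futures = [pool.submit(worker, t, p, n, s0, out) for t in range(p)]
        for f in futures:
            f.result()  # re-raise any worker exception
    return out


if __name__ == "__main__":
    print(parallel(20, 7, 4) == sequential(20, 7))
```

Every thread recomputes the scalar chain on its own, so nothing is shared or guessed and the result does not depend on scheduling. The sketch only shows that determinism; CPython threads will not give a real speedup here, and whether the redundant `step` calls pay off depends on `body` being much more expensive than `step`.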