A subtle bug with Go's errgroup

13 broken_broken_ 6 8/9/2025, 12:20:16 PM gaultier.github.io ↗

Comments (6)

dantillberg · 12m ago
I'd say it's fair to call this a footgun, though not a bug. The context is really only intended to apply to the goroutines. And Wait has to cancel the context to prevent a resource leak.

I suggest in general using function scoping to drive the lifetime of contexts, etc. This works also for defers and tracing spans in addition to the canonically shadowed `ctx` variable.

There is an old issue in the tracker proposing changes to alleviate this: https://github.com/golang/go/issues/34510. The original author (bcmills) of the errgroup package shared insight into the design choices/tradeoffs he made.

masklinn · 7m ago
> Shadowing is another concept that made this issue less visible.

There is no shadowing here, Go can only shadow in different scopes, `ctx` is just rebound (overwritten). Zig does not prevent reusing variables that I know. However I believe zig parameters are implicitly `const`, which would have prevented thoughtlessly updating the local in-place. Same in Rust.

> If you've ever heard of linear types, and never saw their utility, that's actually exactly what they are good for: a variable gets 'consumed' by a function, and the type system prevents us from using it after that point. Conceptually, g.Wait(ctx) consumes ctx and there is no point using this ctx afterwards.

I don't believe linear types would prevent this issue per-se since `errgroup.WithContext` specifically creates a derived context.

Depending on the affordance they might make the issue clearer but they also might not: a linear errgroup.WithContext could consume its input leaving you only with the sub-context and thinking little more of it, so you'd have to think to create derived context on the caller side. Would you think of this issue when doing that?

TBH this seems more of a design issue in errgroup's API: why is the context returned from the errgroup instead of being an attribute? That would limit namespace pollution and would make the relationship (and possible effects) much clearer. But I don't know what drove the API to be designed as it is.

nasretdinov · 1h ago
So it's really the issue due a partial variable shadowing during assignment, not errgroup-specific. Got bitten by it multiple times unfortunately, but I don't think there's an easy lint or vet check that could prevent such errors from happening. Having said that, overwriting a variable during := assignment is typically only useful for errors, so potentially you might want to have a lint check that complains when you've overwritten something else. There's already a check that the value is unused so when re-assigning the error value you're less likely to miss anything
eddythompson80 · 33m ago
No? Its about errgroup ctx being automatically cancelled once the group is done (in success or err). It kinda makes sense, and its documented. I understand how it can be surprising, but if you think about it for a second, it makes sense. The context created is meant to manage the lifecycle of the errgroup. Once the group is done, the ctx is done.
yubblegum · 43m ago
> So it's really the issue due a partial variable shadowing during assignment, not errgroup-specific.

Did we read the same article? The "alternative fix" is to paper over documented bahavior of a context. Of course it makes sense that if wait returns context should be canceled. The one character fix is throwing away information ..

dondraper36 · 42m ago
That's why I often use eg, egCtx :)