Fix 2090 child workflow goroutine leak #2200
Hi maintainers — this fork PR has “1 workflow awaiting approval”. Local verification:
Thanks!
Pinned go.uber.org/goleak to v1.1.12 to match CI’s module resolution (oldstable/stable) and avoid the “go: updates to go.mod needed” error.
Thanks for the contribution! The fix makes sense to me. But I don't think we want to add goleak as a dependency to our root |
Thanks Andrew — totally agreed. I’ll remove goleak from the root go.mod/go.sum and instead add the leak regression test under test/ (using test/go.mod where goleak already exists). Also agreed on consolidation: since CI doesn’t run -tags goleak, I’ll drop the split/tagged variant and keep a single test in test/ that reproduces the child-workflow-by-name path and verifies no goroutine leaks via goleak. Next push will:
Changes have been pushed. Please let me know if anything else should be adjusted.
yuandrew left a comment
Fix and test look good to me! Thanks again for the contribution 😃
What was changed
Fix a goroutine leak in the test workflow environment when executing child workflows by registered type name.
Made doneChannel close idempotent using sync.Once.
Ensured doneChannel is closed from both:
startMainLoop() (deferred in the root environment)
Complete() (explicit close on workflow completion)
Consolidated leak regression coverage into a single test under test/ (using test/go.mod, which already includes go.uber.org/goleak).
Removed the previous internal/* leak tests and removed goleak from the root go.mod/go.sum.
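The idempotent close described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: testEnv, closeOnce, closeDone, and isDone are hypothetical names, and only doneChannel comes from the description. It shows that wrapping close in sync.Once lets both the deferred path and the explicit path call it without a double-close panic.

```go
package main

import (
	"fmt"
	"sync"
)

// testEnv is a hypothetical stand-in for the test workflow environment;
// only the pieces relevant to the fix are modeled here.
type testEnv struct {
	doneChannel chan struct{}
	closeOnce   sync.Once
}

func newTestEnv() *testEnv {
	return &testEnv{doneChannel: make(chan struct{})}
}

// closeDone closes doneChannel at most once, however many callers reach it.
func (e *testEnv) closeDone() {
	e.closeOnce.Do(func() { close(e.doneChannel) })
}

// isDone reports whether doneChannel has been closed, without blocking.
func (e *testEnv) isDone() bool {
	select {
	case <-e.doneChannel:
		return true
	default:
		return false
	}
}

func main() {
	env := newTestEnv()
	defer env.closeDone() // mirrors the deferred close in the main loop
	env.closeDone()       // mirrors the explicit close on completion
	fmt.Println("closed:", env.isDone())
}
```

Without the sync.Once wrapper, the second close would panic with "close of closed channel", which is why closing from two places needs the guard.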
Why?
When executing a child workflow by registered type name, a child testWorkflowEnvironmentImpl could remain blocked in startMainLoop() waiting on doneChannel.
In some completion paths, doneChannel was not guaranteed to be closed for the child environment, causing leaked goroutines.
This change ensures:
doneChannel is always closed exactly once
No race or double-close panic
Child workflow test environments exit cleanly
Regression coverage verifies no goroutines remain after workflow completion
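The failure mode described above can be reproduced in miniature with a plain channel. The sketch below is an assumption-laden model rather than the SDK's implementation: mainLoop and loopExits are hypothetical names, and the loop only stands in for a main loop that waits on a done channel. It shows that the goroutine returns only if someone closes the channel.

```go
package main

import (
	"fmt"
	"time"
)

// mainLoop is a hypothetical stand-in for an environment's main loop:
// it runs queued callbacks until done is closed.
func mainLoop(callbacks <-chan func(), done <-chan struct{}) {
	for {
		select {
		case cb := <-callbacks:
			cb()
		case <-done:
			return
		}
	}
}

// loopExits starts mainLoop, optionally closes done, and reports whether
// the goroutine returned within a short, fixed timeout.
func loopExits(closeDone bool) bool {
	callbacks := make(chan func())
	done := make(chan struct{})
	exited := make(chan struct{})

	go func() {
		defer close(exited)
		mainLoop(callbacks, done)
	}()

	if closeDone {
		close(done)
	} else {
		// Release the goroutine after observing it, so the demo itself
		// does not leave a goroutine behind.
		defer close(done)
	}

	select {
	case <-exited:
		return true
	case <-time.After(100 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("exits when done is never closed:", loopExits(false))
	fmt.Println("exits when done is closed:", loopExits(true))
}
```

In the real bug the missing close happened only on some completion paths of the child environment, so the blocked goroutine was easy to miss without a leak detector.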
Checklist
go test ./... -count=1
(cd test && go test -run TestChildWorkflowByName_NoGoroutineLeak -count=1 -v)
Verified no startMainLoop goroutine remains after workflow completion.
No.
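The manual check in the list above (no startMainLoop goroutine remains) can be approximated with the standard library alone. The PR itself relies on go.uber.org/goleak for this; the sketch below is a stdlib-only approximation, and allStacks, goroutineIn, and waitFor are hypothetical helpers. The local startMainLoop is a stand-in that merely blocks on a channel.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// startMainLoop is a hypothetical stand-in that blocks until done is closed.
//
//go:noinline
func startMainLoop(done <-chan struct{}) {
	<-done
}

// allStacks returns the stack traces of every live goroutine.
func allStacks() string {
	buf := make([]byte, 1<<16)
	for {
		n := runtime.Stack(buf, true)
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf)) // trace was truncated; retry with more room
	}
}

// goroutineIn reports whether any goroutine has a frame for fn on its stack.
func goroutineIn(fn string) bool {
	return strings.Contains(allStacks(), fn+"(")
}

// waitFor polls cond for up to one second and reports whether it became true.
func waitFor(cond func() bool) bool {
	deadline := time.Now().Add(time.Second)
	for time.Now().Before(deadline) {
		if cond() {
			return true
		}
		time.Sleep(5 * time.Millisecond)
	}
	return cond()
}

func main() {
	done := make(chan struct{})
	go startMainLoop(done)

	running := waitFor(func() bool { return goroutineIn("startMainLoop") })
	close(done)
	gone := waitFor(func() bool { return !goroutineIn("startMainLoop") })

	fmt.Println("seen while blocked:", running)
	fmt.Println("gone after close:", gone)
}
```

Polling is used because a goroutine needs a moment to unwind after the channel is closed; a dedicated leak detector typically applies a similar retry before reporting a leak.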