
ci: gh action release automation #2203

Draft
SimonRastikian wants to merge 26 commits into main from simon/gh-action-release-automation

Conversation

@SimonRastikian
Contributor

@SimonRastikian SimonRastikian commented Feb 20, 2026

Closes #1971 .

This should be combined with PR #2200 for full automation with minimal manual setup.

This PR triggers a GH action on main:

  • If the merged PR comes from a branch named release/vX.X.X (with proper semver), it launches the task
  • The task first creates a tag with the proper version
  • It then builds the three necessary Docker images
  • It collects the ID and digest of each
  • It builds the contract and renames it
  • It collects the hash value of the contract
  • It makes a tar file of it
  • It creates a draft release with the changelog pulled from CHANGELOG.md
  • It adds all the IDs, digests, and hash values to the draft.
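The branch check in the first step could be sketched roughly like this (the branch value here is an example; in the workflow it would come from the merged PR's head ref):

```shell
# Sketch: detect a release branch and extract the semver part.
BRANCH="release/v1.4.2"   # example value, not read from a real event
if printf '%s\n' "$BRANCH" | grep -Eq '^release/v[0-9]+\.[0-9]+\.[0-9]+$'; then
  VERSION="${BRANCH#release/v}"
  echo "releasing $VERSION"
else
  echo "not a release branch, skipping"
fi
```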

The only real problem is that I do not see a proper way of testing this without merging into main or merging a dummy release/v0.0.0 PR.
Any suggestions?

@claude

claude bot commented Feb 20, 2026

PR title type suggestion: This PR adds a GitHub Actions workflow for release automation, which is a CI/CD pipeline change. The type prefix should probably be ci: instead of build:.

Suggested title: ci: add release automation workflow

@claude

claude bot commented Feb 20, 2026

Code Review

Critical Issues

  1. Race condition: Fixed 60-second sleep is unreliable for waiting on dispatched workflows (.github/workflows/release.yml:130-131)

    The retag-docker-images job dispatches three workflow runs via gh workflow run and then does a fixed sleep 60 before inspecting the images with skopeo. These dispatched workflows need to be queued, get a runner, and pull/push Docker images. If any of these steps takes longer than 60 seconds (which is plausible under load or with large images), the skopeo inspect step will either fail or return stale/missing data, silently producing incorrect digests in the release notes.

    Suggested fix: Use gh run watch or poll gh run list --workflow=<name> until all three dispatched runs reach completed status before proceeding. Example approach:

    # After dispatching, find the run ID and wait
    sleep 5  # brief pause for the run to appear
    RUN_ID=$(gh run list --workflow=docker_node_release.yml --limit=1 --json databaseId -q '.[0].databaseId')
    gh run watch "$RUN_ID" --exit-status
  2. skopeo inspect --raw may not return config.digest as Image ID (.github/workflows/release.yml:143)

    For multi-arch manifest lists (OCI index), skopeo inspect --raw returns the manifest list, not a single image manifest. The .config.digest field exists on image manifests, not on manifest lists. If the images are multi-arch, jq -r '.config.digest' will return null. Even for single-arch images, --raw returns the raw manifest (with layers and config), so .config.digest should work — but this assumption should be validated. If images could be multi-arch, you'll need to resolve the platform-specific manifest first.

  3. No version format validation (.github/workflows/release.yml:52)

    The version is extracted with a simple ${BRANCH#release/v} substitution. A branch like release/vfoo-bar would produce version foo-bar, tag vfoo-bar, and the workflow would proceed to create a tag, trigger Docker retags, and create a draft release with an invalid version. Consider adding a regex check:

    if [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$ ]]; then
      echo "::error::Invalid semver: $VERSION"
      exit 1
    fi

Minor Issues

  • Dispatched workflows not checked for success (.github/workflows/release.yml:104-126): Even after waiting, the workflow doesn't verify whether the dispatched retag workflows succeeded. If a retag fails (e.g., source image doesn't exist), the release job proceeds with empty/null digest values. The gh run watch --exit-status approach above would also address this.

  • release-notes-raw.md could be empty (.github/workflows/release.yml:237): If the version header isn't found in CHANGELOG.md (e.g., it wasn't updated as part of the release PR), awk produces an empty file. The release would be created with only Docker/contract info and no changelog. A check would prevent accidental empty releases.
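The empty-changelog guard could look like this (a sketch: the `## [X.Y.Z]` header format and the stand-in file contents are assumptions, not necessarily what the workflow's CHANGELOG.md uses):

```shell
# Stand-in CHANGELOG so the sketch runs offline (assumed header format).
printf '%s\n' '## [1.4.2]' '- fix a' '- fix b' '## [1.4.1]' '- old entry' > CHANGELOG.md

VERSION="1.4.2"
# Print everything between this version's header and the next header.
awk -v v="$VERSION" '
  $0 ~ ("^## \\[" v "\\]") { found = 1; next }
  /^## \[/                 { found = 0 }
  found
' CHANGELOG.md > release-notes-raw.md

# Fail hard instead of shipping a release with an empty body.
if ! [ -s release-notes-raw.md ]; then
  echo "::error::No changelog section found for $VERSION"
  exit 1
fi
cat release-notes-raw.md
```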

⚠️ Issues found — the hardcoded sleep race condition (#1) is the most significant, as it can produce incorrect release artifacts.

@claude

claude bot commented Feb 20, 2026

PR title type suggestion: This PR modifies only a GitHub Actions workflow file, so the type prefix should be ci: instead of build:.

Suggested title: ci: gh action release automation

@claude

claude bot commented Feb 20, 2026

PR title type suggestion: This PR changes only CI/CD workflow files, so the type prefix should probably be ci: instead of build:.
Suggested title: ci: gh action release automation

@SimonRastikian SimonRastikian changed the title from "build: gh action release automation" to "ci: gh action release automation" Feb 20, 2026
@SimonRastikian
Contributor Author

@claude review this PR

@claude

claude bot commented Feb 20, 2026

Code Review

The previous review's critical issues (#1 race condition with sleep, #2 skopeo multi-arch, #3 version validation) have all been addressed in the current version — semver validation is present, gh run watch --exit-status replaces the fixed sleep, --override-os/--override-arch handles multi-arch manifests, and the empty changelog check is now a hard failure. Good improvements.

New Critical Issue

docker_node_release.yml concurrency group will cancel the first dispatch when the second is triggered (lines 115-140)

The docker_node_release.yml workflow has:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

The release workflow dispatches this same workflow twice in sequence:

  1. First for mpc-node-gcp (line 115)
  2. Then for mpc-node (line 128)

Both workflow_dispatch calls target the same workflow on the same default branch ref. This means they share the same concurrency group. When the second dispatch starts, cancel-in-progress: true will cancel the first run (the mpc-node-gcp retag).

The result: mpc-node-gcp may not get retagged, the captured run_id for node-gcp may point to a cancelled run, and gh run watch would report failure — or worse, the mpc-node-gcp retag silently doesn't happen and the digest step returns stale data.

Fix options:

  • (Preferred) Change the concurrency group in docker_node_release.yml to include the repository input so the two dispatches don't collide:
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event.inputs.repository }}
      cancel-in-progress: true
  • Or dispatch them sequentially with gh run watch between each, so the first completes before the second starts. But this is slower and the concurrency fix is more robust.

Minor Issues

  • Run ID capture race (lines 107, 122, 136): gh run list --limit=1 after sleep 5 assumes the most recent run for that workflow is the one we just dispatched. If another actor dispatches the same workflow in that 5-second window, you'd capture the wrong run ID. This is unlikely but could be made more robust by filtering on the headBranch or by using --json databaseId,createdAt and matching against the dispatch time. Low risk but worth noting.

  • skopeo inspect without Docker Hub auth (line 164): The retag-docker-images job doesn't log into Docker Hub before running skopeo inspect. If any of the nearone/* images are in a private registry or if Docker Hub rate limiting kicks in, the inspect will fail. The dispatched retag workflows do log in, but this parent job doesn't inherit that. Consider adding a Docker login step before the digest collection.
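A minimal sketch of the authenticated lookup follows. The registry call itself is shown as a comment since it needs network access and credentials; the secret names and image tag are assumptions, and a stand-in manifest makes the digest extraction runnable offline:

```shell
# In the workflow, the inspect would authenticate against Docker Hub, e.g.:
#   skopeo inspect --creds "$DOCKERHUB_USERNAME:$DOCKERHUB_TOKEN" \
#     docker://nearone/mpc-node:1.4.2 > manifest.json
# Stand-in manifest (shape assumed) so the parsing below runs offline.
cat > manifest.json <<'EOF'
{"config": {"digest": "sha256:abc123"}}
EOF

IMAGE_ID=$(jq -r '.config.digest' manifest.json)
echo "image id: $IMAGE_ID"
```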

⚠️ The concurrency cancellation issue is a merge blocker — it will cause one of the two docker_node_release.yml dispatches to be cancelled on every release.

@SimonRastikian
Contributor Author

@claude review this PR

@claude

claude bot commented Feb 20, 2026

Code Review

Previous critical issues (sleep race condition, concurrency cancellation, version validation, empty changelog handling, multi-arch skopeo) have all been addressed. The concurrency group fix in docker_node_release.yml correctly differentiates the two dispatches by including the repository input.

Remaining Issues

  1. Missing Docker Hub authentication for skopeo inspect (release.yml, Get image digests step)

    The retag-docker-images job runs skopeo inspect against docker://nearone/* on Docker Hub without logging in. The child retag workflows authenticate, but this parent job does not. Docker Hub enforces anonymous pull rate limits (100 per 6 hours per IP for unauthenticated requests). Since this step makes 6 registry calls (2 per image x 3 images), a shared runner IP that has other Docker Hub traffic could hit this limit, causing the release to fail intermittently.

    Fix: Add a Docker login step before the digest collection:

    - name: Login to Docker Hub
      uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
      with:
        username: ${{ secrets.DOCKERHUB_USERNAME }}
        password: ${{ secrets.DOCKERHUB_TOKEN }}
  2. Run ID capture could pick up wrong workflow run (release.yml, lines ~107, ~122, ~136)

    After gh workflow run + sleep 5, gh run list --limit=1 grabs the most recent run for that workflow. If another dispatch (manual or automated) happens in that window, the wrong run ID is captured. Low probability, but the consequence is waiting on (and reporting digests from) the wrong build. A more robust approach would be to filter by headBranch or event, or to compare timestamps:

    DISPATCH_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    gh workflow run ...
    sleep 5
    RUN_ID=$(gh run list --workflow=docker_node_release.yml --repo "$REPO" \
      --json databaseId,createdAt -q \
      "[.[] | select(.createdAt >= \"$DISPATCH_TIME\")] | .[0].databaseId")

Neither of these is a hard merge blocker, but #1 is worth addressing to avoid flaky releases.

✅ Approved — the critical issues from prior reviews have been fixed. The two items above are recommendations for reliability.

@netrome
Collaborator

netrome commented Feb 23, 2026

Only real problem is that I do not see a proper way of testing this without merging into main and without merging a dummy release/v0.0.0 PR.
Any suggestions?

Unfortunately I don't have any suggestions. This is a big problem with GitHub Actions imo.

@netrome
Collaborator

netrome commented Feb 23, 2026

If the merged PR comes from a branch named release/vX.X.X (with proper semver) it launches the task
The task first creates a tag with the proper version.

This feels unnecessary to me. I'd rather have the developer create the tag, and the workflow can be triggered once the tag is pushed.

Automatically triggering the release on merge can be quite risky since we're not fully controlling the merge order of PRs due to the merge queue. There's a risk other changes get merged first, at which point the release changelog would be wrong and the release could include unintended behavior.
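The manual flow described here boils down to a tag push. A sketch, demonstrated in a throwaway repository so it runs anywhere (the version and identity values are examples; in practice the tag goes on the release commit on main):

```shell
set -e
# Throwaway repo for demonstration; real usage targets the project repo.
WORKDIR=$(mktemp -d)
cd "$WORKDIR"
git init -q
git -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "release prep"

VERSION="1.4.2"   # example version
git -c user.email=dev@example.com -c user.name=dev \
  tag -a "v$VERSION" -m "Release v$VERSION"
git tag -l 'v*'
# Pushing the tag would then trigger a workflow filtered on tags like v*.*.*:
#   git push origin "v$VERSION"
```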

Collaborator

@netrome netrome left a comment


Explicitly requested the change we discussed. Will review this deeper once that's updated.

@SimonRastikian SimonRastikian requested review from gilcu3 and netrome March 2, 2026 09:18
@SimonRastikian
Contributor Author

@netrome @gilcu3

Collaborator

@netrome netrome left a comment


The tag filter should be updated

Contributor

@gilcu3 gilcu3 left a comment


We need to install cargo-near

Comment on lines +230 to +231
- name: Build contract
  run: cargo near build reproducible-wasm --manifest-path crates/contract/Cargo.toml
Contributor


how will this work without installing cargo-near first?

@SimonRastikian SimonRastikian self-assigned this Mar 9, 2026
@SimonRastikian SimonRastikian marked this pull request as draft March 19, 2026 09:36

Development

Successfully merging this pull request may close these issues.

Create an automated bash script for the Release

3 participants