Merged
169 changes: 7 additions & 162 deletions .github/workflows/gpu-test.yml
@@ -9,27 +9,13 @@ on:
- 'lib/**'
- 'src/**'
- 'test/**'
- 'ci/**'
- 'CMakeLists.txt'
workflow_dispatch:
inputs:
skip_build:
description: 'Skip build step (use existing artifacts)'
type: boolean
default: false
workflow_call:
inputs:
skip_build:
description: 'Skip build step (packages already built by caller)'
type: boolean
default: true

jobs:
# Build CUDA package for testing
# Skipped when called from release.yml (packages already built)
build-cuda-package:
name: Build linux-x64-cuda
if: ${{ inputs.skip_build != true }}
if: ${{ github.repository == 'lloyal-ai/lloyal.node' }}
runs-on: ubuntu-22.04
Comment on lines 13 to 19

Copilot AI Feb 12, 2026
skip_build is still exposed as a workflow_dispatch input, but the workflow no longer has a workflow_call path and there is no mechanism here to provide “existing artifacts” when build-cuda-package is skipped. As-is, dispatching with skip_build: true will likely leave gpu-integration without the package-linux-x64-cuda artifact. Consider removing this input, or adding logic to fetch artifacts from a known source when build is skipped.
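One way to make the dispatch path consistent (a sketch, not the author's confirmed intent) is to drop the input entirely, so every manual dispatch rebuilds the package:

```yaml
# Hypothetical sketch for .github/workflows/gpu-test.yml — removes the
# now-unsatisfiable skip_build path so a manual dispatch always runs
# build-cuda-package and gpu-integration always has its artifact.
on:
  push:
    paths:
      - 'lib/**'
      - 'src/**'
      - 'test/**'
  workflow_dispatch: {}   # no inputs: every dispatch builds fresh
```

Alternatively, a download step gated on the input could fetch package-linux-x64-cuda from an earlier run, but that needs a known run ID to pull from.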

steps:
@@ -83,155 +69,14 @@ jobs:
retention-days: 1
compression-level: 0

# GPU Integration Tests via Cloud Run
# Runs real GPU tests on NVIDIA L4
#
# L4 GPU Requirements (as of 2024):
# - Driver: 535.216.03 (supports CUDA 12.2.2 max)
# - Minimum: 4 CPU, 16 GiB memory
# - Regions: us-central1, us-east4, europe-west1, europe-west4, asia-southeast1
# - Quota: 3 L4 GPUs per region (default)
# GPU Integration Tests via Cloud Run (L4)
# Infrastructure details are in the private lloyal-infra repo
gpu-integration:
name: GPU Tests (L4)
needs: build-cuda-package
runs-on: ubuntu-latest
# Run if build succeeded OR was skipped (packages from caller)
if: ${{ !cancelled() && (needs.build-cuda-package.result == 'success' || needs.build-cuda-package.result == 'skipped') }}

if: ${{ github.repository == 'lloyal-ai/lloyal.node' && needs.build-cuda-package.result == 'success' }}
uses: lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml@main
Copilot AI Feb 12, 2026
This workflow calls a reusable workflow from lloyal-ai/lloyal-infra pinned to @main. To avoid unexpected behavior changes and reduce supply-chain risk, pin to a tag or commit SHA instead of a moving branch reference.

Suggested change
uses: lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml@main
uses: lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml@v1
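A commit pin keeps the reference immutable even if a tag is later moved; a sketch (the SHA below is a placeholder, not a real lloyal-infra commit):

```yaml
# Placeholder SHA — resolve the real one with:
#   git ls-remote https://github.com/lloyal-ai/lloyal-infra refs/tags/v1
uses: lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml@2f4d0c8e1a7b9d3c5e6f0a1b2c3d4e5f6a7b8c9d  # v1
```

The trailing comment records which tag the SHA corresponds to, so the pin can be updated intentionally.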

secrets: inherit
Copilot AI Feb 12, 2026
secrets: inherit forwards all secrets to the reusable workflow. If only specific secrets are required, map them explicitly to reduce unnecessary secret exposure to the called workflow.

Suggested change
secrets: inherit
secrets:
# TODO: Restrict this list to only the secrets required by
# lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml
# Example mappings (replace with actual required secrets):
# CLOUD_RUN_SERVICE_ACCOUNT_KEY: ${{ secrets.CLOUD_RUN_SERVICE_ACCOUNT_KEY }}
# GCP_PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
# GCP_REGION: ${{ secrets.GCP_REGION }}
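Explicit mapping only works if the called workflow declares the secrets it accepts; the callee side would look roughly like this (hypothetical, since gpu-integration.yml lives in the private infra repo):

```yaml
# Hypothetical secret declarations in lloyal-infra's gpu-integration.yml.
# A caller mapping secrets explicitly must match these names.
on:
  workflow_call:
    secrets:
      GCP_WIF_PROVIDER:
        required: true
      GCP_SA_EMAIL:
        required: true
      GCP_PROJECT_ID:
        required: true
```

With `secrets: inherit`, the caller bypasses this contract entirely, which is the exposure the comment flags.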

permissions:
contents: read
id-token: write # Required for Workload Identity Federation

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Authenticate to GCP
uses: google-github-actions/auth@v2
with:
workload_identity_provider: ${{ secrets.GCP_WIF_PROVIDER }}
service_account: ${{ secrets.GCP_SA_EMAIL }}

- name: Set up Cloud SDK
uses: google-github-actions/setup-gcloud@v2

- name: Configure Docker for Artifact Registry
run: gcloud auth configure-docker us-east4-docker.pkg.dev --quiet

- name: Download package artifact
uses: actions/download-artifact@v4
with:
name: package-linux-x64-cuda
path: packages/package-linux-x64-cuda

- name: Build GPU test image
run: |
IMAGE="us-east4-docker.pkg.dev/${{ secrets.GCP_PROJECT_ID }}/lloyal-ci/gpu-tests:${{ github.sha }}-cuda"
docker build \
-f ci/Dockerfile.gpu-tests \
-t "$IMAGE" .
docker push "$IMAGE"
echo "IMAGE=$IMAGE" >> $GITHUB_ENV

- name: Deploy Cloud Run Job
run: |
JOB_NAME="lloyal-gpu-test-cuda"

# Check if job exists
if gcloud run jobs describe $JOB_NAME --region=us-east4 2>/dev/null; then
gcloud run jobs update $JOB_NAME \
--region=us-east4 \
--image="${IMAGE}" \
--service-account="${{ secrets.GCP_SA_EMAIL }}" \
--set-env-vars=LLOYAL_GPU=cuda,LLOYAL_NO_FALLBACK=1 \
--task-timeout=20m \
--no-gpu-zonal-redundancy
else
gcloud run jobs create $JOB_NAME \
--region=us-east4 \
--image="${IMAGE}" \
--service-account="${{ secrets.GCP_SA_EMAIL }}" \
--set-env-vars=LLOYAL_GPU=cuda,LLOYAL_NO_FALLBACK=1 \
--task-timeout=20m \
--gpu=1 \
--gpu-type=nvidia-l4 \
--memory=16Gi \
--cpu=4 \
--max-retries=0 \
--no-gpu-zonal-redundancy
fi

- name: Run GPU tests
run: |
JOB_NAME="lloyal-gpu-test-cuda"
REGION="us-east4"

# Launch job asynchronously so we can stream logs
EXEC=$(gcloud run jobs execute $JOB_NAME \
--region=$REGION \
--async \
--format='value(metadata.name)')

echo "Execution: $EXEC"
echo "Streaming logs (container startup may take ~30s)..."
echo ""

# Filter for this specific execution's logs
LOG_FILTER="resource.type=\"cloud_run_job\" AND resource.labels.job_name=\"$JOB_NAME\" AND labels.\"run.googleapis.com/execution_name\"=\"$EXEC\""

# Poll loop: stream new log lines + check for completion
SEEN=0
while true; do
# Check if execution has completed
COMPLETION=$(gcloud run jobs executions describe "$EXEC" \
--region="$REGION" \
--format='value(status.completionTime)' 2>/dev/null || true)

# Fetch all logs for this execution in chronological order
LOGS=$(gcloud logging read "$LOG_FILTER" \
--limit=10000 \
--order=asc \
--format='value(textPayload)' 2>/dev/null || true)

# Print only lines we haven't seen yet
if [ -n "$LOGS" ]; then
TOTAL=$(echo "$LOGS" | wc -l | tr -d ' ')
if [ "$TOTAL" -gt "$SEEN" ]; then
echo "$LOGS" | tail -n +$((SEEN + 1))
SEEN=$TOTAL
fi
fi

# If done, do one final fetch for stragglers then break
if [ -n "$COMPLETION" ]; then
sleep 5
LOGS=$(gcloud logging read "$LOG_FILTER" \
--limit=10000 \
--order=asc \
--format='value(textPayload)' 2>/dev/null || true)
if [ -n "$LOGS" ]; then
TOTAL=$(echo "$LOGS" | wc -l | tr -d ' ')
if [ "$TOTAL" -gt "$SEEN" ]; then
echo "$LOGS" | tail -n +$((SEEN + 1))
fi
fi
break
fi

sleep 10
done

# Determine pass/fail from execution status
SUCCEEDED=$(gcloud run jobs executions describe "$EXEC" \
--region="$REGION" \
--format=json 2>/dev/null | \
jq -r '.status.conditions[] | select(.type == "Completed") | .status')

if [ "$SUCCEEDED" = "True" ]; then
echo ""
echo "✅ GPU Tests Passed"
else
echo ""
echo "❌ GPU Tests Failed"
exit 1
fi
id-token: write
6 changes: 3 additions & 3 deletions .github/workflows/release.yml
@@ -307,15 +307,15 @@ jobs:
path: packages/${{ matrix.package }}/
retention-days: 1

# GPU Integration Tests (reusable workflow)
# GPU Integration Tests (reusable workflow from private infra repo)
gpu-tests:
name: GPU Tests
needs: build-and-test
uses: ./.github/workflows/gpu-test.yml
uses: lloyal-ai/lloyal-infra/.github/workflows/gpu-integration.yml@main
secrets: inherit
Copilot AI Feb 12, 2026
secrets: inherit passes all repository/environment secrets to the called workflow. If the infra workflow only needs a small set (e.g., GCP project/service account/provider), prefer explicitly mapping only the required secrets to minimize blast radius if the called workflow changes.

Suggested change
secrets: inherit
secrets:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

permissions:
contents: read
id-token: write # Required for GCP Workload Identity Federation
id-token: write
Comment on lines +310 to +318

Copilot AI Feb 12, 2026
The reusable workflow is referenced from another repository using @main. This is brittle (upstream changes can break releases unexpectedly) and increases supply-chain risk. Prefer pinning the reusable workflow to a tagged release or a specific commit SHA and updating intentionally.


publish:
name: Publish all packages
3 changes: 3 additions & 0 deletions .github/workflows/tests.yml
@@ -64,6 +64,9 @@ jobs:

- name: Build from submodules
run: npm run build
env:
# Force CPU — GitHub Actions paravirtual Metal GPU has driver bugs
LLOYAL_GPU: cpu
# This runs scripts/build.js which:
# 1. Builds llama.cpp from llama.cpp/
# 2. Builds liblloyal from liblloyal/
5 changes: 4 additions & 1 deletion .gitignore
@@ -38,4 +38,7 @@ Thumbs.db

tmp/

packages/darwin-arm64
packages/darwin-arm64

# CI infra scripts (injected from lloyal-infra during CI)
ci/
53 changes: 0 additions & 53 deletions ci/Dockerfile.gpu-tests

This file was deleted.
