Draft
35 commits
6f13233
Add new workflow to run python decentralized tests in the ci
IAvecilla Nov 14, 2025
f1633fd
Cancel in progress actions
IAvecilla Nov 27, 2025
63342f8
Spawn multiple clients with different gpus
IAvecilla Dec 1, 2025
c806ecd
integration tests: check for eval of client
dsocolobsky Jan 6, 2026
e3b3da8
add separate docker compose for python tests
dsocolobsky Jan 7, 2026
f5cf326
Merge branch 'main' into self-hosted-runner
IAvecilla Jan 23, 2026
91c2b5b
Fix use gpu check for the decentralized tests
IAvecilla Jan 23, 2026
23c9dde
Merge branch 'main' into self-hosted-runner
IAvecilla Jan 23, 2026
2bb9053
Add garnix cache as substituer in nix build step
IAvecilla Jan 23, 2026
1f59402
Revert "Add garnix cache as substituer in nix build step"
IAvecilla Jan 23, 2026
b3fa409
Allow build client image in self hosted
IAvecilla Jan 23, 2026
80daccb
Add Torchtitan test
IAvecilla Jan 24, 2026
0cffffa
Merge branch 'main' into self-hosted-runner
IAvecilla Jan 24, 2026
875cde7
Fix compilation errors
IAvecilla Jan 24, 2026
33467e5
Clean test
IAvecilla Jan 24, 2026
999f4f1
Add torchtitan test to the CI
IAvecilla Jan 24, 2026
819bbbd
Fix barrier for torchtitan parameters dist
IAvecilla Jan 26, 2026
23e45b6
Add logs on parameter sharing with ranks
IAvecilla Jan 26, 2026
c7a96fe
Merge branch 'main' into dy/ci-test-py-client-run-eval
dsocolobsky Jan 26, 2026
e4c5489
Merge branch 'self-hosted-runner' into dy/ci-test-py-client-run-eval
dsocolobsky Jan 26, 2026
2688884
Add more logs
IAvecilla Jan 26, 2026
a82613c
fix clippy warnings
dsocolobsky Jan 26, 2026
668c3a8
Fix order of parameters in distributed model
IAvecilla Jan 26, 2026
d6491c4
Fix test checks
IAvecilla Jan 26, 2026
0f275fa
Merge branch 'torchtitan-test-fix' into self-hosted-runner
IAvecilla Jan 26, 2026
f230799
Remove debug comments
IAvecilla Jan 26, 2026
544af58
Merge branch 'main' into self-hosted-runner
IAvecilla Jan 26, 2026
3cf44bc
Fix compilation after merge
IAvecilla Jan 27, 2026
d55a5f3
Refactor for decentralized python tests and utils
IAvecilla Jan 27, 2026
b22c0ae
Merge branch 'main' into self-hosted-runner
IAvecilla Jan 27, 2026
eabe619
Merge branch 'self-hosted-runner' into dy/ci-test-py-client-run-eval
dsocolobsky Jan 27, 2026
cbb8772
Fix missing use_proxies
IAvecilla Jan 27, 2026
b4d202f
Merge branch 'self-hosted-runner' into dy/ci-test-py-client-run-eval
dsocolobsky Jan 27, 2026
e91359b
fix justfile
dsocolobsky Jan 27, 2026
b125443
restore some files from self-hosted-runner branch
dsocolobsky Jan 27, 2026
15 changes: 13 additions & 2 deletions .github/actions/wait-for-garnix/action.yaml
@@ -12,6 +12,17 @@ inputs:
runs:
  using: 'composite'
  steps:
    - name: Install GitHub CLI
      shell: bash
      run: |
        if ! command -v gh &> /dev/null; then
          echo "Installing GitHub CLI..."
          sudo apt update
          sudo apt install gh -y
        else
          echo "GitHub CLI already installed"
        fi

    - name: Wait for All Garnix checks
      shell: bash
      env:
@@ -26,13 +37,13 @@ runs:
        for i in $(seq 1 $TOTAL_ATTEMPTS); do
          if [ -z "$GARNIX_SUITE_ID" ]; then
            GARNIX_SUITE_ID=$(gh api repos/${{ github.repository }}/commits/$SHA/check-suites --jq '.check_suites[] | select(.app.name == "Garnix CI") | .id')

            if [ -z "$GARNIX_SUITE_ID" ]; then
              echo "No Garnix CI check suite found yet, waiting... (attempt $i/$TOTAL_ATTEMPTS)"
              sleep 10
              continue
            fi

            echo "Found Garnix CI check suite: $GARNIX_SUITE_ID"
          fi

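The wait step in this action retries the check-suite lookup until an ID appears or the attempts run out. A minimal sketch of that retry pattern as a standalone bash function is shown here. `poll_until_nonempty` and its argument order are hypothetical names chosen for illustration and are not part of the action.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the action): run a command repeatedly until it
# prints a non-empty value, or give up after max_attempts tries.
#   $1 = max attempts, $2 = delay in seconds, remaining args = command to run
poll_until_nonempty() {
  local max_attempts="$1" delay_seconds="$2"
  shift 2
  local attempt value
  for attempt in $(seq 1 "$max_attempts"); do
    # "|| true" keeps a failing lookup from aborting the loop under "set -e"
    value="$("$@" 2>/dev/null || true)"
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    # Progress goes to stderr so stdout carries only the final value
    echo "No result yet, waiting... (attempt $attempt/$max_attempts)" >&2
    sleep "$delay_seconds"
  done
  return 1
}
```

Under these assumptions, the `gh api ... --jq ...` lookup from the action could be passed as the command, with its output captured into `GARNIX_SUITE_ID`. Returning a non-zero status once the attempts are exhausted lets the caller decide whether a missing check suite should fail the job.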
67 changes: 67 additions & 0 deletions .github/workflows/solana-integration-test-run.yml
@@ -4,6 +4,11 @@ on:
    branches: [main]
  pull_request:
    branches: [main, '**']

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  # First, build the validator image and cache it
  build-validator:
@@ -36,3 +41,65 @@ jobs:
    with:
      test-name: ${{ matrix.test-name }}
    secrets: inherit

  decentralized-integration-python-test:
    runs-on: self-hosted
    needs: build-validator

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Get Validator Image from cache
        id: cache-validator
        uses: actions/cache/restore@v4
        with:
          path: validator-image.tar.gz
          key: validator-image-${{ runner.os }}-${{ hashFiles('shared/coordinator/src/coordinator.rs', 'architectures/decentralized/solana-coordinator/**/*.rs', 'architectures/decentralized/solana-coordinator/**/*.toml', 'architectures/decentralized/solana-coordinator/Cargo.lock', 'architectures/decentralized/solana-authorizer/**/*.rs', 'architectures/decentralized/solana-authorizer/**/*.toml', 'architectures/decentralized/solana-authorizer/Cargo.lock', 'docker/test/psyche_solana_validator_entrypoint.sh', 'nix/docker.nix', 'flake.lock') }}
          fail-on-cache-miss: true

      - name: Load Validator Image
        run: |
          echo "Loading validator image from cache"
          docker load < validator-image.tar.gz
          docker images | grep psyche-solana-test-validator

          echo "Disk usage after loading validator"
          df -h

      - name: Clean up validator tar file
        run: |
          # Remove the compressed validator image to free up disk space
          rm -f validator-image.tar.gz
          echo "Disk usage after removing validator tar"
          df -h

      - name: Download Solana Test Client Python Image
        run: |
          echo "Disk space before client build"
          df -h

          sleep 500
          # Calculate the derivation hash
          echo "Calculating derivation path"
          DRV_PATH=$(nix eval --raw .#docker-psyche-solana-test-client.drvPath)
          echo "Derivation path: $DRV_PATH"

          OUT_PATH=$(nix derivation show $DRV_PATH | jq -r '.[].outputs.out.path')
          echo "Output path: $OUT_PATH"

          # download from Garnix cache first
          echo "Attempting to fetch from Garnix cache"
          nix-store --realise $OUT_PATH --option substitute true

          # Load the image into Docker
          $OUT_PATH | docker load

          echo "Disk space after client build"
          df -h

      - name: Run decentralized integration test
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          nix develop .#dev-python --command bash -c "cargo test --release --features python,parallelism -p psyche-decentralized-testing --test integration_tests -- --nocapture test_big_model_with_sidecars test_torchtitan_p2p_model_share"
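
The client image step runs the realised store path and pipes its output into `docker load`, which implies the output is an executable that streams the image to stdout. A minimal sketch of a guard around that call is shown here, assuming exactly that behaviour. `stream_image` is a hypothetical helper name and does not appear in the workflow.

```bash
#!/usr/bin/env bash
# pipefail makes a failure on the left side of "a | b" fail the whole pipeline
set -euo pipefail

# Hypothetical helper (not in the workflow): run an image-streaming script,
# but fail with a clear message if the path is missing or not executable.
stream_image() {
  local image_script="$1"
  if [ ! -x "$image_script" ]; then
    echo "error: $image_script is missing or not executable" >&2
    return 1
  fi
  "$image_script"
}

# Intended use, assuming OUT_PATH was realised as in the step above:
#   stream_image "$OUT_PATH" | docker load
```

The explicit check turns a failed cache fetch into a readable error instead of a "command not found" from the shell, and `pipefail` keeps `docker load` from masking a broken stream.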
2 changes: 1 addition & 1 deletion architectures/decentralized/solana-authorizer/Cargo.lock

8 changes: 4 additions & 4 deletions architectures/decentralized/solana-coordinator/Cargo.lock
