CUBE-92 - Add cloud compose and deployment workflow #111
base: main
dafd9d0 to 809d2a0
This should depend on the image publication workflow.
hal/buildroot/qemu.sh
Outdated
      -fsdev local,id=cert_fs,path=$CERTS_PATH,security_model=mapped \
      -device virtio-9p-pci,fsdev=cert_fs,mount_tag=certs_share \
    - -device vhost-vsock-pci,guest-cid=6 \
    + -device vhost-vsock-pci,guest-cid=42 \
we don't need or use vsock
    CONFIG_VIRTIO_BLK=y
    CONFIG_VIRTIO_NET=y
use the same config as cocos
df3a318 to f5472d1
buildroot changes are not expected in this pr
cc32851 to ce96647
...linux/board/cube/overlay/etc/systemd/system/multi-user.target.wants/mount-host-certs.service
Outdated
hal/buildroot/linux/board/cube/overlay/usr/lib/systemd/system/mount-host-certs.service
Outdated
hal/buildroot/linux/board/cube/overlay/usr/local/bin/mount-host-certs.sh
Outdated
    print_help
    exit 1
    ;;
    esac
    \ No newline at end of file
Suggested change: add a trailing newline after esac. Same for any other instances.
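A minimal sketch of how the missing trailing newline could be detected and fixed for a given file. The helper name ensure_trailing_newline is hypothetical and not part of the repository.

```shell
# Sketch: append a newline to a file only if its last byte is not already one.
ensure_trailing_newline() {
    # tail -c 1 prints the last byte; command substitution strips a trailing
    # newline, so a non-empty result means the file does not end with one.
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        echo >> "$1"
    fi
}
```

Calling it on each touched script (for example ensure_trailing_newline hal/buildroot/qemu.sh) is idempotent, so it is safe to run more than once.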
      -machine q35 \
      -enable-kvm \
    - -netdev user,id=vmnic,hostfwd=tcp::6190-:22,hostfwd=tcp::6191-:80,hostfwd=tcp::6192-:443,hostfwd=tcp::6193-:6193,dns=8.8.8.8 \
    + -netdev user,id=vmnic,hostfwd=tcp::6190-:22,hostfwd=tcp::6191-:80,hostfwd=tcp::6192-:443,hostfwd=tcp::6193-:7001,hostfwd=tcp::6194-:11434,hostfwd=tcp::6195-:8000,dns=8.8.8.8 \
Since this same config is used in our CVM, make sure no one can SSH into it, for example with the root password.
hal/terraform/azure/main.tf
Outdated
We do not need to duplicate the Terraform scripts since we can reuse the ones in cocos-infra; the only thing that needs to change is the cloud-init script.
    priority = 1
    [http.routers.cube-ui.tls]
    certResolver = "letsEncrypt"
    [[http.routers.cube-ui.tls.domains]]
The dynamic.toml file uses CUBE_DOMAIN placeholders (lines 9, 16, 19), but no substitution mechanism exists in the codebase. The main compose.yaml uses dynamic.yaml (without placeholders), but cloud-compose.yaml mounts dynamic.toml directly. This will cause deployment failures. Either:
1. Add a CUBE_DOMAIN variable to .env and preprocess dynamic.toml before deployment (using envsubst in a Makefile target or entrypoint script), or
2. Create a properly templated dynamic.toml with actual domains for cloud deployments.
The same applies to the other placeholders added throughout the code.
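The preprocessing option could look roughly like the sketch below. It assumes the placeholders are spelled ${CUBE_DOMAIN} (the exact placeholder syntax in dynamic.toml is not shown in this thread), and render_template is a hypothetical helper name. It uses sed instead of envsubst so that it only depends on standard tools; where gettext is installed, envsubst '${CUBE_DOMAIN}' < template > output would do the same job.

```shell
# Sketch: render a template by replacing ${CUBE_DOMAIN} with a real domain.
# Assumption: placeholders are written as ${CUBE_DOMAIN}; adjust the pattern if not.
render_template() {
    template="$1"
    output="$2"
    domain="$3"

    if [ -z "$domain" ]; then
        echo "CUBE_DOMAIN is empty; refusing to render $template" >&2
        return 1
    fi

    # Escape characters that are special in a sed replacement: \, & and the | delimiter.
    escaped=$(printf '%s' "$domain" | sed 's/[\\&|]/\\&/g')

    # \$ keeps the shell from expanding the placeholder; sed sees it literally.
    sed "s|\${CUBE_DOMAIN}|${escaped}|g" "$template" > "$output"
}
```

A deployment step could then call render_template with the template path, the output path mounted by cloud-compose.yaml, and "$CUBE_DOMAIN" read from .env. Failing on an empty domain avoids silently shipping a config with blank host rules.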
hal/buildroot/cvm-monitor.sh
Outdated
    CHECK_INTERVAL=30
    LOG_DIR="/tmp/cube-logs"
    LOG_FILE="$LOG_DIR/cube-cvm-monitor.log"
    QEMU_SCRIPT="/home/washington/cube/hal/buildroot/qemu.sh"
Fix this line (we work on a public repo): it hardcodes a path inside a personal home directory. Also document this script.
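One possible fix, sketched under the assumption that cvm-monitor.sh and qemu.sh live in the same directory (both are under hal/buildroot in this PR) and that the script is executed rather than sourced:

```shell
# Sketch: resolve qemu.sh relative to this script instead of a user's home directory.
# Assumes the script is executed (not sourced), so $0 points at the script itself.
SCRIPT_DIR="$(cd -- "$(dirname -- "$0")" >/dev/null && pwd)"

# Allow an override from the environment, falling back to the sibling script.
QEMU_SCRIPT="${QEMU_SCRIPT:-$SCRIPT_DIR/qemu.sh}"
```

Keeping the environment override means a deployment that installs the scripts elsewhere does not need to edit the file.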
hal/buildroot/cvm-monitor.sh
Outdated
    pgrep -f "qemu-system-x86_64" -l | head -5 || echo "No QEMU processes found"

    if [ -f "$PIDFILE" ]; then
        local pid=$(cat "$PIDFILE" 2>/dev/null)
The script uses local pid=$(...) inside the case statement's "status" block, which is not within a function. This will cause a runtime error, because local is only valid inside a function. Either move this logic to a function or use a regular variable.
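A minimal sketch of the first option, with the status logic moved into a function. The names show_status and main and the default PIDFILE path are hypothetical; the real script's variables may differ.

```shell
# Sketch: status logic inside a function, where `local` is valid.
# The default PIDFILE path below is a placeholder, not the script's real value.
PIDFILE="${PIDFILE:-/tmp/cube-cvm.pid}"

show_status() {
    local pid=""
    if [ -f "$PIDFILE" ]; then
        pid=$(cat "$PIDFILE" 2>/dev/null)
    fi

    # kill -0 sends no signal; it only checks that the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running (pid $pid)"
    else
        echo "not running"
        return 1
    fi
}

main() {
    case "${1:-}" in
        status) show_status ;;
        *) echo "usage: $0 status" >&2; return 2 ;;
    esac
}
```

The script would end with main "$@". Declaring and assigning pid on separate lines also keeps the exit status of cat from being masked by local.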
docker/traefik/letsencrypt/.gitkeep
Outdated
    @@ -0,0 +1,2 @@
    # Copyright (c) Ultraviolet
    # SPDX-License-Identifier: Apache-2.0
    \ No newline at end of file
Add a newline at the end of the file.
docker/traefik/ssl/certs/.gitkeep
Outdated
    @@ -0,0 +1,2 @@
    # Copyright (c) Ultraviolet
    # SPDX-License-Identifier: Apache-2.0
    \ No newline at end of file
Add a newline at the end of the file.
.github/workflows/deploy-cloud.yaml
Outdated
    type: choice
    options:
      - dev
      - staging
Prod and staging environments are supposed to run on k8s.
    jobs:
      deploy-cloud:
        runs-on: ubuntu-latest
        environment: ${{ inputs.environment || 'dev' }}
Is this environment being used?
Yes, when triggering the pipeline manually from github.
docker/.env
Outdated
      UV_CUBE_PROXY_SERVER_CERT=
      UV_CUBE_PROXY_SERVER_KEY=
    - UV_CUBE_AGENT_URL=http://cube-agent:8901
    + UV_CUBE_AGENT_URL=https://10.172.192.41:6193
Fix this: the agent URL is hardcoded to a specific IP address.
    push:
      branches:
        - main
        - cube-92
remove before merge
.github/workflows/deploy-cloud.yaml
Outdated
    fi
    # Initialize acme.json if it doesn't exist
    # if [ ! -f "traefik/ssl/certs/acme.json" ]; then
The conditional logic is commented out, so acme.json is overwritten with {} on every deployment.
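A sketch of what restoring the guard might look like, so that certificates already issued survive a redeploy. The function name init_acme_store is hypothetical; the {} initial content mirrors what the workflow currently writes.

```shell
# Sketch: create acme.json only when it is missing; never overwrite an existing one.
init_acme_store() {
    acme_file="$1"
    mkdir -p "$(dirname "$acme_file")"

    if [ ! -f "$acme_file" ]; then
        echo '{}' > "$acme_file"
    fi

    # Traefik rejects an ACME storage file whose permissions are wider than 600.
    chmod 600 "$acme_file"
}
```

In the workflow this would be called as init_acme_store traefik/ssl/certs/acme.json, matching the path in the commented-out check.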
This cloud-init file should be the same one used in the ubuntu dir. We don't need a tf directory as well; this can simply be a section in the docs.
hal/ubuntu/cube-agent-config.yml
Outdated
    content: |
      UV_CUBE_AGENT_LOG_LEVEL=info
      UV_CUBE_AGENT_HOST=0.0.0.0
      UV_CUBE_AGENT_PORT=7001
      UV_CUBE_AGENT_INSTANCE_ID=cube-agent-01
      UV_CUBE_AGENT_TARGET_URL=http://localhost:11434
Include all possible agent configs, even if they will be left empty. The same goes for writing TLS/mTLS certs into the cloud image and for a mechanism for custom models: allow all of these to be configurable.
    # Pull default model
    echo "Pulling tinyllama model..."
    /usr/local/bin/ollama pull tinyllama:1.1b
Same as in hal: pull the base models supported in Cube.
hal/ubuntu/cube-agent-config.yml
Outdated
    # Install Ollama
    - echo "Installing Ollama..."
    - curl -fsSL https://ollama.com/install.sh | sh
pin to a specific version
Conditionally support vLLM, since we support both Ollama and vLLM, or create another cloud-init script if needed.
Re "pin to a specific version": for the Ollama download there is no specific version to pin to.
- Add virtio-9p runtime certificate mounting (not baked into image)
- Configure QEMU with certificate sharing via virtio-9p-pci device
- Update TDX guest CID from 6 to 42 to avoid conflicts
- Add kernel support for VIRTIO_BLK and VIRTIO_NET
- Enable systemd-networkd and systemd-resolved for automatic network config
- Add buildroot overlay configuration for network and systemd services
- Implement proven cocos approach with fstab-based 9p mounting
- Add BR2_PACKAGE_9PFS for 9p filesystem support in userspace
- Create post-build script for automatic mount point creation
- Fix buildroot overlay path to include /linux/ directory

This enables automatic network configuration and runtime certificate mounting via virtio-9p for secure mTLS communication without baking certificates into the VM image.

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
remove legacy config
update network config
update agent url
test deployment (repeated)
Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
- Update SMQ_RELEASE_TAG from latest to v0.18.3 across all compose files
- Add fallback version to docker image tags in compose files
- Ensures consistent SuperMQ version deployment

update agent target url
Add cvm service (repeated)
squash
Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
7582770 to 955a7ce
hal/buildroot/README.md
Outdated
    cd cocos-infra/gcp
    cat > terraform.tfvars <<EOF
    cloud_init_config = "/path/to/cube/hal/ubuntu/cube-agent-config.yml"
    # ... other variables
Be specific: a user needs to be able to rely on this README without looking into the code.
hal/ubuntu/qemu.sh
Outdated
    - chown -R ollama:ollama /home/ollama
    - curl -fsSL https://ollama.com/install.sh | sh
    - git clone https://github.com/ultravioletrs/cube.git /tmp/cube
    - cd /tmp/cube && git fetch origin pull/88/head:pr-88 && git checkout pr-88
fix before merge
hal/buildroot/README.md
Outdated
    --allow=tcp:7001 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=cube-ai-cvm-01 \
    --project=valued-base-354714
This is a specific project and should not be exposed.
hal/buildroot/README.md
Outdated
    gcloud compute firewall-rules create allow-cube-agent-7001 \
    --allow=tcp:7001 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=cube-ai-cvm-01 \
Are there instructions for setting this tag in the Terraform scripts?
hal/ubuntu/cube-agent-config.yml
Outdated
    sed -i "s|__VLLM_MODEL__|$VLLM_MODEL|g" /etc/systemd/system/vllm.service
    echo "Installing vLLM..."
    pip3 install vllm
Pin to a specific vLLM version. There was also a similar comment for Ollama, which was not resolved.
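A hedged sketch of how both installs might be pinned. To the best of my knowledge the Ollama install script reads an OLLAMA_VERSION environment variable, but that should be checked against the current Ollama Linux install docs. The version numbers are placeholders, not recommendations, and the function names are hypothetical.

```shell
# Sketch: keep pinned versions in one place. The numbers are placeholders only.
OLLAMA_VERSION="${OLLAMA_VERSION:-0.5.7}"
VLLM_VERSION="${VLLM_VERSION:-0.6.3}"

# Build the pip requirement string for the pinned vLLM release.
vllm_requirement() {
    printf 'vllm==%s\n' "$VLLM_VERSION"
}

install_ollama() {
    # Assumption: install.sh honours OLLAMA_VERSION passed in its environment.
    curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="$OLLAMA_VERSION" sh
}

install_vllm() {
    pip3 install "$(vllm_requirement)"
}
```

Exposing both versions as variables also lets the cloud-init template override them per deployment without editing the install commands.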
test
update deployment workflow
Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
ae52cb6 to b798aaa
docker/.env
Outdated
    UV_CUBE_AGENT_CLIENT_CERT=/etc/cube/certs/client.crt
    UV_CUBE_AGENT_CLIENT_KEY=/etc/cube/certs/client.key
    UV_CUBE_AGENT_CLIENT_CA=/etc/cube/certs/ca.pem
These should be templated or overridden; the files are not tracked in the repo or mounted with Docker.
docker/.env
Outdated
    CUBE_AI_ATTESTATION_URL=http://cube-proxy:8900
    CUBE_AUDIT_LOGS_URL=http://opensearch:9200
Audit logs should be fetched from the proxy, same as attestation, so this only needs one env variable.
    CUBE_AI_ATTESTATION_URL=http://cube-proxy:${UV_CUBE_PROXY_PORT}
    CUBE_AUDIT_LOGS_URL=http://cube-proxy:${UV_CUBE_PROXY_PORT}
These should all be a single variable; open an issue on the UI repo.
What type of PR is this?
This is a feature
What does this do?
Which issue(s) does this PR fix/relate to?
Have you included tests for your changes?
Did you document any new/modified features?
Notes