-
Notifications
You must be signed in to change notification settings - Fork 15
feat: Add first docker-compose deployment for stability #92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Docker Compose configuration for stable SGLang deployment, introducing a router-based architecture for improved reliability and consistent serving infrastructure.
- Added Docker Compose configurations for SGLang server and router components
- Updated environment variable naming and cache configurations for consistency
- Enhanced documentation with Docker Compose usage instructions and build processes
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/USAGE.sglang.md | Updated SGLang commit hash and corrected environment variable names |
| docker-compose/sglang-server.yaml | New Docker Compose service definition for SGLang server with health checks |
| docker-compose/sglang-router.yaml | New Docker Compose service definition for SGLang router with load balancing |
| README.md | Added Docker Compose usage documentation and updated build instructions |
| .env.example | Updated environment variable names and added new cache directory configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| - --prometheus-port | ||
| - "10091" | ||
| ports: | ||
| - 10090:10090 |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ports mapping is incomplete. The prometheus port 10091 is configured in the command but not exposed in the ports section, which will prevent external access to monitoring metrics.
README.md
Outdated
| docker build -t deepauto/hip-attention:latest -t deepauto/hip-attention:latest-sglang -t deepauto/hip-attention:${tag_git_short} -t deepauto/hip-attention:${tag_hip_attention}-sglang -f Dockerfile.sglang . | ||
|
|
||
| # Publish sglang server image | ||
| docker push deepauto/hip-attention:latest | ||
| docker push deepauto/hip-attention:latest-sglang | ||
| docker push deepauto/hip-attention:$(git rev-parse --short HEAD)-sglang | ||
| docker push deepauto/hip-attention:v$(uv run python -c 'import importlib.metadata; print(importlib.metadata.version("hip-attn"))')-sglang | ||
| docker push deepauto/hip-attention:${tag_git_short} | ||
| docker push deepauto/hip-attention:${tag_hip_attention}-sglang |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tag format is inconsistent. The variable ${tag_hip_attention} already includes the -sglang suffix, but it's being appended again with -sglang, resulting in a double suffix like v1.2.9-sglang-sglang.
README.md
Outdated
| docker build -t deepauto/hip-attention:latest -t deepauto/hip-attention:latest-sglang -t deepauto/hip-attention:${tag_git_short} -t deepauto/hip-attention:${tag_hip_attention}-sglang -f Dockerfile.sglang . | ||
|
|
||
| # Publish sglang server image | ||
| docker push deepauto/hip-attention:latest | ||
| docker push deepauto/hip-attention:latest-sglang | ||
| docker push deepauto/hip-attention:$(git rev-parse --short HEAD)-sglang | ||
| docker push deepauto/hip-attention:v$(uv run python -c 'import importlib.metadata; print(importlib.metadata.version("hip-attn"))')-sglang | ||
| docker push deepauto/hip-attention:${tag_git_short} | ||
| docker push deepauto/hip-attention:${tag_hip_attention}-sglang |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same tag format issue as above. The variable ${tag_hip_attention} already contains -sglang, so appending -sglang again creates an incorrect tag format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| source: sglang-cache | ||
| target: /root/.cache | ||
| - type: bind | ||
| source: ${HF_HOME:?error} |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message 'error' is too generic and unhelpful. Consider using a more descriptive message like 'HF_HOME environment variable is required but not set'.
| source: ${HF_HOME:?error} | |
| source: ${HF_HOME:?HF_HOME environment variable is required but not set} |
| - --prometheus-port | ||
| - "10091" | ||
| ports: | ||
| - 10090:10090 |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ports configuration is incomplete. The router is configured to listen on port 10090 and also expose Prometheus metrics on port 10091, but only port 10090 is exposed. Add port mapping for 10091:10091 to make Prometheus metrics accessible.
| - http://h100-80-1:10080 | ||
| # - http://h100-80-2:10080 |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded hostname 'h100-80-1' makes this configuration environment-specific and less portable. Consider using environment variables or Docker service names for worker URLs.
| - http://h100-80-1:10080 | |
| # - http://h100-80-2:10080 | |
| - ${WORKER_URLS:-http://sglang-worker:10080} | |
| # - ${WORKER_URLS_2:-http://sglang-worker-2:10080} |
README.md
Outdated
|
|
||
| ### Docker Compose | ||
|
|
||
| Docker compose examples are available in [`docker-compose`](#docker-compose) folder. |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The markdown link reference '#docker-compose' appears to be incorrect as there's no corresponding anchor or section with that exact ID in the document.
| Docker compose examples are available in [`docker-compose`](#docker-compose) folder. | |
| Docker compose examples are available in [`docker-compose`](docker-compose/) folder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| - http://h100-80-1:10080 | ||
| # - http://h100-80-2:10080 |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hardcoded hostname 'h100-80-1' should be documented or made configurable via environment variables for better portability across different deployment environments.
| - http://h100-80-1:10080 | |
| # - http://h100-80-2:10080 | |
| - http://${WORKER_HOSTNAME:-h100-80-1}:10080 # Set WORKER_HOSTNAME env var to override | |
| # - http://${WORKER_HOSTNAME_2:-h100-80-2}:10080 # Example for a second worker |
| source: sglang-cache | ||
| target: /root/.cache | ||
| - type: bind | ||
| source: ${HF_HOME:?error} |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message 'error' is not descriptive. Consider using a more helpful message like 'HF_HOME environment variable is required'.
| source: ${HF_HOME:?error} | |
| source: ${HF_HOME:?HF_HOME environment variable is required} |
| ports: | ||
| # Router port | ||
| - 10090:10090 | ||
| # Monitoring port | ||
| - 10091:10091 |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Port mappings are redundant when using 'network_mode: host'. The host network mode makes all container ports directly accessible on the host, so explicit port mappings are unnecessary and could cause confusion.
| ports: | |
| # Router port | |
| - 10090:10090 | |
| # Monitoring port | |
| - 10091:10091 |
gmlwns2000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Add docker-compose deployment.