Skip to content

Conversation

@kbumsik
Copy link
Contributor

@kbumsik kbumsik commented Oct 10, 2025

Add docker-compose deployment.

  • Use stable and consistent Docker image.
  • Add automatic health-check & restart.
  • Add sglang-router for stable serving.
    • This allows individual LLM servers to restart without interruption.

@kbumsik kbumsik requested a review from gmlwns2000 October 10, 2025 01:34
@kbumsik kbumsik self-assigned this Oct 10, 2025
Copilot AI review requested due to automatic review settings October 10, 2025 01:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds Docker Compose configuration for stable SGLang deployment, introducing a router-based architecture for improved reliability and consistent serving infrastructure.

  • Added Docker Compose configurations for SGLang server and router components
  • Updated environment variable naming and cache configurations for consistency
  • Enhanced documentation with Docker Compose usage instructions and build processes

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
docs/USAGE.sglang.md Updated SGLang commit hash and corrected environment variable names
docker-compose/sglang-server.yaml New Docker Compose service definition for SGLang server with health checks
docker-compose/sglang-router.yaml New Docker Compose service definition for SGLang router with load balancing
README.md Added Docker Compose usage documentation and updated build instructions
.env.example Updated environment variable names and added new cache directory configuration

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

- --prometheus-port
- "10091"
ports:
- 10090:10090
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ports mapping is incomplete. The prometheus port 10091 is configured in the command but not exposed in the ports section, which will prevent external access to monitoring metrics.

Copilot uses AI. Check for mistakes.
README.md Outdated
Comment on lines 229 to 235
docker build -t deepauto/hip-attention:latest -t deepauto/hip-attention:latest-sglang -t deepauto/hip-attention:${tag_git_short} -t deepauto/hip-attention:${tag_hip_attention}-sglang -f Dockerfile.sglang .

# Publish sglang server image
docker push deepauto/hip-attention:latest
docker push deepauto/hip-attention:latest-sglang
docker push deepauto/hip-attention:$(git rev-parse --short HEAD)-sglang
docker push deepauto/hip-attention:v$(uv run python -c 'import importlib.metadata; print(importlib.metadata.version("hip-attn"))')-sglang
docker push deepauto/hip-attention:${tag_git_short}
docker push deepauto/hip-attention:${tag_hip_attention}-sglang
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tag format is inconsistent. The variable ${tag_hip_attention} already includes the -sglang suffix, but it's being appended again with -sglang, resulting in a double suffix like v1.2.9-sglang-sglang.

Copilot uses AI. Check for mistakes.
README.md Outdated
Comment on lines 229 to 235
docker build -t deepauto/hip-attention:latest -t deepauto/hip-attention:latest-sglang -t deepauto/hip-attention:${tag_git_short} -t deepauto/hip-attention:${tag_hip_attention}-sglang -f Dockerfile.sglang .

# Publish sglang server image
docker push deepauto/hip-attention:latest
docker push deepauto/hip-attention:latest-sglang
docker push deepauto/hip-attention:$(git rev-parse --short HEAD)-sglang
docker push deepauto/hip-attention:v$(uv run python -c 'import importlib.metadata; print(importlib.metadata.version("hip-attn"))')-sglang
docker push deepauto/hip-attention:${tag_git_short}
docker push deepauto/hip-attention:${tag_hip_attention}-sglang
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same tag format issue as above. The variable ${tag_hip_attention} already contains -sglang, so appending -sglang again creates an incorrect tag format.

Copilot uses AI. Check for mistakes.
@kbumsik kbumsik requested a review from Copilot October 10, 2025 04:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

source: sglang-cache
target: /root/.cache
- type: bind
source: ${HF_HOME:?error}
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message 'error' is too generic and unhelpful. Consider using a more descriptive message like 'HF_HOME environment variable is required but not set'.

Suggested change
source: ${HF_HOME:?error}
source: ${HF_HOME:?HF_HOME environment variable is required but not set}

Copilot uses AI. Check for mistakes.
- --prometheus-port
- "10091"
ports:
- 10090:10090
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ports configuration is incomplete. The router is configured to listen on port 10090 and also expose Prometheus metrics on port 10091, but only port 10090 is exposed. Add port mapping for 10091:10091 to make Prometheus metrics accessible.

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +15
- http://h100-80-1:10080
# - http://h100-80-2:10080
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded hostname 'h100-80-1' makes this configuration environment-specific and less portable. Consider using environment variables or Docker service names for worker URLs.

Suggested change
- http://h100-80-1:10080
# - http://h100-80-2:10080
- ${WORKER_URLS:-http://sglang-worker:10080}
# - ${WORKER_URLS_2:-http://sglang-worker-2:10080}

Copilot uses AI. Check for mistakes.
README.md Outdated

### Docker Compose

Docker compose examples are available in [`docker-compose`](#docker-compose) folder.
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markdown link reference '#docker-compose' appears to be incorrect as there's no corresponding anchor or section with that exact ID in the document.

Suggested change
Docker compose examples are available in [`docker-compose`](#docker-compose) folder.
Docker compose examples are available in [`docker-compose`](docker-compose/) folder.

Copilot uses AI. Check for mistakes.
@kbumsik kbumsik requested a review from Copilot October 10, 2025 04:31
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +14 to +15
- http://h100-80-1:10080
# - http://h100-80-2:10080
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded hostname 'h100-80-1' should be documented or made configurable via environment variables for better portability across different deployment environments.

Suggested change
- http://h100-80-1:10080
# - http://h100-80-2:10080
- http://${WORKER_HOSTNAME:-h100-80-1}:10080 # Set WORKER_HOSTNAME env var to override
# - http://${WORKER_HOSTNAME_2:-h100-80-2}:10080 # Example for a second worker

Copilot uses AI. Check for mistakes.
source: sglang-cache
target: /root/.cache
- type: bind
source: ${HF_HOME:?error}
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message 'error' is not descriptive. Consider using a more helpful message like 'HF_HOME environment variable is required'.

Suggested change
source: ${HF_HOME:?error}
source: ${HF_HOME:?HF_HOME environment variable is required}

Copilot uses AI. Check for mistakes.
Comment on lines +39 to +43
ports:
# Router port
- 10090:10090
# Monitoring port
- 10091:10091
Copy link

Copilot AI Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Port mappings are redundant when using 'network_mode: host'. The host network mode makes all container ports directly accessible on the host, so explicit port mappings are unnecessary and could cause confusion.

Suggested change
ports:
# Router port
- 10090:10090
# Monitoring port
- 10091:10091

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@gmlwns2000 gmlwns2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kbumsik kbumsik merged commit 5f37a7f into deepauto/dev Oct 12, 2025
1 check failed
@kbumsik kbumsik deleted the feat/docker-compose branch October 12, 2025 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants