Merged
2 changes: 1 addition & 1 deletion Makefile
@@ -1,4 +1,4 @@
-.PHONY: help test lint format clean install ci-test all
+.PHONY: help test test-cov lint format format-check type-check clean install ci-test ci-local ci-matrix all pre-commit

# Default target
help:
2 changes: 1 addition & 1 deletion README.md
@@ -132,7 +132,7 @@ pipx install --pip-args="--pre" "vllm-cli[vllm]"

```bash
# Interactive mode - menu-driven interface
-vllm-cl
+vllm-cli
# Serve a model
vllm-cli serve --model openai/gpt-oss-20b

4 changes: 2 additions & 2 deletions docs/profiles.md
@@ -5,7 +5,7 @@ Seven carefully designed profiles cover most common use cases and hardware confi
## General Purpose Profiles

### `standard` - Minimal configuration with smart defaults
-Uses vLLM's defaults configuration. Perfect for most models and hardware setups.
+Uses vLLM's default configuration. Perfect for most models and hardware setups.

**Use Case:** Starting point for any model, general inference tasks
**Configuration:** No additional arguments - uses vLLM defaults
@@ -182,7 +182,7 @@ Common environment variables used in profiles:

| Variable | Purpose | Values |
|----------|---------|---------|
-| `VLLM_ATTENTION_BACKEND` | Attention computation backend | `FLASH_ATTN`, `XFORMERS`, `TRITON` |
+| `VLLM_ATTENTION_BACKEND` | Attention computation backend | `FLASH_ATTN`, `XFORMERS`, `TRITON_ATTN_VLLM_V1` |
| `VLLM_USE_TRITON_FLASH_ATTN` | Enable Triton flash attention | `0`, `1` |
| `VLLM_ENABLE_FUSED_MOE_ACTIVATION_CHUNKING` | MoE activation chunking | `0`, `1` |
| `VLLM_USE_FLASHINFER_MXFP4_BF16_MOE` | BF16 precision for MoE | `0`, `1` |
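As a quick illustration (a sketch, not part of this diff): a profile variable from the table takes effect simply by being present in the server process's environment before vLLM starts; `FLASH_ATTN` below is just one of the listed values.

```python
import os

# Select the attention backend before vLLM is imported or launched.
# FLASH_ATTN is one of the documented values; TRITON_ATTN_VLLM_V1 is the
# updated name for the old TRITON backend in this table.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"
print(os.environ["VLLM_ATTENTION_BACKEND"])
```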
15 changes: 13 additions & 2 deletions scripts/test_ci_locally.sh
@@ -51,8 +51,10 @@ fi
run_test "Unit Tests" "pytest tests/ -v --tb=short"

# 4. Check Test Coverage (optional but informative)
-if command -v pytest-cov &> /dev/null; then
+if python -c "import pytest_cov" 2>/dev/null; then
    run_test "Test Coverage" "pytest tests/ --cov=src/vllm_cli --cov-report=term-missing --cov-fail-under=50"
+else
+    echo -e "${YELLOW}⚠️ Skipping coverage (pytest-cov not installed)${NC}\n"
fi

# 5. Linting with flake8
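A note on the hunk above (a sketch, not part of the PR): `command -v` only finds executables, and pytest-cov ships as a pytest plugin with no binary of its own, which is why probing importability is the reliable check. The helper name below is illustrative:

```python
import importlib.util

def plugin_installed(module_name: str) -> bool:
    """Return True if `module_name` is importable. pytest plugins like
    pytest-cov install no standalone executable, so `command -v` cannot
    find them; checking importability is the dependable probe."""
    return importlib.util.find_spec(module_name) is not None

# Note the module is `pytest_cov` (underscore), not `pytest-cov`.
print(plugin_installed("pytest_cov"))
```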
@@ -90,7 +92,16 @@ run_test "CLI Help Test" "python -m vllm_cli --help > /dev/null"

# 12. Validate pyproject.toml
if [ -f "pyproject.toml" ]; then
-run_test "Validate pyproject.toml" "python -c 'import toml; toml.load(\"pyproject.toml\"); print(\"pyproject.toml is valid\")'"
+run_test "Validate pyproject.toml" "python -c '
+try:
+    import tomllib
+    with open(\"pyproject.toml\", \"rb\") as f:
+        tomllib.load(f)
+except ImportError:
+    import toml
+    toml.load(\"pyproject.toml\")
+print(\"pyproject.toml is valid\")
+'"
fi

# Summary
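The inline fallback in the script above can also be kept as a small standalone helper — a sketch assuming the same availability rules (tomllib is in the stdlib from Python 3.11; `toml` is the third-party package); the function name is illustrative:

```python
def load_pyproject(path: str) -> dict:
    """Parse a TOML file, preferring the stdlib tomllib (Python 3.11+)
    and falling back to the third-party `toml` package on older
    interpreters. Raises on invalid TOML, mirroring the CI check."""
    try:
        import tomllib  # stdlib since 3.11; requires a binary file object
        with open(path, "rb") as f:
            return tomllib.load(f)
    except ImportError:
        import toml  # third-party fallback for Python < 3.11
        return toml.load(path)
```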