
CI/CD: Add incremental test coverage for Plugin Mode (vLLM/SGLang) #255

@sunway513

Description

Context

PR #126 introduces a new plugin mode that enables ATOM to work as an out-of-tree (OOT) plugin for vLLM and SGLang. This is a significant architectural addition (~2,300 lines of new plugin code), but currently has zero automated CI coverage for plugin mode.

The existing CI (atom-test.yaml) only tests ATOM in server mode. We need incremental test coverage to make plugin mode sustainable.

Proposed CI Enhancement — 3 Tiers

Tier 1: CPU-Only Unit Tests (P0)

Trigger: Every PR to main
Runner: ubuntu-latest (no GPU needed)
Purpose: Validate plugin registration, config generation, and wiring logic without GPU

Test files and coverage targets:

  • tests/test_plugin_prepare.py: is_vllm(), is_sglang(), is_plugin_mode(), _set_framework_backbone(), and the invalid-framework error path
  • tests/test_plugin_config.py: PluginConfig dataclass init; _generate_atom_config_from_vllm_config() with a mock vLLM config
  • tests/test_plugin_vllm_register.py: register_model() skips when disabled; register_platform() returns the correct path; set_attn_cls() switches the Attention class correctly
  • tests/test_plugin_vllm_platform.py: ATOMPlatform is None when disabled; inherits RocmPlatform when enabled
  • tests/conftest.py (update): add stubs for vllm/sglang mock imports

Estimated: ~280 lines
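The conftest.py stubbing mentioned above can be sketched as follows. This is a minimal illustration of the technique, not ATOM's actual conftest: the helper name install_framework_stubs() and the placeholder VllmConfig attribute are assumptions, chosen so the Tier 1 tests can import plugin modules on a CPU-only runner with no vLLM/SGLang installed.

```python
import sys
import types

def install_framework_stubs() -> None:
    """Register minimal stand-in modules so plugin code imports cleanly
    on a CPU-only runner where vLLM/SGLang are not installed.
    (Illustrative sketch; names are not ATOM's actual API.)"""
    vllm_stub = types.ModuleType("vllm")
    config_stub = types.ModuleType("vllm.config")
    # Placeholder class so config-generation tests can build mock configs.
    config_stub.VllmConfig = type("VllmConfig", (), {})
    vllm_stub.config = config_stub
    # setdefault: keep the real packages if they happen to be installed.
    sys.modules.setdefault("vllm", vllm_stub)
    sys.modules.setdefault("vllm.config", config_stub)
    sys.modules.setdefault("sglang", types.ModuleType("sglang"))

install_framework_stubs()
from vllm.config import VllmConfig  # resolves to the stub when vLLM is absent

print(type(VllmConfig()).__name__)  # VllmConfig
```

In pytest, the install call would live at the top of tests/conftest.py so the stubs are registered before any test module imports the plugin code.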

Tier 2: GPU Integration Test (P0)

Trigger: Every PR to main (non-draft)
Runner: atom-mi355-1gpu or atom-mi355-8gpu (reuse existing runners)
Purpose: Smoke test that vLLM + ATOM plugin can start and produce valid inference output

Components:

  • .github/workflows/atom-plugin-test.yaml: new workflow — install vLLM + ATOM → launch vllm serve with the ATOM plugin → run a simple inference request → validate non-empty output
  • .github/scripts/atom_plugin_test.sh: launch/inference/accuracy script using vllm serve instead of python -m atom.entrypoints.openai_server

Estimated: ~200 lines
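The "validate non-empty output" step above can be sketched like this, assuming the workflow queries the standard OpenAI-compatible /v1/completions endpoint that vllm serve exposes. The base URL, model name, and function names are illustrative placeholders, not the actual script contents.

```python
import json
import urllib.request

def query_completions(base_url: str, model: str, prompt: str) -> dict:
    """POST one completion request to a running vllm serve instance."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": 16}
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

def validate_completion(response: dict) -> bool:
    """Smoke check: at least one choice with non-empty generated text."""
    choices = response.get("choices", [])
    return bool(choices) and bool(choices[0].get("text", "").strip())

# Offline demonstration against a canned response shape (no server needed):
sample = {"choices": [{"text": " Paris is the capital of France."}]}
print(validate_completion(sample))  # True
```

In the real workflow, query_completions would target the local server started earlier in the job, and a False result from validate_completion would fail the CI step.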

Tier 3: Nightly E2E (P1 — Follow-up)

Trigger: Nightly schedule + manual dispatch
Runner: atom-mi355-8gpu
Purpose: Full accuracy validation and SGLang coverage

Components:

  • Accuracy validation: Qwen3-235B-FP8 under vLLM plugin mode → gsm8k accuracy test against a golden baseline
  • SGLang smoke test: placeholder test for the SGLang plugin path (activate when SGLang support is complete)
  • Multi-TP configs: test TP=8 + EP configurations
  • Performance regression: compare plugin-mode throughput against the ATOM server-mode baseline

Estimated: ~210 lines
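The golden-baseline gate for the nightly accuracy run could take a shape like the following. The baseline value and tolerance are made-up illustrations, not ATOM's actual thresholds; the one-sided check reflects that only regressions (not improvements) should fail the job.

```python
def accuracy_within_baseline(measured: float, golden: float,
                             abs_tol: float = 0.02) -> bool:
    """One-sided gate: fail only when measured accuracy drops more than
    abs_tol below the stored golden baseline; exceeding it is fine.
    (Sketch; threshold values are hypothetical.)"""
    return measured >= golden - abs_tol

golden_gsm8k = 0.93  # hypothetical stored baseline for Qwen3-235B-FP8

print(accuracy_within_baseline(0.925, golden_gsm8k))  # True  (within tolerance)
print(accuracy_within_baseline(0.890, golden_gsm8k))  # False (regression)
```

The golden value would typically live in a checked-in baseline file so accuracy drift shows up as a reviewable diff rather than a silent constant change.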

Scope Clarification

  • AITER / MIOpen do NOT need CI changes for this — ATOM is a pure consumer of their APIs. If AITER breaks an API, ATOM's existing CI (which builds AITER from source) will catch it.
  • SGLang tests should start as placeholders and be activated when SGLang plugin support is complete.

