
CI/CD: Add incremental test coverage for Plugin Mode (vLLM/SGLang) #255

@sunway513

Description

Context

PR #126 introduces a new plugin mode that enables ATOM to work as an out-of-tree (OOT) plugin for vLLM and SGLang. This is a significant architectural addition (~2,300 lines of new plugin code), but currently has zero automated CI coverage for plugin mode.

The existing CI (atom-test.yaml) only tests ATOM in server mode. We need incremental test coverage to make plugin mode sustainable.

Proposed CI Enhancement — 3 Tiers

Tier 1: CPU-Only Unit Tests (P0)

Trigger: Every PR to main
Runner: ubuntu-latest (no GPU needed)
Purpose: Validate plugin registration, config generation, and wiring logic without GPU

Test files and coverage targets:

  • tests/test_plugin_prepare.py: is_vllm(), is_sglang(), is_plugin_mode(), _set_framework_backbone(), and the invalid-framework error path
  • tests/test_plugin_config.py: PluginConfig dataclass init; _generate_atom_config_from_vllm_config() with a mock vLLM config
  • tests/test_plugin_vllm_register.py: register_model() skips when disabled; register_platform() returns the correct path; set_attn_cls() switches the Attention class correctly
  • tests/test_plugin_vllm_platform.py: ATOMPlatform is None when disabled; inherits RocmPlatform when enabled
  • tests/conftest.py (update): add stubs for vllm/sglang mock imports

Estimated: ~280 lines
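The conftest.py stubbing mentioned above can be sketched as follows. This is a minimal illustration of the technique, not ATOM's actual conftest: the helper name install_framework_stubs() and the placeholder VllmConfig attribute are assumptions, chosen so the Tier 1 tests can import plugin modules on a CPU-only runner with no vLLM/SGLang installed.

```python
import sys
import types

def install_framework_stubs() -> None:
    """Register minimal stand-in modules so plugin code imports cleanly
    on a CPU-only runner where vLLM/SGLang are not installed.
    (Illustrative sketch; names are not ATOM's actual API.)"""
    vllm_stub = types.ModuleType("vllm")
    config_stub = types.ModuleType("vllm.config")
    # Placeholder class so config-generation tests can build mock configs.
    config_stub.VllmConfig = type("VllmConfig", (), {})
    vllm_stub.config = config_stub
    # setdefault: keep the real packages if they happen to be installed.
    sys.modules.setdefault("vllm", vllm_stub)
    sys.modules.setdefault("vllm.config", config_stub)
    sys.modules.setdefault("sglang", types.ModuleType("sglang"))

install_framework_stubs()
from vllm.config import VllmConfig  # resolves to the stub when vLLM is absent

print(type(VllmConfig()).__name__)  # VllmConfig
```

In pytest, the install call would live at the top of tests/conftest.py so the stubs are registered before any test module imports the plugin code.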

Tier 2: GPU Integration Test (P0)

Trigger: Every PR to main (non-draft)
Runner: atom-mi355-1gpu or atom-mi355-8gpu (reuse existing runners)
Purpose: Smoke test that vLLM + ATOM plugin can start and produce valid inference output

Components:

  • .github/workflows/atom-plugin-test.yaml: new workflow — install vLLM + ATOM → launch vllm serve with the ATOM plugin → run a simple inference request → validate non-empty output
  • .github/scripts/atom_plugin_test.sh: launch/inference/accuracy script using vllm serve instead of python -m atom.entrypoints.openai_server

Estimated: ~200 lines
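The "validate non-empty output" step above can be sketched like this, assuming the workflow queries the standard OpenAI-compatible /v1/completions endpoint that vllm serve exposes. The base URL, model name, and function names are illustrative placeholders, not the actual script contents.

```python
import json
import urllib.request

def query_completions(base_url: str, model: str, prompt: str) -> dict:
    """POST one completion request to a running vllm serve instance."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": 16}
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

def validate_completion(response: dict) -> bool:
    """Smoke check: at least one choice with non-empty generated text."""
    choices = response.get("choices", [])
    return bool(choices) and bool(choices[0].get("text", "").strip())

# Offline demonstration against a canned response shape (no server needed):
sample = {"choices": [{"text": " Paris is the capital of France."}]}
print(validate_completion(sample))  # True
```

In the real workflow, query_completions would target the local server started earlier in the job, and a False result from validate_completion would fail the CI step.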

Tier 3: Nightly E2E (P1 — Follow-up)

Trigger: Nightly schedule + manual dispatch
Runner: atom-mi355-8gpu
Purpose: Full accuracy validation and SGLang coverage

Components:

  • Accuracy validation: Qwen3-235B-FP8 under vLLM plugin mode → gsm8k accuracy test against a golden baseline
  • SGLang smoke test: placeholder test for the SGLang plugin path (activate when SGLang support is complete)
  • Multi-TP configs: test TP=8 + EP configurations
  • Performance regression: compare plugin-mode throughput against the ATOM server-mode baseline

Estimated: ~210 lines
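The golden-baseline gate for the nightly accuracy run could take a shape like the following. The baseline value and tolerance are made-up illustrations, not ATOM's actual thresholds; the one-sided check reflects that only regressions (not improvements) should fail the job.

```python
def accuracy_within_baseline(measured: float, golden: float,
                             abs_tol: float = 0.02) -> bool:
    """One-sided gate: fail only when measured accuracy drops more than
    abs_tol below the stored golden baseline; exceeding it is fine.
    (Sketch; threshold values are hypothetical.)"""
    return measured >= golden - abs_tol

golden_gsm8k = 0.93  # hypothetical stored baseline for Qwen3-235B-FP8

print(accuracy_within_baseline(0.925, golden_gsm8k))  # True  (within tolerance)
print(accuracy_within_baseline(0.890, golden_gsm8k))  # False (regression)
```

The golden value would typically live in a checked-in baseline file so accuracy drift shows up as a reviewable diff rather than a silent constant change.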

Scope Clarification

  • AITER / MIOpen do NOT need CI changes for this — ATOM is a pure consumer of their APIs. If AITER breaks an API, ATOM's existing CI (which builds AITER from source) will catch it.
  • SGLang tests should start as placeholders and be activated when SGLang plugin support is complete.

