[Feat] Support Additional Pre-deployed Inference Stacks #38

@wangchen615

Description

Enhance fmperf to support various pre-deployed distributed inference stacks, enabling users to benchmark different production-grade deployment solutions.

Target Stacks

  • Dynamo
  • AIBrix
  • vLLM Production Stack
  • LLM-D
  • Other distributed inference solutions

Implementation Requirements

  1. Stack Configuration
    • Define stack specifications for each deployment type
    • Support for distributed deployment configurations
    • Handle different service discovery mechanisms

  2. Deployment Management
    • Integration with existing deployment orchestration
    • Support for multi-node deployments
    • Handle different scaling configurations

  3. Monitoring & Metrics
    • Handle logging of existing stacks
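
To make the stack configuration requirement more concrete, here is a minimal sketch of what a specification for an already-running stack might look like. Everything in it is hypothetical and not part of the current fmperf API: the names `PreDeployedStackSpec` and `Discovery`, the two discovery modes, and the default port are assumptions chosen for illustration. The Kubernetes branch also assumes the default `cluster.local` cluster domain.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Discovery(Enum):
    """Hypothetical ways a benchmark client could locate a running stack."""
    STATIC_URL = "static_url"    # the user supplies the endpoint directly
    K8S_SERVICE = "k8s_service"  # resolve an in-cluster Service DNS name


@dataclass
class PreDeployedStackSpec:
    """Hypothetical description of a stack that is already deployed.

    Unlike a spec that creates a deployment, this one only records how to
    reach an existing inference endpoint.
    """
    name: str                      # e.g. "llm-d" or "vllm-production-stack"
    discovery: Discovery
    url: Optional[str] = None      # used when discovery is STATIC_URL
    service: Optional[str] = None  # used when discovery is K8S_SERVICE
    namespace: str = "default"
    port: int = 8000               # assumed default, adjust per stack

    def endpoint(self) -> str:
        """Return the base URL that load-generation requests should target."""
        if self.discovery is Discovery.STATIC_URL:
            if not self.url:
                raise ValueError("STATIC_URL discovery requires 'url'")
            return self.url.rstrip("/")
        if self.discovery is Discovery.K8S_SERVICE:
            if not self.service:
                raise ValueError("K8S_SERVICE discovery requires 'service'")
            # Standard Service DNS form, assuming the default cluster domain.
            return (
                f"http://{self.service}.{self.namespace}"
                f".svc.cluster.local:{self.port}"
            )
        raise ValueError(f"unsupported discovery mode: {self.discovery}")


if __name__ == "__main__":
    spec = PreDeployedStackSpec(
        name="llm-d",
        discovery=Discovery.K8S_SERVICE,
        service="inference-gateway",  # hypothetical Service name
        namespace="bench",
        port=8080,
    )
    print(spec.endpoint())
```

Keeping discovery as an explicit field would let each target stack declare how its endpoint is found without the benchmark code branching on the stack name. Whether this belongs in fmperf/StackSpec.py or in a separate module is left open here.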

Expected Benefits

  • Enable benchmarking of production-grade distributed deployments
  • Support for more realistic deployment scenarios
  • Better comparison between different deployment solutions

Related Components

  • fmperf/Cluster.py
  • fmperf/StackSpec.py
  • Deployment configuration files
