Determine optimal number of model copies for Triton #15

@EndlessReform

User Story:

As a developer, I want to profile and determine the optimal number of model copies that can be loaded into Triton before hitting diminishing returns on performance.

Acceptance Criteria:

Perform load testing on the Triton server with varying numbers of model copies.
Document performance metrics like latency and throughput.
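One way the metrics above could be recorded per run is sketched below. It assumes the load-testing tool (for example Triton's `perf_analyzer`, or a custom client) yields a list of per-request latencies plus the total wall-clock time of the run. The function name `summarize_run` and the output keys are hypothetical, chosen only for illustration.

```python
def summarize_run(latencies, wall_time):
    """Summarize one load-test run (illustrative helper, not part of Triton).

    latencies: per-request latencies, all in the same time unit.
    wall_time: total wall-clock duration of the run, in that same unit.
    """
    if not latencies or wall_time <= 0:
        raise ValueError("need at least one latency and a positive wall time")
    ordered = sorted(latencies)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile with integer arithmetic:
        # rank = ceil(p * n / 100), clamped to at least 1.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "throughput": n / wall_time,  # completed requests per time unit
        "p50": percentile(50),
        "p95": percentile(95),
        "p99": percentile(99),
    }
```

Running this once per tested copy count gives one comparable row of throughput and tail latency for each configuration.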

Definition of Done:

Optimal number of model copies determined.
Profiling metrics documented.
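"Optimal" could be made concrete with a diminishing-returns rule, sketched below: keep adding copies while each extra copy still raises throughput by at least some relative margin. The 10% default threshold and the name `pick_instance_count` are assumptions for illustration, not something this issue specifies. The copy count itself is usually set through the `count` field of `instance_group` in the model's `config.pbtxt`.

```python
def pick_instance_count(throughput_by_count, min_gain=0.10):
    """Pick a copy count using a simple diminishing-returns rule.

    throughput_by_count: dict mapping number of model copies -> measured throughput.
    min_gain: minimum relative throughput gain (assumed threshold) that
              justifies moving to the next tested copy count.
    """
    counts = sorted(throughput_by_count)
    if not counts:
        raise ValueError("no measurements provided")
    best = counts[0]
    for prev, cur in zip(counts, counts[1:]):
        prev_tp = throughput_by_count[prev]
        if prev_tp <= 0:
            raise ValueError("throughput must be positive")
        gain = (throughput_by_count[cur] - prev_tp) / prev_tp
        if gain < min_gain:
            break  # the extra copies no longer pay off; stop here
        best = cur
    return best
```

This rule only looks at throughput. If latency or GPU memory is the binding constraint, the chosen count may need to be lower than what the function returns.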
