User Story:
As a developer, I want to profile and determine the optimal number of model copies that can be loaded into Triton before hitting diminishing returns on performance.
Acceptance Criteria:
Perform load testing on Triton with varying numbers of model copies on server
Document performance metrics like latency and throughput.
Definition of Done:
Optimal number of model copies determined.
Profiling metrics documented.