Evaluation: Check Evaluation Quality of Life Metrics

Stress test the text evals by monitoring:
1. Maximum CSV size (in line items) that a batch can process.
2. P95 batch processing time for CSV files of various line item counts, e.g. 20, 50, 100, 500, 1000, etc.
3. Indicative average (ballpark) cost of each batch run.
4. Server-side errors, if any, including but not limited to:
   4.1 OpenAI rate limiting
   4.2 Langfuse rate limiting
   4.3 Kaapi proxy server rate limiting, etc.
   4.4 Timeout/server unavailable errors (504, 529, 429 status codes)
5. The service at which processing bottlenecks, e.g. Langfuse traces, OpenAI embeddings, or the Kaapi backend.
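
As a rough illustration of items 2 and 4, the sketch below shows one way the collected measurements could be summarized: a nearest-rank P95 per line item count, plus a tally of error status codes. The record keys (`line_items`, `seconds`, `status`) and the function names are hypothetical and not taken from the Kaapi codebase, and nearest-rank is only one of several common percentile definitions.

```python
from collections import Counter


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, giving a 1-based rank without float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(runs):
    """Summarize stress-test runs.

    Each run is a dict with hypothetical keys:
      line_items: number of rows in the CSV
      seconds:    wall-clock batch processing time
      status:     final HTTP status code of the batch request
    Returns (P95 seconds per line item count, Counter of error status codes).
    """
    times_by_size = {}
    errors = Counter()
    for run in runs:
        if run["status"] >= 400:
            # Failed runs are counted by status code and kept out of the timings.
            errors[run["status"]] += 1
        else:
            times_by_size.setdefault(run["line_items"], []).append(run["seconds"])
    p95_by_size = {size: p95(times) for size, times in sorted(times_by_size.items())}
    return p95_by_size, errors
```

Failed runs are excluded from the timing figures here so that fast failures do not pull the P95 down; whether that is the right choice depends on how the results will be reported.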


Evaluation: Check Evaluation Quality of Life Metrics #41
