
Conversation

@MuenYu (Collaborator) commented Feb 8, 2026

This branch is based on #8; review this after #8 is merged.

MuenYu and others added 10 commits February 6, 2026 15:58
- Introduced `evaluation_mode` and `evaluation_criteria` fields in the
prompts table and related schemas.
- Updated frontend components to support selection and input of
evaluation mode and criteria.
- Modified API endpoints to handle new fields during prompt creation and
import.
- Implemented LLM evaluation logic in the test runner, allowing for
dynamic evaluation based on selected mode.
- Enhanced validation to ensure criteria are provided when using LLM
evaluation mode.
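The new fields and the LLM-mode validation rule described in this commit might look roughly like the following. This is a minimal sketch: only `evaluation_mode` and `evaluation_criteria` are named in the PR; the mode values and the rest of the prompt shape are assumptions.

```typescript
// Sketch of the prompt schema described above. Only `evaluation_mode` and
// `evaluation_criteria` are named in the PR; everything else is assumed.
type EvaluationMode = "exact" | "llm";

interface Prompt {
  id: string;
  content: string;
  evaluation_mode: EvaluationMode;
  evaluation_criteria?: string; // must be set when evaluation_mode is "llm"
}

// Enforce the rule: criteria are required when using LLM evaluation mode.
function validatePrompt(p: Prompt): string[] {
  const errors: string[] = [];
  if (p.evaluation_mode === "llm" && !p.evaluation_criteria?.trim()) {
    errors.push("evaluation_criteria is required when evaluation_mode is 'llm'");
  }
  return errors;
}
```

In practice this check would likely live in a schema refinement shared by the creation and import endpoints, so both paths reject LLM-mode prompts without criteria.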
- Updated the API to accept an optional `evaluationModel` parameter in the test run request.
- Enhanced the frontend to allow users to select an evaluation model when running tests.
- Implemented logic to handle the evaluation model in the test runner, ensuring proper evaluation during LLM evaluation mode.
- Added validation for the evaluation model selection in the test run schema.
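A plausible shape for the test run request after this commit is sketched below. Only the optional `evaluationModel` parameter is named in the PR; the other field names and the fallback behavior are assumptions.

```typescript
// Hypothetical test run request shape; only `evaluationModel` is from the PR.
interface TestRunRequest {
  promptId: string;
  model: string;            // model under test (assumed name)
  evaluationModel?: string; // optional dedicated evaluator for LLM mode
}

// Plausible resolution logic: fall back to the model under test when no
// separate evaluation model was selected in the frontend.
function resolveEvaluationModel(req: TestRunRequest): string {
  return req.evaluationModel ?? req.model;
}
```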
- Implemented a new function to format and display the evaluation model in the test run results.
- Updated the TestResults interface to include an optional evaluationModel property.
- Enhanced the test runner to pass the evaluation model information when running tests.
- Modified the frontend to conditionally render the evaluation model label for completed test runs.
- Added conditional rendering for the evaluation model label in the test run results.
- Removed redundant evaluation model display logic from previous run information.
- Ensured the evaluation model is displayed only when available, improving clarity in the UI.
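The conditional rendering described in these commits reduces to a small pure function. The optional `evaluationModel` property on the results type is from the PR; the status values and the label text are assumptions.

```typescript
// Sketch of the conditional label logic described above.
interface TestResults {
  status: "running" | "completed" | "failed"; // assumed status values
  evaluationModel?: string;                   // optional property from this PR
}

// Render the label only for completed runs that recorded an evaluation model.
function formatEvaluationModel(r: TestResults): string | null {
  if (r.status !== "completed" || !r.evaluationModel) return null;
  return `Evaluated by ${r.evaluationModel}`;
}
```

Returning `null` for running or model-less runs matches the stated goal of showing the label only when it is available.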
- Introduced optional parameters for optimization in the test run API, including `optimizationMaxIterations`, `optimizationThreshold`, and `optimizationModel`.
- Updated the frontend to allow users to configure optimization settings, including UI elements for selecting optimization models and setting thresholds.
- Enhanced the test runner to handle optimization logic, including tracking optimization history and integrating with LLM evaluation.
- Added validation for new optimization parameters in the test run schema.
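The three optimization parameters suggest a bounded refine-and-score loop. The sketch below uses the parameter names from the PR; `score` and `refine` are stand-ins for the LLM evaluation and prompt-rewriting calls, and the loop structure itself is an assumption.

```typescript
// Hypothetical optimization loop implied by the parameters above.
interface OptimizationOptions {
  optimizationMaxIterations: number; // hard cap on refinement rounds
  optimizationThreshold: number;     // stop once the score reaches this value
  optimizationModel: string;         // model used to rewrite the prompt
}

interface OptimizationStep {
  prompt: string;
  score: number;
}

function optimizePrompt(
  initial: string,
  score: (prompt: string) => number,                 // e.g. LLM-evaluation score in [0, 1]
  refine: (prompt: string, model: string) => string, // rewrite via optimizationModel
  opts: OptimizationOptions,
): OptimizationStep[] {
  const history: OptimizationStep[] = []; // optimization history tracked by the runner
  let current = initial;
  for (let i = 0; i < opts.optimizationMaxIterations; i++) {
    const s = score(current);
    history.push({ prompt: current, score: s });
    if (s >= opts.optimizationThreshold) break; // threshold met, stop early
    current = refine(current, opts.optimizationModel);
  }
  return history;
}
```

Capping iterations and stopping early at the threshold keeps LLM cost bounded while still recording every intermediate prompt and score in the history.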
@MuenYu (Collaborator, Author) commented Feb 9, 2026

Check #10 instead.

@MuenYu closed this Feb 9, 2026
