GitHub Actions that use the Circuit Breaker Labs API through its Python client.
Each action in this repository maps directly to one of the four primary evaluation endpoints exposed by the API:
- singleturn-evaluate-system-prompt → /singleturn_evaluate_system_prompt
- singleturn-evaluate-openai-finetune → /singleturn_evaluate_openai_finetune
- multiturn-evaluate-system-prompt → /multiturn_evaluate_system_prompt
- multiturn-evaluate-openai-finetune → /multiturn_evaluate_openai_finetune
Each action's inputs are passed as arguments to the corresponding API request.
The inputs and their descriptions are listed in each action's action.yml file.
Here, we're using the singleturn-evaluate-system-prompt action, which calls the /singleturn_evaluate_system_prompt endpoint.

    name: Evaluate System Prompt
    on:
      workflow_dispatch:
    jobs:
      evaluate:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout repository
            uses: actions/checkout@v6
          - name: Run system prompt evaluation
            uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@v1
            with:
              fail-action-threshold: "0.80"
              fail-case-threshold: "0.5"
              variations: "1"
              maximum-iteration-layers: "1"
              system-prompt: "You are a helpful assistant"
              openrouter-model-name: "anthropic/claude-3.7-sonnet"
              circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}

This action effectively translates to the following POST request:

    curl -X 'POST' \
      'https://api.circuitbreakerlabs.ai/v1/singleturn_evaluate_system_prompt' \
      -H 'accept: application/json' \
      -H "cbl-api-key: $CBL_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{
        "threshold": 0.5,
        "variations": 1,
        "maximum_iteration_layers": 1,
        "openrouter_model_name": "anthropic/claude-3.7-sonnet",
        "system_prompt": "You are a helpful assistant"
      }'

More Thorough Example
Here, we read the system prompt and model name from a model_config.json file,
and run the action only when this file changes or when the workflow is triggered manually.

    name: Evaluate System Prompt
    on:
      push:
        paths:
          - "model_config.json"
      workflow_dispatch:
    jobs:
      evaluate:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout repository
            uses: actions/checkout@v6
          - name: Read configuration from JSON
            id: read-config
            run: |
              PROMPT=$(jq -r '.system_prompt' model_config.json)
              MODEL=$(jq -r '.model' model_config.json)
              # The prompt may span several lines, so write it with the
              # delimiter syntax for multiline step outputs.
              echo "prompt<<EOF" >> "$GITHUB_OUTPUT"
              echo "$PROMPT" >> "$GITHUB_OUTPUT"
              echo "EOF" >> "$GITHUB_OUTPUT"
              echo "model=$MODEL" >> "$GITHUB_OUTPUT"
          - name: Run system prompt evaluation
            uses: circuitbreakerlabs/actions/singleturn-evaluate-system-prompt@main
            with:
              fail-action-threshold: "0.80"
              fail-case-threshold: "0.5"
              variations: "1"
              maximum-iteration-layers: "1"
              # Both outputs come from the step with id "read-config".
              system-prompt: ${{ steps.read-config.outputs.prompt }}
              openrouter-model-name: ${{ steps.read-config.outputs.model }}
              circuit-breaker-labs-api-key: ${{ secrets.CBL_API_KEY }}
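
To check a model_config.json file locally before pushing, the request this workflow leads to can be approximated in Python. The following is a minimal sketch using only the standard library, not the official Python client. It assumes the request body and headers match the curl example shown earlier and that the JSON file uses the system_prompt and model keys read by the jq commands. The build_request helper and its default values are hypothetical names chosen for illustration.

```python
import json
import urllib.request
from pathlib import Path

# Endpoint taken from the curl example in this README.
API_URL = "https://api.circuitbreakerlabs.ai/v1/singleturn_evaluate_system_prompt"


def build_request(config_path, api_key, threshold=0.5, variations=1,
                  maximum_iteration_layers=1):
    """Build (but do not send) the POST request for a model_config.json file.

    Hypothetical helper: the field names mirror the curl example, and the
    config keys mirror the jq commands in the workflow.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    body = {
        "threshold": threshold,
        "variations": variations,
        "maximum_iteration_layers": maximum_iteration_layers,
        "openrouter_model_name": config["model"],
        "system_prompt": config["system_prompt"],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "accept": "application/json",
            "cbl-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. The sketch stops short of that because the response format is not described here, and it omits fail-action-threshold because that input does not appear in the curl request body.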