Description
Context
ReplicantX currently supports two scenario “levels” with YAML-driven execution (Level 1 basic, Level 2 agent). The next step is to support end-to-end user workflows that combine:
- Agentic / non-deterministic conversational steps (Replicant behaves like a human user)
- Deterministic REST steps (Replicant mimics a web/mobile app making API calls)
This capability must be generic: it should work for any product that exposes a chat-like endpoint and/or REST APIs, without requiring the product to add new “artifact” fields to responses.
Instead of asserting on assistant prose, Journey tests should validate:
- response structure (preferably via OpenAPI schemas and JSONPath checks), and
- system state via deterministic API calls.
Goals
- Add a new scenario level: `level: journey`
- Allow mixing chat and REST steps in a single `steps:` list.
- Support OpenAPI ingestion to:
  - resolve endpoints by `operationId`
  - validate request bodies/params
  - validate response shapes when schemas exist
- Provide robust assertions based on structured responses and downstream state, not keyword matches.
- Be CI-friendly: deterministic where possible, bounded where not.
Non-goals (initial release)
- Full browser automation (Playwright/Selenium).
- Requiring target services to change response payloads (no “artifacts” requirement).
- Perfect determinism of LLM behaviour.
High-level design
A Journey is a step-by-step orchestration with a shared runtime variable context (vars). Steps may include:
- chat/agentic message sending
- HTTP API calls (OpenAPI operationId or method+path)
- polling (only for eventual consistency / async processes)
- extraction of IDs/values from responses
- assertions on responses and system state
Important: HTTP calls already block until they return a response. A poll step is only required when the system’s state changes asynchronously after a successful response (e.g. background jobs, eventual consistency, approvals).
YAML: Repo-Conventional Shape
Top-level schema (MVP)
    name: "..."
    level: journey
    base_url: "{{ env.BASE_URL }}"  # existing convention (primary host)
    # Optional: allow different base for REST calls; default = base_url host
    http_base_url: "{{ env.HTTP_BASE_URL }}"
    auth:
      provider: noop|jwt|supabase
      # existing auth config
    replicant:
      # reuse existing Replicant config patterns (LLM settings, facts, payload_format, session_mode, etc.)
      # existing documented options should remain valid
    openapi:
      source: "https://.../openapi.json"  # or "file:./openapi.json"
      validate_requests: true
      validate_responses: true
    vars:
      # optional scenario inputs (can also be extracted during run)
    steps:
      - ...
Variable templating
Support {{ env.* }} (already used across ReplicantX) and add {{ vars.* }} for extracted values.
Resolution priority:
1. `env.*`
2. `vars.*`
3. `run.*` (generated values like `run.id`, `run.timestamp`)
4. `steps.<step_id>.extract.*` (optional aliasing to vars)
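As a rough illustration of the templating described above, the following Python sketch substitutes `{{ namespace.key }}` placeholders from a nested context. The `render` function and the shape of `context` are hypothetical, not part of ReplicantX, and the sketch treats each namespace as a separate top-level key rather than modelling the resolution priority.

```python
import re

# Matches "{{ namespace.key.path }}" with optional inner whitespace.
_TEMPLATE = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each {{ a.b.c }} with the value found by walking `context`."""

    def lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                # Fail loudly so a typo in a template is caught before any request is sent.
                raise KeyError(f"unresolved template variable: {match.group(1)}")
            value = value[part]
        return str(value)

    return _TEMPLATE.sub(lookup, template)


# Hypothetical context; real values would come from the environment and earlier steps.
context = {
    "env": {"BASE_URL": "https://api.example.test"},
    "vars": {"trip_id": "t-123"},
    "run": {"id": "r-1"},
}
print(render("{{ env.BASE_URL }}/trips/{{ vars.trip_id }}/submit", context))
```

Raising on an unresolved variable, rather than leaving the placeholder in place, is one possible design choice; it would also suit the `replicantx validate` template sanity check.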
Step Types (MVP)
1) Chat step (chat)
Two modes:
A. Fixed user message
    - id: chat_trip_request
      chat:
        user: "Trip to Madrid on Friday returning Monday."
      expect:
        status: 200
        # If chat endpoint is in OpenAPI:
        schema: "openapi:ask:200"
        jsonpath_exists:
          - "$.message"
B. Agentic user message (Replicant generates)
    - id: chat_add_hotel
      chat:
        agentic: true
        instruction: "Ask to add a hotel in Madrid near the centre for the same dates."
      expect:
        status: 200
        schema: "openapi:ask:200"
        jsonpath_exists:
          - "$.message"
Chat assertions (preferred)
Replace simplistic expect_contains with structured assertions:
- `expect.status` (required)
- `expect.schema` (OpenAPI validation, if available)
- `expect.jsonpath_*` checks (minimal stable checks)
expect_contains and expect_regex may remain as fallback for plain-text APIs, but should not be the recommended default for Journey tests.
Chat extraction
Allow extracting from:
- `response.json` (JSONPath)
- `response.headers`
- `response.text` (regex fallback)
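The three sources listed above could be handled roughly as follows. This is a sketch under stated assumptions: the `response` dictionary shape and the `regex` rule key are hypothetical (the spec names the regex fallback but not its key), and the JSON branch supports only dotted paths such as `$.id` rather than full JSONPath.

```python
import re


def extract_value(rule: dict, response: dict):
    """Pull one value out of a response according to an `extract` rule."""
    source = rule["from"]
    if source == "response.headers":
        # HTTP header names are case-insensitive, so compare in lower case.
        headers = {name.lower(): value for name, value in response["headers"].items()}
        return headers.get(rule["key"].lower())
    if source == "response.json":
        current = response["json"]
        for key in filter(None, rule["jsonpath"].lstrip("$").split(".")):
            current = current[key]
        return current
    if source == "response.text":
        # Assumed rule key `regex`; the first capture group is the extracted value.
        match = re.search(rule["regex"], response["text"])
        return match.group(1) if match else None
    raise ValueError(f"unknown extract source: {source}")


# Hypothetical response captured from an earlier step.
response = {
    "headers": {"X-Conversation-Id": "c-42"},
    "json": {"id": "t-123"},
    "text": "Created trip t-123",
}
print(extract_value({"from": "response.headers", "key": "x-conversation-id"}, response))
print(extract_value({"from": "response.json", "jsonpath": "$.id"}, response))
```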
    extract:
      conversation_id:
        from: response.headers
        key: "x-conversation-id"
2) HTTP step (http)
Preferred: OpenAPI operationId. Fallback: method + path.
    - id: create_trip
      http:
        operationId: "createTrip"
        json:
          title: "Madrid Trip {{ run.id }}"
      expect:
        status: 201
        schema: "openapi:createTrip:201"
        jsonpath_exists:
          - "$.id"
      extract:
        trip_id:
          from: response.json
          jsonpath: "$.id"
Fallback example:
    - id: submit_trip
      http:
        method: POST
        path: "/trips/{{ vars.trip_id }}/submit"
        json: {}
      expect:
        status: 200
HTTP assertions
- `status`
- `schema` (OpenAPI response schema, if defined)
- JSONPath checks to assert critical fields / invariants
3) Poll step (poll)
Use only when async / eventual consistency exists.
    - id: poll_trip_status
      poll:
        every_seconds: 2
        max_attempts: 20
        http:
          operationId: "getTrip"
          path_params:
            trip_id: "{{ vars.trip_id }}"
        until:
          jsonpath_in:
            "$.status": ["APPROVED", "BOOKED"]
OpenAPI Integration
Inputs
openapi.source supports:
- URL: `https://.../openapi.json`
- file: `file:./openapi.json`
Behaviours
- Parse OpenAPI and index:
  - `operationId → {method, path, request schema, response schemas}`
- If `http.operationId` is used:
  - resolve method/path automatically
- Validate requests (optional but recommended):
  - `path_params`, `query`, and `json` against request schema
- Validate responses (optional but recommended):
  - validate body against the response schema for the returned status code
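The indexing behaviour above might look something like the following sketch, which walks an already-parsed OpenAPI 3 document. The `index_operations` name is hypothetical, only `application/json` bodies are considered, and `$ref` pointers are returned as-is rather than resolved.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def json_schema_of(container: dict):
    """Return the application/json schema of a requestBody or response, if any."""
    return container.get("content", {}).get("application/json", {}).get("schema")


def index_operations(spec: dict) -> dict:
    """Map each operationId to its method, path template and schemas."""
    index = {}
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items also hold non-operation keys such as 'parameters'.
            if method not in HTTP_METHODS or "operationId" not in operation:
                continue
            index[operation["operationId"]] = {
                "method": method.upper(),
                "path": path,
                "request": json_schema_of(operation.get("requestBody", {})),
                "responses": {
                    status: json_schema_of(response)
                    for status, response in operation.get("responses", {}).items()
                },
            }
    return index


# Hypothetical minimal spec with a single operation.
spec = {
    "paths": {
        "/trips": {
            "post": {
                "operationId": "createTrip",
                "responses": {
                    "201": {"content": {"application/json": {"schema": {"type": "object"}}}}
                },
            }
        }
    }
}
index = index_operations(spec)
print(index["createTrip"]["method"], index["createTrip"]["path"])
```

A response with no declared schema ends up as `None` in the index, which is the case the schema capture mode is meant to cover.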
Schema references in YAML
Use a stable reference string:
openapi:<operationId>:<status>
Example: openapi:createTrip:201
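Parsing such a reference string could be as simple as the sketch below. The `parse_schema_ref` helper is hypothetical, and it assumes a numeric status; whether keys such as `default` should also be accepted is left open.

```python
def parse_schema_ref(ref: str) -> tuple[str, str]:
    """Split 'openapi:<operationId>:<status>' into (operationId, status)."""
    scheme, _, rest = ref.partition(":")
    # Split from the right so the status is always the last segment.
    operation_id, _, status = rest.rpartition(":")
    if scheme != "openapi" or not operation_id or not status.isdigit():
        raise ValueError(f"invalid schema reference: {ref!r}")
    return operation_id, status


print(parse_schema_ref("openapi:createTrip:201"))
```

The status is kept as a string because OpenAPI documents key their responses by string.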
When OpenAPI response schemas are incomplete
Some APIs do not declare response models fully; OpenAPI responses can be {} or missing.
Recommendation (ReplicantX optional feature)
Add an opt-in schema capture mode:
- CLI flag: `--capture-schemas`
- If OpenAPI response schema is missing/empty:
  - generate a lightweight JSON Schema from sample response
  - cache in `.replicantx/schema-cache/<operationId>/<status>.schema.json`
- On subsequent runs:
  - optionally validate responses loosely against the cached shape
  - use cached shape to power better JSONPath error messages and IDE hints
This remains generic and requires no changes to the target API.
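One way to generate a lightweight schema from a sample response is sketched below. Both `infer_schema` and `cache_path` are hypothetical names; the inference records types only (no `required` lists, so later validation stays loose) and uses the first element of an array as representative of all items.

```python
from pathlib import Path


def infer_schema(sample):
    """Derive a minimal JSON Schema describing the shape of `sample`."""
    if isinstance(sample, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(sample, int):
        return {"type": "integer"}
    if isinstance(sample, float):
        return {"type": "number"}
    if isinstance(sample, str):
        return {"type": "string"}
    if sample is None:
        return {"type": "null"}
    if isinstance(sample, list):
        schema = {"type": "array"}
        if sample:
            schema["items"] = infer_schema(sample[0])
        return schema
    if isinstance(sample, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(value) for key, value in sample.items()},
        }
    raise TypeError(f"not a JSON value: {type(sample).__name__}")


def cache_path(operation_id: str, status: int) -> Path:
    """Location of the cached schema, following the layout described above."""
    return Path(".replicantx") / "schema-cache" / operation_id / f"{status}.schema.json"


print(infer_schema({"id": "t-1", "legs": [{"seq": 1}], "paid": False}))
print(cache_path("createTrip", 201).as_posix())
```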
Determinism strategy for mixed-mode journeys
Agentic steps
- Use bounded behaviour:
  - max turns
  - timeouts
  - optional retries (small)
- Assert primarily on shape, not exact text
- Validate outcomes via deterministic REST steps
Deterministic steps
- Prefer OpenAPI operationIds
- Assert on IDs and state transitions via JSONPath
- Use `poll` only for async completion
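The bounded polling used for async completion could be sketched as below, with the `every_seconds` and `max_attempts` settings from the poll step. The `poll` function, its `fetch` and `until` callables, and the injectable `sleep` are all assumptions made for illustration.

```python
import time
from typing import Any, Callable


def poll(
    fetch: Callable[[], Any],
    until: Callable[[Any], bool],
    every_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `fetch` until `until(result)` holds, at most `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if until(result):
            return result
        if attempt < max_attempts:  # no point sleeping after the final attempt
            sleep(every_seconds)
    raise TimeoutError(f"condition not met after {max_attempts} attempts")


# Simulated API that reaches a terminal state on the third call.
statuses = iter(["PENDING", "PENDING", "APPROVED"])
result = poll(
    fetch=lambda: {"status": next(statuses)},
    until=lambda body: body["status"] in ("APPROVED", "BOOKED"),
    every_seconds=2,
    max_attempts=20,
    sleep=lambda _seconds: None,  # skip real waiting in this demonstration
)
print(result)
```

Because the attempt count is fixed, the worst-case duration of the step is known in advance, which keeps the journey bounded in CI.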
Reporting & Debugging (Must-have)
Per step, capture:
- step id, type
- request summary (URL/method, redacted headers, payload preview)
- response status, latency, preview
- OpenAPI validation results
- extracted variables (keys by default; values under `--debug`)
Redaction rules:
- redact headers: Authorization, Cookie, anything matching `/token|secret|key/i`
- redact JSON fields matching `/password|token|secret/i`
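The redaction rules above translate fairly directly into code. The sketch below is one possible reading; the function names and the `***` mask are hypothetical choices.

```python
import re

HEADER_PATTERN = re.compile(r"token|secret|key", re.IGNORECASE)
FIELD_PATTERN = re.compile(r"password|token|secret", re.IGNORECASE)
ALWAYS_REDACTED_HEADERS = {"authorization", "cookie"}
MASK = "***"  # assumed placeholder; the spec does not name one


def redact_headers(headers: dict) -> dict:
    """Mask Authorization, Cookie, and any header whose name matches the pattern."""
    return {
        name: MASK
        if name.lower() in ALWAYS_REDACTED_HEADERS or HEADER_PATTERN.search(name)
        else value
        for name, value in headers.items()
    }


def redact_json(value):
    """Recursively mask JSON fields whose key matches the pattern."""
    if isinstance(value, dict):
        return {
            key: MASK if FIELD_PATTERN.search(key) else redact_json(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_json(item) for item in value]
    return value


print(redact_headers({"Authorization": "Bearer abc", "X-Api-Key": "k1", "Accept": "application/json"}))
print(redact_json({"user": {"name": "Test User", "password": "hunter2"}, "refresh_token": "r1"}))
```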
CLI additions (minimal)
- `replicantx run <journey.yml>` (same entrypoint as other levels)
- `replicantx validate <journey.yml>` (YAML + template sanity)
- optional: `replicantx openapi pull --url ... --out ...`
- optional: `--capture-schemas` flag
Example: Generic “Madrid Trip” Journey YAML (Mixed Chat + REST)
NOTE: operationIds are placeholders; replace with your API’s operationIds.
    name: "Journey - Madrid trip (chat + REST)"
    level: journey
    base_url: "https://{{ env.REPLICANTX_TARGET }}"
    http_base_url: "https://{{ env.REPLICANTX_TARGET }}"
    auth:
      provider: jwt
      token: "{{ env.JWT_TOKEN }}"
    openapi:
      source: "https://{{ env.REPLICANTX_TARGET }}/openapi.json"
      validate_requests: true
      validate_responses: true
    replicant:
      goal: "Arrange a trip to Madrid and submit it for approval"
      facts:
        traveler_name: "Test User"
      payload_format: openai
      session_mode: auto
    vars:
      depart_date: "2026-01-30"
      return_date: "2026-02-02"
    steps:
      # 1) Agentic chat: request trip
      - id: chat_trip_request
        chat:
          agentic: true
          instruction: "Ask to travel to Madrid on Friday and return Monday."
        expect:
          status: 200
          schema: "openapi:ask:200"
          jsonpath_exists:
            - "$.message"
        extract:
          conversation_id:
            from: response.headers
            key: "x-conversation-id"
      # 2) REST: create a draft booking (deterministic)
      - id: create_draft_booking
        http:
          operationId: "createDraftBooking"
          json:
            destination: "Madrid"
            depart_date: "{{ vars.depart_date }}"
            return_date: "{{ vars.return_date }}"
            conversation_id: "{{ vars.conversation_id }}"
        expect:
          status: 201
          schema: "openapi:createDraftBooking:201"
          jsonpath_exists:
            - "$.id"
        extract:
          booking_id:
            from: response.json
            jsonpath: "$.id"
      # 3) REST: create trip
      - id: create_trip
        http:
          operationId: "createTrip"
          json:
            title: "Madrid Trip {{ run.id }}"
            start_date: "{{ vars.depart_date }}"
            end_date: "{{ vars.return_date }}"
        expect:
          status: 201
          schema: "openapi:createTrip:201"
          jsonpath_exists:
            - "$.id"
        extract:
          trip_id:
            from: response.json
            jsonpath: "$.id"
      # 4) REST: add booking to trip
      - id: add_booking_to_trip
        http:
          operationId: "addBookingToTrip"
          json:
            trip_id: "{{ vars.trip_id }}"
            booking_id: "{{ vars.booking_id }}"
        expect:
          status: 200
          schema: "openapi:addBookingToTrip:200"
      # 5) Agentic chat: add hotel
      - id: chat_add_hotel
        chat:
          agentic: true
          instruction: "Ask to add a hotel in Madrid near the centre for the same dates."
        expect:
          status: 200
          schema: "openapi:ask:200"
          jsonpath_exists:
            - "$.message"
      # 6) REST: submit trip
      - id: submit_trip
        http:
          operationId: "submitTrip"
          path_params:
            trip_id: "{{ vars.trip_id }}"
        expect:
          status: 200
          schema: "openapi:submitTrip:200"
      # 7) Poll only if approval/booking completion is async
      - id: poll_trip_status
        poll:
          every_seconds: 2
          max_attempts: 20
          http:
            operationId: "getTrip"
            path_params:
              trip_id: "{{ vars.trip_id }}"
          until:
            jsonpath_in:
              "$.status": ["APPROVED", "BOOKED"]
Acceptance Criteria
- `level: journey` scenarios can be executed via the existing CLI (`replicantx run ...`) alongside `basic` and `agent`.
- A journey can alternate `chat` and `http` steps in one scenario.
- `http.operationId` resolves via OpenAPI and validates request/response schemas (when present).
- Structured `expect` block supported for chat and http:
  - status
  - schema (OpenAPI)
  - jsonpath assertions
- `extract` populates `vars.*` and templating works across steps.
- `poll` exists for async state transitions; no “wait” step is required for normal sync HTTP calls.
- Step-level reporting includes OpenAPI validation results, redacted logs, and clear failure localisation.