📝 Walkthrough

Adds a Claude-based scoring pipeline: DB schema and models extended for rich scoring metadata, a scoring service and prompt builder, ARQ worker tasks to enqueue/run scoring, API and frontend surfaces to expose verdicts, and extensive tests and fixtures to validate scoring and flows.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Frontend
    participant API
    participant Queue as ARQ_Queue
    participant Worker
    participant Claude
    participant DB as Database
    Frontend->>API: End interview session (request)
    API->>DB: Persist session end, mark for scoring
    API->>Queue: enqueue_scoring_job(session_id, defer=5s)
    API-->>Frontend: Acknowledge
    Queue->>Worker: Dequeue score_session_task(session_id)
    Worker->>DB: Load session, heart_config, transcript, turns
    Worker->>Worker: build_scoring_prompt(...)
    Worker->>Claude: send prompt (Anthropic/Claude)
    Claude-->>Worker: return JSON/text response
    Worker->>Worker: parse JSON, clamp scores, apply emotion_modifier, compute final_score
    Worker->>DB: create Score record, update session.has_verdict/verdict_status
    Frontend->>API: Poll status / fetch verdict
    API->>DB: read has_verdict / score
    API-->>Frontend: SessionVerdictResponse (scores, feedback, per_question_scores)
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@backend/agent/interview_agent.py`:
- Around line 62-66: The current logic returns "All questions answered..." even
when the session already has an end_reason, which can be misleading; in the
method containing this snippet (look for the block using self.session_mgr.end
and the remaining variable in interview_agent.py), change the branch so that
when remaining == 0 you only call self.session_mgr.end("all_questions_complete")
if end_reason is None, and then return a message that reflects the actual
end_reason when it is already set (e.g., include or reference
self.session_mgr.end_reason in the returned string or return a neutral message
like "Session already ended: {self.session_mgr.end_reason}"). Ensure you do not
overwrite an existing end_reason.
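The branch described in the prompt above can be sketched as follows. This is a hypothetical reduction: the real `SessionManager` interface in `interview_agent.py` may differ, and `handle_all_answered`, `remaining`, and the message strings are illustrative.

```python
# Hypothetical sketch of the suggested fix; SessionManager's real interface
# may differ -- end_reason and end() are assumed from the review prompt.
class SessionManager:
    def __init__(self):
        self.end_reason = None

    def end(self, reason):
        # Never overwrite an existing end_reason.
        if self.end_reason is None:
            self.end_reason = reason


def handle_all_answered(session_mgr, remaining):
    if remaining == 0:
        if session_mgr.end_reason is None:
            session_mgr.end("all_questions_complete")
            return "All questions answered; ending session."
        # Reflect the actual end_reason instead of a misleading message.
        return f"Session already ended: {session_mgr.end_reason}"
    return f"{remaining} questions remaining."
```

The key property is that an existing `end_reason` (e.g. a timeout) is never clobbered by the all-questions-complete path.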
In `@backend/agent/main.py`:
- Around line 304-318: The cleanup block that catches Exception around calling
on_close and closing the session (references: session_mgr.end_reason, on_close,
session.aclose/close, session_id, logger.warning) can swallow
asyncio.CancelledError; change the except to explicitly re-raise cancellations
by catching asyncio.CancelledError (or BaseException subclass) and re-raising
it, while catching other exceptions to log the failure with traceback via
logger.exception or logger.warning including exc info; ensure any coroutine
result from session.close() is awaited as before and that cancellations
propagate instead of being suppressed.
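The cancellation-safe cleanup pattern the prompt asks for looks roughly like this; `close_session_safely` is a hypothetical standalone helper, not the actual code in `main.py`:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def close_session_safely(session, session_id):
    """Sketch of the cleanup pattern: re-raise cancellation, log other errors."""
    try:
        result = session.close()
        # Await a coroutine result if close() is async, as the prompt requires.
        if asyncio.iscoroutine(result):
            await result
    except asyncio.CancelledError:
        # Let cancellation propagate so cooperative shutdown still works.
        raise
    except Exception:
        # logger.exception records the traceback automatically.
        logger.exception("Failed to close agent session %s cleanly", session_id)
```

Ordering matters: the `CancelledError` clause must come before the broad `except Exception`, since `CancelledError` is a `BaseException` subclass in Python 3.8+ and would otherwise be silently suppressed only on older interpreters.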
In `@backend/src/api/v1/endpoints/sessions.py`:
- Around line 203-206: The fallback for verdict_status uses
session.verdict_status and session.has_verdict but doesn't account for legacy
rows that have a score but has_verdict=False, causing endpoints to return
"pending" (202); update the verdict_status computation to check for a score
existence before defaulting (e.g., set verdict_status = session.verdict_status
or ("ready" if a score exists on session else "pending")), using the existing
session object and keeping session.has_verdict and session.verdict_status logic
intact; apply the same fix to the other verdict-status computation block that
mirrors this logic.
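As a sketch, the fallback computation could be factored like this (field names follow the prompt; the helper itself is illustrative):

```python
# Illustrative helper; field names (verdict_status, has_verdict) follow the
# review prompt, not a confirmed signature in sessions.py.
def resolve_verdict_status(verdict_status, has_verdict, score):
    """Prefer the stored status; legacy rows with a score count as ready."""
    if verdict_status:
        return verdict_status
    if has_verdict or score is not None:
        return "ready"
    return "pending"
```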
In `@backend/src/services/scoring/prompt_builder.py`:
- Around line 49-53: The direct int() cast on turn.get("question_index", 0) can
raise ValueError/TypeError for corrupted JSONB (None or non-numeric string);
change the logic in the loop that produces q_idx (the code iterating over
turn_summaries and reading "question_index") to defensively coerce to string or
catch conversion errors—e.g., obtain the raw value from
turn.get("question_index", 0), attempt to convert to int inside a try/except
handling ValueError/TypeError and fall back to 0 (so q_idx = safe_int + 1); keep
the rest of the field parsing (q_text, quality, summary) unchanged to match the
existing defensive str() pattern.
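The defensive coercion can be isolated in a small helper, sketched here (the helper name is illustrative):

```python
def safe_question_index(turn):
    """Coerce a possibly-corrupted JSONB question_index to int, defaulting to 0."""
    raw = turn.get("question_index", 0)
    try:
        return int(raw)
    except (ValueError, TypeError):
        # None or non-numeric strings from corrupted JSONB fall back to 0.
        return 0
```

In the loop, `q_idx` then becomes `safe_question_index(turn) + 1`, mirroring the existing defensive `str()` pattern for the other fields.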
In `@backend/src/services/scoring/scoring_service.py`:
- Around line 145-149: The JSON parsing except block around json.loads(text)
should stop logging the raw Claude response (text) to avoid PII leakage;
instead, in the except json.JSONDecodeError as exc handler (the block that
currently calls logger.error("Claude returned invalid JSON: %s", text[:600])),
replace that with logger.exception(...) to capture the stack trace and log only
metadata about the response (e.g., length = len(text) and a short redacted
preview if desired), then re-raise the ValueError from exc as before; update the
handler that surrounds json.loads(text) to use logger.exception and metadata
rather than printing the full text.
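A minimal sketch of the metadata-only error handler (the function name and the `ValueError` message are illustrative):

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_claude_json(text):
    """Parse the model response, logging only metadata on failure (no raw text)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Log length only -- the raw response may contain interview PII.
        logger.exception("Claude returned invalid JSON (length=%d)", len(text))
        raise ValueError("scoring response was not valid JSON") from exc
```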
In `@backend/workers/main.py`:
- Around line 61-142: The race causes a duplicate score create to be treated as
a hard failure; modify the scoring flow so that when score_repo.create(...)
fails due to a duplicate/unique constraint, you treat it as idempotent success
instead of falling through to the generic exception handler: specifically, catch
the duplicate/IntegrityError (or create a ScoreExistsError in score_repo) around
score_repo.create, and in that handler re-check
score_repo.exists_for_session(session_uuid) and if true set
session_repo.update_attr(..., "has_verdict", True),
session_repo.update_attr(..., "verdict_status", "ready") and
session_repo.update_status(..., SessionStatus.SCORED) (do not set FAILED); keep
the existing broad except Exception path for other errors which should mark
FAILED.
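The idempotent flow can be sketched as follows. Here `IntegrityError` is a stand-in for the real driver's unique-constraint error, and the repo objects are fakes of the interfaces named in the prompt:

```python
import asyncio


# Stand-in for the database driver's unique-constraint error.
class IntegrityError(Exception):
    pass


async def store_score(score_repo, session_repo, session_uuid, payload):
    """Sketch: treat a duplicate score create as idempotent success."""
    try:
        await score_repo.create(session_uuid, payload)
    except IntegrityError:
        # Another worker won the race; verify before treating as success.
        if not await score_repo.exists_for_session(session_uuid):
            raise
    # Reached on fresh create OR confirmed duplicate -- never mark FAILED here.
    await session_repo.update_attr(session_uuid, "has_verdict", True)
    await session_repo.update_attr(session_uuid, "verdict_status", "ready")
    await session_repo.update_status(session_uuid, "scored")
```

Other exceptions still propagate to the surrounding broad `except Exception` path, which keeps marking the session FAILED.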
In `@frontend/src/pages/InterviewCompleteScreen.tsx`:
- Around line 79-151: The loading gate should include verdictQuery.isLoading to
avoid flicker when verdict fetching starts; update the initial conditional that
currently checks statusQuery.isLoading || waiting to also check
verdictQuery.isLoading (i.e., change the guard in the early return that renders
"Analyzing your interview..." to statusQuery.isLoading || waiting ||
verdictQuery.isLoading) so the loading screen remains until verdictQuery
finishes.
🧹 Nitpick comments (1)

backend/src/api/v1/endpoints/suitors.py (1)

74-95: Consider batch-fetching scores to avoid N+1 queries.

The current implementation fetches each score individually within the loop (line 75), resulting in up to 20 additional database queries. While acceptable for current scale, consider batch-fetching scores for all session IDs in a single query if this endpoint becomes a performance bottleneck.

```python
# Example optimization approach:
session_ids = [s.id for s in sessions]
scores_by_session = await score_repo.find_by_session_ids(session_ids)  # batch fetch
```

The `final_score` field population on line 93 follows the established pattern correctly.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@backend/src/services/scoring/prompt_builder.py`:
- Around line 14-30: The function format_expectations assumes expectations is a
dict and calls .get, which will raise for list/str legacy inputs; add a type
guard at the top of format_expectations to verify isinstance(expectations, dict)
(or Mapping) and if not, return "No specific expectations defined." (or coerce
safe default) so the subsequent accesses to expectations.get and indexing are
safe; update references to expectations, looking_for, values, must_haves, and
communication_preferences inside format_expectations accordingly.
In `@backend/src/tests/fixtures/heart_config_missing_name.yaml`:
- Around line 1-14: The test fixture has an unrelated validation failure because
calcom_event_type_id is a string; update the fixture so all non-target fields
are valid while keeping the missing-name condition under test—specifically
change calcom_event_type_id to a valid numeric ID (or appropriately-typed value)
and ensure other fields like calcom_api_key, persona, screening_questions, and
shareable_slug remain valid so the test isolates the missing "name" error.
In `@backend/src/tests/test_api_contract.py`:
- Around line 20-24: The test test_api_003_not_found_returns_404 currently
allows a 500 which masks regressions; update the assertion so the request to
"/api/v1/does-not-exist" must return a hard 404 (e.g., assert resp.status_code
== 404) and remove the {404, 500} set; keep the existing client.get call and
test name unchanged so it's clear this is strictly a not-found behavior check.
In `@backend/src/tests/test_e2e_flows.py`:
- Around line 104-129: The test test_e2e_003_livekit_room_creation_fails is
currently catching the broad Exception; change the assertion to expect the
specific exception raised by the mocked livekit (RuntimeError) so the test only
fails for the intended livekit-room-creation failure—replace
pytest.raises(Exception) with pytest.raises(RuntimeError) (referencing the test
function name and the livekit.create_room.side_effect set to
RuntimeError("livekit down") and the call to start_session.__wrapped__).
In `@backend/src/tests/test_m4_suitor_entry.py`:
- Around line 56-68: The test name and assertion are inconsistent with intended
XSS handling: test_m4_005_register_suitor_xss_sanitized expects sanitization but
asserts the raw "<script>" is preserved; complete_suitor_profile (in
backend/src/api/v1/endpoints/suitors.py) only calls .strip() on name. Fix by
either (A) implementing HTML/script escaping inside complete_suitor_profile
before calling repo.update_by_clerk_id (apply a sanitizer/escape to the
payload.name so stored value removes or encodes script tags), or (B) if you
intentionally allow passthrough, rename the test to
test_m4_005_register_suitor_xss_passthrough and update its docstring/comment to
document this known gap; update the test assertion or implementation accordingly
to keep names and behavior consistent (refer to complete_suitor_profile and
test_m4_005_register_suitor_xss_sanitized when making the change).
In `@frontend/src/pages/InterviewCompleteScreen.tsx`:
- Around line 132-150: The current render lumps verdictQuery.isError and
!verdict together; update InterviewCompleteScreen's render logic to branch so
that if verdictQuery.isError you render a distinct error state (inside the
Window titled "Verdict.exe" or an ErrorWindow) that displays
verdictQuery.error.message and offers a retry action (call verdictQuery.refetch
on a "Retry" button) and/or a link back to sessions, while keeping the existing
"Verdict is not available yet." UI only for the case where !verdict &&
!verdictQuery.isError; ensure you reference verdictQuery, verdictQuery.error,
verdictQuery.refetch and the existing Window/"Verdict.exe" rendering when making
the change.
- Around line 38-69: In InterviewCompleteScreen, add an explicit guard for a
missing sessionId after the hooks: if sessionId is falsy, either navigate away
(using navigate) or set a dedicated "missing ID" state and return a
short-circuit UI so the component doesn't rely on statusQuery/verdictQuery and
avoids the perpetual waiting state; reference sessionId, statusQuery,
verdictQuery and waiting when implementing the early return/redirect.
🧹 Nitpick comments (19)
backend/agent/main.py (1)
316-323: CancelledError handling looks good; minor cleanup for logging.

The explicit `asyncio.CancelledError` catch-and-reraise (lines 316-317) properly preserves cooperative shutdown semantics; this addresses the previous review concern. However, the `exc` argument in `logger.exception()` is redundant since this method automatically includes the full exception traceback.

♻️ Proposed fix

```diff
     except asyncio.CancelledError:
         raise
     except Exception as exc:
         logger.exception(
-            "Failed to finalize/close agent session %s cleanly: %s",
-            session_id,
-            exc,
+            "Failed to finalize/close agent session %s cleanly",
+            session_id,
         )
```

backend/src/tests/conftest.py (1)
172-197: Align heart_config expectations shape with production schema.

The scoring prompt expects a dict-like expectations structure; a list here can drift from production and cause type errors if reused.

♻️ Suggested update

```diff
-    "expectations": ["Sense of humor", "Emotional depth", "Creativity"],
+    "expectations": {
+        "values": ["Sense of humor", "Emotional depth", "Creativity"]
+    },
```

backend/src/tests/test_api_contract.py (1)
32-36: Make the SQL injection test hit a real endpoint.

Current assertions are tautological and don't exercise any code path; consider sending the payload to a public API and asserting proper validation/handling.

backend/src/services/scoring/scoring_service.py (1)
29-36: Move the Anthropic model identifier to configuration.

The model identifier `"claude-sonnet-4-20250514"` is valid and supported in anthropic SDK 0.74.1. However, it should be moved to `config.py` following the existing pattern used for other model identifiers (e.g., `DEEPGRAM_TTS_MODEL`, `SMALLEST_LLM_MODEL`). This allows environment-specific model updates without code changes and keeps the hardcoded value in `ScoringService.__init__`, database schema, and migrations in sync.

backend/src/tests/test_m4_suitor_entry.py (2)
34-36: Catch a specific exception type instead of blind `Exception`.

Using `pytest.raises(Exception)` is too broad and may mask unrelated failures. Since `SuitorRegisterRequest` uses Pydantic validation, catch `pydantic.ValidationError` for precise test assertions.

Proposed fix

```diff
+from pydantic import ValidationError
+
 @pytest.mark.asyncio
 async def test_m4_002_register_suitor_name_required():
-    with pytest.raises(Exception):
+    with pytest.raises(ValidationError):
         SuitorRegisterRequest(name="", age=20, gender="x", orientation="y")
```
50-53: Catch a specific exception type instead of blind `Exception`.

Same issue as above: use `ValidationError` for Pydantic schema validation tests.

Proposed fix

```diff
 @pytest.mark.asyncio
 async def test_m4_004_register_suitor_long_name_rejected():
-    with pytest.raises(Exception):
+    with pytest.raises(ValidationError):
         SuitorRegisterRequest(name="a" * 101, age=22, gender="f", orientation="x")
```

backend/src/tests/test_m1_foundation.py (1)
162-168: Remove the unused `monkeypatch` parameter.

The `monkeypatch` fixture is declared but never used in this test.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m1_013_arq_worker_connects_to_redis(monkeypatch):
+async def test_m1_013_arq_worker_connects_to_redis():
     """Worker settings should expose redis DSN."""
     assert WorkerSettings.redis_settings is not None
```

backend/src/tests/test_m5_scoring.py (3)
63-65: Remove the unused `monkeypatch` parameter.

This test only verifies `score_session_task` is callable and doesn't use `monkeypatch`.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m5_002_scoring_worker_picks_up_job(monkeypatch):
+async def test_m5_002_scoring_worker_picks_up_job():
     assert callable(score_session_task)
```
283-304: Catch specific `HTTPException` instead of blind `Exception`.

These tests verify API error responses; catching `HTTPException` with status code assertions provides better test precision.

Proposed fix

```diff
+from fastapi import HTTPException
+
 @pytest.mark.asyncio
 async def test_m5_024_get_verdict_api_not_ready(registered_suitor, completed_session):
     session_repo = AsyncMock()
     session_repo.read_by_id.return_value = completed_session
     score_repo = AsyncMock()
     score_repo.find_by_session_id.return_value = None
     completed_session.verdict_status = "scoring"
-    with pytest.raises(Exception):
+    with pytest.raises(HTTPException) as exc_info:
         await get_session_verdict.__wrapped__(
             completed_session.id, registered_suitor, session_repo, score_repo
         )
+    assert exc_info.value.status_code == 202

 @pytest.mark.asyncio
 async def test_m5_025_get_verdict_api_session_not_found(registered_suitor):
     session_repo = AsyncMock()
     session_repo.read_by_id.return_value = None
     score_repo = AsyncMock()
-    with pytest.raises(Exception):
+    with pytest.raises(HTTPException) as exc_info:
         await get_session_verdict.__wrapped__(
             uuid.uuid4(), registered_suitor, session_repo, score_repo
         )
+    assert exc_info.value.status_code == 404
```
340-348: Remove unused `monkeypatch` and improve retry test coverage.

The `monkeypatch` parameter is unused. Additionally, this test only verifies that the mock raises an error when called directly; it doesn't actually test retry behavior of the `ScoringService`.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m5_027_scoring_retries_on_claude_api_failure(monkeypatch):
+async def test_m5_027_scoring_retries_on_claude_api_failure():
     service = ScoringService.__new__(ScoringService)
     service.model = "claude"
     client = AsyncMock()
     client.messages.create.side_effect = RuntimeError("500")
     service.client = client
     with pytest.raises(RuntimeError):
         await service.client.messages.create()
```

Consider expanding this test to verify actual retry logic if the `ScoringService` implements retries.

backend/src/tests/test_m3_conversation_engine.py (3)
100-109: Remove the unused `sample_emotion_timeline` fixture.

The fixture is declared but never used in this test.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m3_006_agent_initializes_with_heart_persona(sample_emotion_timeline):
+async def test_m3_006_agent_initializes_with_heart_persona():
     mgr = SessionManager("s1", [{"text": "Q1"}])
     tracker = AsyncMock()
```
372-389: Remove the unused `monkeypatch` parameter.

Proposed fix

```diff
 @pytest.mark.asyncio
 async def test_m3_035_get_session_status_api(
-    monkeypatch, registered_suitor, completed_session
+    registered_suitor, completed_session
 ):
```
404-435: Catch specific `HTTPException` instead of blind `Exception`.

These tests verify API error handling; catching `HTTPException` with status code assertions improves test precision.

Proposed fix

```diff
+from fastapi import HTTPException
+
 @pytest.mark.asyncio
 async def test_m3_037_create_session_invalid_heart(registered_suitor, mock_livekit):
     session_repo = AsyncMock()
     session_repo.find_active_by_suitor.return_value = None
     session_repo.count_today_by_suitor.return_value = 0
     heart_repo = AsyncMock()
     heart_repo.find_by_slug.return_value = None
-    with pytest.raises(Exception):
+    with pytest.raises(HTTPException) as exc_info:
         await start_session.__wrapped__(
             SessionStartRequest(heart_slug="missing"),
             registered_suitor,
             heart_repo,
             session_repo,
             mock_livekit,
         )
+    assert exc_info.value.status_code == 404

 @pytest.mark.asyncio
 async def test_m3_038_create_session_invalid_suitor(seeded_heart, mock_livekit):
     heart_repo = AsyncMock()
     heart_repo.find_by_slug.return_value = seeded_heart
     session_repo = AsyncMock()
     session_repo.find_active_by_suitor.return_value = None
     session_repo.count_today_by_suitor.return_value = 0
-    with pytest.raises(Exception):
+    with pytest.raises((HTTPException, AttributeError)):
         await start_session.__wrapped__(
             SessionStartRequest(heart_slug=seeded_heart.shareable_slug),
             None,
             heart_repo,
             session_repo,
             mock_livekit,
         )  # type: ignore[arg-type]
```

backend/src/tests/test_m2_heart_config.py (5)
25-59: Catch specific exception types for validation tests.

Use `pydantic.ValidationError` for config validation failures instead of blind `Exception`.

Proposed fix

```diff
+from pydantic import ValidationError
+
 @pytest.mark.asyncio
 async def test_m2_002_heart_config_validation_fails_on_missing_fields(tmp_path: Path):
     bad = tmp_path / "heart.yaml"
     bad.write_text("shareable_slug: test\n", encoding="utf-8")
     loader = HeartConfigLoader(str(bad))
-    with pytest.raises(Exception):
+    with pytest.raises(ValidationError):
         loader.load()

 @pytest.mark.asyncio
 async def test_m2_003_heart_config_validation_fails_on_invalid_types(tmp_path: Path):
     # ... YAML content ...
     loader = HeartConfigLoader(str(bad))
-    with pytest.raises(Exception):
+    with pytest.raises(ValidationError):
         loader.load()
```
62-67: Remove the unused `monkeypatch` parameter.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m2_004_heart_config_seeds_db_on_startup(monkeypatch):
+async def test_m2_004_heart_config_seeds_db_on_startup():
     loader = HeartConfigLoader("config/heart_config.yaml")
```
173-187: Remove unused `monkeypatch` and catch specific `HTTPException`.

Proposed fix

```diff
+from fastapi import HTTPException
+
 @pytest.mark.asyncio
-async def test_m2_011_public_heart_profile_404_wrong_slug(monkeypatch):
+async def test_m2_011_public_heart_profile_404_wrong_slug():
     heart_repo = AsyncMock()
     heart_repo.find_by_slug.return_value = None
     question_repo = AsyncMock()

     class Req:
         class app:
             class state:
                 heart_config = None

-    with pytest.raises(Exception):
+    with pytest.raises(HTTPException) as exc_info:
         await public_endpoints.get_public_profile.__wrapped__(
             "missing", Req(), heart_repo, question_repo
         )
+    assert exc_info.value.status_code == 404
```
154-170: Remove the unused `monkeypatch` parameter.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m2_010_public_heart_profile_returns_200(monkeypatch, seeded_heart):
+async def test_m2_010_public_heart_profile_returns_200(seeded_heart):
```
216-234: Remove the unused `monkeypatch` parameter.

Proposed fix

```diff
 @pytest.mark.asyncio
-async def test_m2_014_public_profile_no_sensitive_data(monkeypatch, seeded_heart):
+async def test_m2_014_public_profile_no_sensitive_data(seeded_heart):
```

backend/src/api/v1/endpoints/sessions.py (1)
330-341: Duplicated verdict_status fallback logic.

The same fallback pattern appears at line 207 and line 331. Consider extracting it to a helper for consistency.

Proposed refactor

```python
def _resolve_verdict_status(session, score) -> str:
    """Resolve verdict status, handling legacy sessions without explicit status."""
    if session.verdict_status:
        return session.verdict_status
    return "ready" if score else "pending"
```
```python
def format_expectations(expectations: dict[str, Any] | None) -> str:
    if not expectations:
        return "No specific expectations defined."

    parts: list[str] = []
    if expectations.get("looking_for"):
        parts.append(f"Looking for: {expectations['looking_for']}")
    if expectations.get("values") and isinstance(expectations["values"], list):
        parts.append(f"Values: {', '.join(expectations['values'])}")
    if expectations.get("must_haves") and isinstance(expectations["must_haves"], list):
        parts.append(f"Must-haves: {', '.join(expectations['must_haves'])}")
    if expectations.get("communication_preferences"):
        parts.append(
            f"Communication preferences: {expectations['communication_preferences']}"
        )

    return "\n".join(parts) if parts else "No specific expectations defined."
```
Guard against non-dict expectations inputs.
If expectations is a list/string (e.g., legacy/test data), .get() will raise; add a type guard to keep prompt building resilient.
🔧 Suggested guard

```diff
 def format_expectations(expectations: dict[str, Any] | None) -> str:
     if not expectations:
         return "No specific expectations defined."
+    if isinstance(expectations, list):
+        return f"Values: {', '.join(str(item) for item in expectations)}"
+    if not isinstance(expectations, dict):
+        return "No specific expectations defined."
```

🤖 Prompt for AI Agents
In `@backend/src/services/scoring/prompt_builder.py` around lines 14 - 30, The
function format_expectations assumes expectations is a dict and calls .get,
which will raise for list/str legacy inputs; add a type guard at the top of
format_expectations to verify isinstance(expectations, dict) (or Mapping) and if
not, return "No specific expectations defined." (or coerce safe default) so the
subsequent accesses to expectations.get and indexing are safe; update references
to expectations, looking_for, values, must_haves, and communication_preferences
inside format_expectations accordingly.
```yaml
persona:
  traits: [warm]
  vibe: calm
  tone: nice
  humor_level: 5
  strictness: 5
expectations: {}
screening_questions:
  - text: Hello?
    required: true
shareable_slug: broken
calendar:
  calcom_api_key: x
  calcom_event_type_id: y
```
Keep non-target fields valid so the test isolates the missing-name error.
calcom_event_type_id being a string can trigger an unrelated validation error.
✅ Suggested fix

```diff
 calendar:
   calcom_api_key: x
-  calcom_event_type_id: y
+  calcom_event_type_id: 123
```

🤖 Prompt for AI Agents
In `@backend/src/tests/fixtures/heart_config_missing_name.yaml` around lines 1 -
14, The test fixture has an unrelated validation failure because
calcom_event_type_id is a string; update the fixture so all non-target fields
are valid while keeping the missing-name condition under test—specifically
change calcom_event_type_id to a valid numeric ID (or appropriately-typed value)
and ensure other fields like calcom_api_key, persona, screening_questions, and
shareable_slug remain valid so the test isolates the missing "name" error.
```python
@pytest.mark.asyncio
async def test_api_003_not_found_returns_404(client):
    resp = await client.get("/api/v1/does-not-exist")
    assert resp.status_code in {404, 500}
```
Don't allow 500 for not-found routes.
Accepting 500s hides regressions; this should be a hard 404.
✅ Suggested fix

```diff
-    assert resp.status_code in {404, 500}
+    assert resp.status_code == 404
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
@pytest.mark.asyncio
async def test_api_003_not_found_returns_404(client):
    resp = await client.get("/api/v1/does-not-exist")
    assert resp.status_code == 404
```
🤖 Prompt for AI Agents
In `@backend/src/tests/test_api_contract.py` around lines 20 - 24, The test
test_api_003_not_found_returns_404 currently allows a 500 which masks
regressions; update the assertion so the request to "/api/v1/does-not-exist"
must return a hard 404 (e.g., assert resp.status_code == 404) and remove the
{404, 500} set; keep the existing client.get call and test name unchanged so
it's clear this is strictly a not-found behavior check.
```python
@pytest.mark.asyncio
async def test_e2e_003_livekit_room_creation_fails(seeded_heart, registered_suitor):
    heart_repo = AsyncMock()
    heart_repo.find_by_slug.return_value = seeded_heart
    session_repo = AsyncMock()
    session_repo.find_active_by_suitor.return_value = None
    session_repo.count_today_by_suitor.return_value = 0
    session_repo.model = SessionDb
    created = SessionDb(
        id=uuid.uuid4(),
        heart_id=seeded_heart.id,
        suitor_id=registered_suitor.id,
        status=SessionStatus.PENDING,
    )
    session_repo.create.return_value = created
    session_repo.update_attr.return_value = created
    livekit = AsyncMock()
    livekit.create_room.side_effect = RuntimeError("livekit down")
    with pytest.raises(Exception):
        await start_session.__wrapped__(
            SessionStartRequest(heart_slug=seeded_heart.shareable_slug),
            registered_suitor,
            heart_repo,
            session_repo,
            livekit,
        )
```
Use a specific exception type in the failure test.
Catching Exception is too broad and can mask unrelated failures.
✅ Suggested fix

```diff
-    with pytest.raises(Exception):
+    with pytest.raises(RuntimeError):
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
@pytest.mark.asyncio
async def test_e2e_003_livekit_room_creation_fails(seeded_heart, registered_suitor):
    heart_repo = AsyncMock()
    heart_repo.find_by_slug.return_value = seeded_heart
    session_repo = AsyncMock()
    session_repo.find_active_by_suitor.return_value = None
    session_repo.count_today_by_suitor.return_value = 0
    session_repo.model = SessionDb
    created = SessionDb(
        id=uuid.uuid4(),
        heart_id=seeded_heart.id,
        suitor_id=registered_suitor.id,
        status=SessionStatus.PENDING,
    )
    session_repo.create.return_value = created
    session_repo.update_attr.return_value = created
    livekit = AsyncMock()
    livekit.create_room.side_effect = RuntimeError("livekit down")
    with pytest.raises(RuntimeError):
        await start_session.__wrapped__(
            SessionStartRequest(heart_slug=seeded_heart.shareable_slug),
            registered_suitor,
            heart_repo,
            session_repo,
            livekit,
        )
```
🧰 Tools
🪛 Ruff (0.15.0)
[warning] 122-122: Do not assert blind exception: Exception
(B017)
🤖 Prompt for AI Agents
In `@backend/src/tests/test_e2e_flows.py` around lines 104 - 129, The test
test_e2e_003_livekit_room_creation_fails is currently catching the broad
Exception; change the assertion to expect the specific exception raised by the
mocked livekit (RuntimeError) so the test only fails for the intended
livekit-room-creation failure—replace pytest.raises(Exception) with
pytest.raises(RuntimeError) (referencing the test function name and the
livekit.create_room.side_effect set to RuntimeError("livekit down") and the call
to start_session.__wrapped__).
```python
@pytest.mark.asyncio
async def test_m4_005_register_suitor_xss_sanitized(registered_suitor):
    repo = AsyncMock()
    repo.update_by_clerk_id.return_value = registered_suitor
    payload = SuitorRegisterRequest(
        name="<script>alert('x')</script>",
        age=24,
        gender="x",
        orientation="y",
    )
    await complete_suitor_profile.__wrapped__(payload, registered_suitor, repo)
    sent_name = repo.update_by_clerk_id.await_args.args[1]["name"]
    assert sent_name.startswith("<script>")
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check for sanitization functions in the backend codebase
rg -n "sanitize|escape|strip_tags|bleach" backend/src --type py -C 2
```

Repository: msmel01/waterloo-ai-agents-hackathon2026
Length of output: 3048
🏁 Script executed:

```shell
#!/bin/bash
# Find the complete_suitor_profile function implementation
rg -n "def complete_suitor_profile" backend/src --type py -A 20
```

Repository: msmel01/waterloo-ai-agents-hackathon2026
Length of output: 1623
Fix misleading test name and add XSS sanitization.
The test `test_m4_005_register_suitor_xss_sanitized` asserts that `<script>` tags pass through unsanitized (`assert sent_name.startswith("<script>")`), yet the name implies sanitization should occur. The implementation in `backend/src/api/v1/endpoints/suitors.py` only calls `.strip()` on the name, with no HTML/script-tag escaping.
Either sanitize user input in `complete_suitor_profile` before storing to the repository, or rename the test to `test_m4_005_register_suitor_xss_passthrough` and add a comment documenting that this is a known gap to address.
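If the sanitize route is taken, a minimal sketch of the escaping step using only the stdlib `html.escape` (the `sanitize_name` helper is hypothetical, not part of the endpoint today; `bleach.clean(..., strip=True)` would be the choice if tags should be removed rather than encoded):

```python
import html


def sanitize_name(raw: str) -> str:
    """Trim whitespace, then escape HTML-sensitive characters.

    html.escape encodes & < > " ' so a stored name renders inertly;
    it neutralizes tags rather than stripping them.
    """
    return html.escape(raw.strip())


sanitized = sanitize_name("<script>alert('x')</script>")
# -> "&lt;script&gt;alert(&#x27;x&#x27;)&lt;/script&gt;"
assert not sanitized.startswith("<script>")
```

With escaping applied at the endpoint, the test's current assertion would flip, matching the name `..._xss_sanitized`.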
🤖 Prompt for AI Agents
In `@backend/src/tests/test_m4_suitor_entry.py` around lines 56 - 68: The test
name and assertion are inconsistent with intended XSS handling:
test_m4_005_register_suitor_xss_sanitized expects sanitization but asserts the
raw "<script>" is preserved; complete_suitor_profile (in
backend/src/api/v1/endpoints/suitors.py) only calls .strip() on name. Fix by
either (A) implementing HTML/script escaping inside complete_suitor_profile
before calling repo.update_by_clerk_id (apply a sanitizer/escape to the
payload.name so stored value removes or encodes script tags), or (B) if you
intentionally allow passthrough, rename the test to
test_m4_005_register_suitor_xss_passthrough and update its docstring/comment to
document this known gap; update the test assertion or implementation accordingly
to keep names and behavior consistent (refer to complete_suitor_profile and
test_m4_005_register_suitor_xss_sanitized when making the change).
```tsx
export function InterviewCompleteScreen() {
  const { sessionId = '' } = useParams<{ sessionId: string }>();
  const navigate = useNavigate();
  const [showQuestions, setShowQuestions] = useState(false);

  const statusQuery = useGetSessionStatusApiV1SessionsIdStatusGet(sessionId, {
    query: {
      enabled: !!sessionId,
      refetchInterval: (query) => {
        const verdictStatus = query.state.data?.verdict_status;
        if (!verdictStatus) {
          return 5000;
        }
        if (verdictStatus === 'ready' || verdictStatus === 'failed') {
          return false;
        }
        return 5000;
      },
    },
  });

  const verdictQuery = useGetSessionVerdictApiV1SessionsIdVerdictGet(sessionId, {
    query: {
      enabled: statusQuery.data?.verdict_status === 'ready',
      retry: false,
    },
  });

  const waiting =
    !statusQuery.data ||
    (!statusQuery.data.has_verdict && statusQuery.data.verdict_status !== 'failed');
```
Guard missing `sessionId` to avoid an infinite loading screen.
If the route is hit without a `sessionId`, the queries are disabled and `waiting` stays true, so users see “Analyzing…” indefinitely. Add a dedicated missing-ID state (or redirect) after the hooks.
Suggested fix

```diff
   const verdictQuery = useGetSessionVerdictApiV1SessionsIdVerdictGet(sessionId, {
     query: {
       enabled: statusQuery.data?.verdict_status === 'ready',
       retry: false,
     },
   });

+  if (!sessionId) {
+    return (
+      <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
+        <AppHeader />
+        <main className="flex-1 max-w-xl mx-auto w-full mt-8">
+          <Window title="Verdict.exe" icon="info">
+            <div className="text-center space-y-4">
+              <p className="text-sm text-gray-700">Session not found.</p>
+              <Link
+                to="/chats"
+                className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
+              >
+                Back to sessions
+              </Link>
+            </div>
+          </Window>
+        </main>
+      </div>
+    );
+  }
```

🤖 Prompt for AI Agents
In `@frontend/src/pages/InterviewCompleteScreen.tsx` around lines 38 - 69: in
InterviewCompleteScreen, add an explicit guard for a missing sessionId after the
hooks: if sessionId is falsy, either navigate away (using navigate) or set a
dedicated "missing ID" state and return a short-circuit UI so the component
doesn't rely on statusQuery/verdictQuery and avoids the perpetual waiting state;
reference sessionId, statusQuery, verdictQuery and waiting when implementing the
early return/redirect.
```tsx
  if (verdictQuery.isError || !verdict) {
    return (
      <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
        <AppHeader />
        <main className="flex-1 max-w-xl mx-auto w-full mt-8">
          <Window title="Verdict.exe" icon="info">
            <div className="text-center space-y-4">
              <p className="text-sm text-gray-700">Verdict is not available yet.</p>
              <Link
                to="/chats"
                className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
              >
                Back to sessions
              </Link>
            </div>
          </Window>
        </main>
      </div>
    );
```
Differentiate verdict fetch errors from “not available yet.”
`verdictQuery.isError` currently renders the same “not available yet” state, which hides real failures and makes recovery unclear. Consider a separate error UI (or a retry action) for actual fetch errors.
Suggested fix

```diff
-  if (verdictQuery.isError || !verdict) {
+  if (verdictQuery.isError) {
+    return (
+      <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
+        <AppHeader />
+        <main className="flex-1 max-w-xl mx-auto w-full mt-8">
+          <Window title="Verdict.exe" icon="info">
+            <div className="text-center space-y-4">
+              <p className="text-sm text-gray-700">We couldn’t load your verdict.</p>
+              <button
+                type="button"
+                onClick={() => verdictQuery.refetch()}
+                className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
+              >
+                Retry
+              </button>
+            </div>
+          </Window>
+        </main>
+      </div>
+    );
+  }
+
+  if (!verdict) {
     return (
       <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
         <AppHeader />
         <main className="flex-1 max-w-xl mx-auto w-full mt-8">
           <Window title="Verdict.exe" icon="info">
             <div className="text-center space-y-4">
               <p className="text-sm text-gray-700">Verdict is not available yet.</p>
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-if (verdictQuery.isError || !verdict) {
-  return (
-    <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
-      <AppHeader />
-      <main className="flex-1 max-w-xl mx-auto w-full mt-8">
-        <Window title="Verdict.exe" icon="info">
-          <div className="text-center space-y-4">
-            <p className="text-sm text-gray-700">Verdict is not available yet.</p>
-            <Link
-              to="/chats"
-              className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
-            >
-              Back to sessions
-            </Link>
-          </div>
-        </Window>
-      </main>
-    </div>
-  );
+if (verdictQuery.isError) {
+  return (
+    <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
+      <AppHeader />
+      <main className="flex-1 max-w-xl mx-auto w-full mt-8">
+        <Window title="Verdict.exe" icon="info">
+          <div className="text-center space-y-4">
+            <p className="text-sm text-gray-700">We couldn't load your verdict.</p>
+            <button
+              type="button"
+              onClick={() => verdictQuery.refetch()}
+              className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
+            >
+              Retry
+            </button>
+          </div>
+        </Window>
+      </main>
+    </div>
+  );
+}
+
+if (!verdict) {
+  return (
+    <div className="min-h-screen bg-win-bg flex flex-col py-6 px-4">
+      <AppHeader />
+      <main className="flex-1 max-w-xl mx-auto w-full mt-8">
+        <Window title="Verdict.exe" icon="info">
+          <div className="text-center space-y-4">
+            <p className="text-sm text-gray-700">Verdict is not available yet.</p>
+            <Link
+              to="/chats"
+              className="inline-block px-4 py-2 bg-win-titlebar text-white text-sm border border-palette-orchid shadow-bevel"
+            >
+              Back to sessions
+            </Link>
+          </div>
+        </Window>
+      </main>
+    </div>
+  );
+}
```
🤖 Prompt for AI Agents
In `@frontend/src/pages/InterviewCompleteScreen.tsx` around lines 132 - 150: The
current render lumps verdictQuery.isError and !verdict together; update
InterviewCompleteScreen's render logic to branch so that if verdictQuery.isError
you render a distinct error state (inside the Window titled "Verdict.exe" or an
ErrorWindow) that displays verdictQuery.error.message and offers a retry action
(call verdictQuery.refetch on a "Retry" button) and/or a link back to sessions,
while keeping the existing "Verdict is not available yet." UI only for the case
where !verdict && !verdictQuery.isError; ensure you reference verdictQuery,
verdictQuery.error, verdictQuery.refetch and the existing Window/"Verdict.exe"
rendering when making the change.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests