…dal detail
Adds visibility into which instructions are loaded during agent completions:
- Backend emits `instructions.context` SSE event after context build and after tools load related instructions
- Chat shows lightweight "N instructions loaded" indicator with link to trace modal
- Trace modal gets Instructions left-pane item with full table (title, category, load mode, reason, type)
- Tool detail in trace modal shows instructions loaded by that specific tool
https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn
…le everywhere
- Persist loaded_instructions in completion JSON field so indicator survives page refresh
- Add loaded_instructions to CompletionV2Schema and hydrate from completion_service
- Remove 'view' link from chat indicator, switch to cube icon
- TraceModal: replace full table with compact collapsible list with always/intelligent counts
- TraceModal: tool-loaded instructions now collapsible with chevron toggle
- DescribeTablesTool: show "N instructions loaded" with expandable list inline
https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn
- Move "N instructions" indicator from top of completion to footer row (after thumbs up/down, before debug button)
- DescribeTablesTool: move instructions inside the table list as last <li>, same expand/collapse pattern as tables, collapsed by default
https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn
- Replace static "N instructions" text with UPopover that shows list on click
- Each instruction row shows title, load mode badge (always/intelligent)
- Clicking an instruction row fetches full instruction and opens InstructionModalComponent
- Reuses existing showTrainingInstructionModal pattern
https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn
SQLAlchemy doesn't detect in-place mutations on JSON columns. Added flag_modified() call so the loaded_instructions data actually gets committed to DB when update_completion_status runs later. https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn
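The change-tracking gap described above can be illustrated with a toy model. This is a sketch in plain Python, not SQLAlchemy itself: `TrackedRecord`, `flag_modified_toy`, and the field name `completion` are hypothetical stand-ins chosen for illustration. It only shows why a tracker that watches attribute assignment misses an in-place change to a nested dict, and how an explicit flag closes the gap.

```python
class TrackedRecord:
    """Toy stand-in for an ORM row: it only notices attribute assignment."""

    def __init__(self, **fields):
        # Bypass our own __setattr__ so initial values do not count as changes.
        object.__setattr__(self, "_dirty", set())
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Assignment is the only event this tracker can observe.
        self._dirty.add(name)
        object.__setattr__(self, name, value)


def flag_modified_toy(record, name):
    """Hypothetical analogue of flag_modified: mark a field dirty by hand."""
    record._dirty.add(name)


def pending_changes(record):
    """Names of the fields a flush would write out."""
    return sorted(record._dirty)


row = TrackedRecord(completion={"content": "hi"})

# In-place mutation: the dict changes, but no attribute assignment happens.
row.completion["loaded_instructions"] = [{"title": "Use UTC"}]
missed = pending_changes(row)

# Explicitly flagging the field makes the change visible to the tracker.
flag_modified_toy(row, "completion")
caught = pending_changes(row)
```

In SQLAlchemy the corresponding call is `flag_modified(instance, "column_name")` from `sqlalchemy.orm.attributes`; which column name the real code passes is not stated in the commit message.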
…e-I2wUE Add instructions loading visibility to trace modal and reports
Replaces the post-loop SuggestInstructions LLM-call with an agentic sub-loop that runs after the main agent loop completes when trigger conditions fire. The harness can search existing instructions, verify facts via inspect_data/describe_tables, and decide whether to create or edit instructions.
- New search_instructions tool (knowledge + training modes) for discovering existing instructions before creating duplicates.
- New "knowledge" mode + _build_knowledge_prompt with reflection framing, trigger reasons injection, and tight iteration budget.
- _run_knowledge_harness sub-loop in agent_v2 (max 5 steps) that spins up a knowledge-mode planner, runs tool calls via the existing ToolRunner, and streams instructions.suggest.partial events.
- Instructions land in a draft AI build that is submitted for review (matches the previous suggestion semantics: admin still approves).
- Trigger conditions are formatted into a <trigger_conditions> block injected into the harness prompt as hints.
- create_instruction / edit_instruction allowed_modes expanded to include "knowledge".
- Removed _stream_suggestions_inline from agent_v2 (replaced).
https://claude.ai/code/session_01KnqwhvsU9FANr3tjMfRi2Q
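The bounded sub-loop described above can be sketched roughly as follows. This is an illustrative outline only: `run_knowledge_harness`, `plan_next`, `run_tool`, and the action dict shape are assumptions of this sketch, not the signatures used in agent_v2. Only the step cap of 5 and the `instructions.suggest.partial` event name come from the commit message.

```python
def run_knowledge_harness(plan_next, run_tool, trigger_reasons, max_steps=5):
    """Bounded reflection loop: plan, run one tool, record the result, repeat."""
    observations = []
    events = []
    for step in range(max_steps):
        # The planner sees the trigger hints plus everything observed so far.
        action = plan_next(trigger_reasons, observations)
        if action is None:
            # Planner decided there is nothing (more) worth doing.
            break
        result = run_tool(action["tool"], action["args"])
        observations.append({"tool": action["tool"], "result": result})
        # One partial event per executed step (event granularity is assumed).
        events.append(("instructions.suggest.partial", step, action["tool"]))
    return events


# Scripted planner: search first, then create, then stop.
script = [
    {"tool": "search_instructions", "args": {"query": "fiscal year"}},
    {"tool": "create_instruction", "args": {"text": "Fiscal year starts in July"}},
]


def scripted_planner(trigger_reasons, observations):
    return script[len(observations)] if len(observations) < len(script) else None


def fake_tool_runner(tool, args):
    return {"ok": True, "tool": tool}


events = run_knowledge_harness(scripted_planner, fake_tool_runner, ["user corrected a metric"])
```

The hard cap matters because the planner is free to keep requesting tools; with `max_steps=5` a planner that never returns `None` still terminates after five tool calls.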
The new RBAC permission registry uses 'manage_llm', not the legacy 'manage_llm_settings' string. The settings layout also gated the LLM tab on a non-existent 'modify_settings' permission, hiding the tab from members entirely.
- Rename manage_llm_settings -> manage_llm in LLMsComponent so write controls (Integrate Models button, Actions column, toggle) are properly hidden from members.
- Allow the LLM settings tab to render for any authenticated user; intra-page controls remain gated by manage_llm.
- Drop the stale page-level permission gate on /settings/models.
Fixes the playwright visibility test 'member can see LLM tab (read-only)'.
This reverts commit 5bf9e1e.
Members lack modify_settings, so the LLM tab is intentionally hidden from the settings layout. Update the visibility spec to reflect that the tab should not be rendered, instead of expecting a read-only view.
…ests-VDtJj feat: rbac
Mirrors the Slack/Teams external-platform pattern: a WhatsAppAdapter subclass of PlatformAdapter, registered in the factory; WhatsAppConfig schema; create_whatsapp_platform service with Meta Graph validation; POST /settings/integrations/whatsapp config route; GET/POST /api/settings/integrations/whatsapp/webhook with X-Hub-Signature-256 verification, phone_number_id-based tenant routing, and id-based dedupe.
Adapter parses Cloud API webhooks, sends text replies with context threading, uploads media in two steps, and maps Slack-style emoji reactions to unicode for the manager's processing indicator.
Frontend adds a WhatsApp card and modal to settings/integrations.
Tests: 21 unit tests (adapter parsing, signature verify, send/reaction/media mocked via httpx MockTransport, and full webhook route behaviour incl. handshake, bad-signature, dedupe, status-only, non-text, dispatch).
Sandbox: backend/scripts/whatsapp_sandbox_debug.py boots the real adapter + route against a mock Meta Graph server and replays a full scenario (handshake, unverified->verified, threaded reply, status-only, dedupe).
https://claude.ai/code/session_01XtJkarHadpmRsGU5quSYrY
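The webhook checks named above (X-Hub-Signature-256 verification and id-based dedupe) can be sketched with the standard library. The header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body keyed with the app secret. The function and class names here are hypothetical, and the in-memory set is an assumption of this sketch; the commit does not say where the real route stores seen ids.

```python
import hashlib
import hmac


def verify_signature(app_secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Hub-Signature-256 header of the form 'sha256=<hex digest>'."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # Constant-time comparison on bytes avoids leaking where the digests differ.
    return hmac.compare_digest(expected.encode(), received.encode())


class SeenMessages:
    """In-memory id-based dedupe (assumed storage; a shared store may be needed)."""

    def __init__(self):
        self._seen = set()

    def first_time(self, message_id: str) -> bool:
        """Return True only the first time a given message id is offered."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


# Example: sign a body the way the sender would, then verify it.
secret = "example-app-secret"
body = b'{"entry": []}'
header = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
```

The signature must be computed over the raw bytes as received; re-serialising parsed JSON before hashing can change whitespace or key order and make valid requests fail.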
Adds backend/tests/e2e/rbac/: seven test modules totalling 35 tests covering the registry, every CRUD/visibility path on data sources, instructions, builds, entities, evals, and the resolver paths in permission_resolver._resolve_permissions_inner.
Why
---
PR 182 introduced the new RBAC primitives (roles, role assignments, groups, resource grants, registry, resolver) but the existing e2e coverage only spot-checked individual endpoints. We needed a tight matrix that exercises real users + per-DS grants against the live FastAPI app to lock in behaviour and surface any drift quickly.
Test files
----------
- test_rbac_registry.py: pure static parity. AST-walks routes/*.py for every @requires_permission and check_resource_permissions call and asserts the literal permission strings exist in the registry. Catches the manage_tests / modify_settings class of bug instantly.
- test_rbac_data_sources.py: detail-level access matrix + list/detail-invariant in one shot.
- test_rbac_instructions.py: per-DS create_instructions matrix, owner vs admin vs other edit branches, list visibility filtering.
- test_rbac_builds.py: list endpoint is org-only, publish/submit/etc. enforce per-DS create_instructions on every touched DS.
- test_rbac_entities.py: forward list/detail invariant for entities, the recently fixed bug class.
- test_rbac_evals.py: suite endpoints are strictly org-level, case create endpoint is resource_scoped + per-DS gated.
- test_rbac_role_principals.py: walks every resolver path: user direct role, group → role, user resource grant, group resource grant, full_admin_access bypass, and assignment-removal freshness.
Findings (backend code changes)
-------------------------------
1. routes/test.py: both create_case and update_case called check_resource_permissions(..., 'create_evals'), but 'create_evals' is not in permissions_registry.RESOURCE_PERMISSIONS. The check therefore never matched anything for non-admins, locking per-DS evals authors out of any DS-scoped case creation. Changed both call sites to 'manage_evals' (the canonical resource permission, already in the registry and ORG_PERM_IMPLIES_RESOURCE). The new test_rbac_registry.py::test_check_resource_permissions_uses_known_resource_perms walks check_resource_permissions calls in routes/ and would have failed CI on the original code; the test is now the regression guard for this drift class. The existing test_rbac_complex_roles.py::test_eval_case_mixed_ds_list_denied test still passes: the role under test only holds 'run_evals' so denial still applies, just for the right reason now.
2. tests/e2e/test_rbac.py::test_permissions_registry_endpoint: was asserting "Reports" appears in the categories response, but Reports moved to HIDDEN_PERMISSION_CATEGORIES on purpose so the role-editor UI doesn't render meaningless checkboxes for it. Updated the assertion to match current behaviour. (Pre-existing failure on PR 182, unrelated to our fixture/test additions.)
Known bugs surfaced as xfail
----------------------------
- test_rbac_data_sources.py::test_data_source_grant_appears_in_list
data_source_service.get_data_sources filters the LIST by the legacy DataSourceMembership table only; it ignores ResourceGrant rows. A user with a per-DS RBAC grant but no DataSourceMembership opens the DS in detail (resolver path) but never sees it in the list. Marked strict xfail with full repro context; lift the marker once the service-layer filter is unioned with ResourceGrant.
Out of scope
------------
- Frontend Playwright tests (this branch is backend only)
- Performance/load testing the resolver
- Migration changes
- New RBAC features
https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
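The static parity check described above (walking route sources for permission literals and comparing them with the registry) can be sketched with the standard `ast` module. This is a simplified illustration, not the contents of test_rbac_registry.py: `REGISTRY`, `SAMPLE_ROUTE`, and `permission_literals` are hypothetical, and the sketch treats every string literal argument of a matching call as a permission name, which a real test would likely narrow down.

```python
import ast

# Hypothetical subset of the registry, for illustration only.
REGISTRY = {"manage_evals", "run_evals", "create_instructions"}

# Calls whose string arguments are treated as permission names.
TARGET_CALLS = {"requires_permission", "check_resource_permissions"}


def permission_literals(source):
    """Collect string literals passed to the permission helpers in `source`."""
    found = set()
    # ast.walk also visits decorator expressions, so @requires_permission("x")
    # shows up as an ordinary Call node.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both bare names and attribute access such as module.func(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name not in TARGET_CALLS:
            continue
        for arg in list(node.args) + [kw.value for kw in node.keywords]:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                found.add(arg.value)
    return found


SAMPLE_ROUTE = '''
@requires_permission("manage_evals")
async def list_suites():
    ...

async def create_case(db, user, ds_ids):
    await check_resource_permissions(db, user, ds_ids, "create_evals")
'''

unknown = permission_literals(SAMPLE_ROUTE) - REGISTRY
```

Because the source is only parsed and never imported, a check of this kind can run without a database or app instance, which is what makes it cheap enough to guard against string drift in CI.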
The sandbox feedback loop doc tells contributors to set up a Python 3.12 venv at backend/.venv; add it to .gitignore so it never gets accidentally committed alongside test fixture work. https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
…nstructions, role grants
Reflects the new RBAC semantics introduced in the latest base-branch
``rbac improvements`` commit:
- ``view`` / ``view_schema`` are now implicit on any DS grant; removed
from RESOURCE_PERMISSIONS and from every test fixture's grant payload.
- ``create_instructions`` (resource-level) is renamed to
``manage_instructions``; updated all grant payloads, route assertions,
and route enforcement tests.
- /api/builds GET endpoints are no longer admin-only — they apply a
per-DS filter via ``get_accessible_data_source_ids``. Updated
test_rbac_builds.py to expect 200 + filtered list for non-admins.
- ``data_source_service.get_data_sources`` now unions ResourceGrant with
the legacy DataSourceMembership table, fixing the list/detail
invariant bug captured by the previous xfail. Removed the xfail
marker from ``test_data_source_grant_appears_in_list`` — it now
asserts forward correctness as a regression guard.
Added tests for the new resolver paths:
- ``test_role_as_principal_resource_grant``: a custom role created with
inline ``resource_grants`` propagates to a directly-assigned member
via ResourceGrant.principal_type='role'.
- ``test_role_as_principal_grant_via_group_assignment``: same path
but transitively (user → group → role → role-attached resource grant).
- ``test_view_and_view_schema_are_implicit_on_any_grant``: holding only
``manage_instructions`` on a DS lets the user GET /data_sources/{id}
and /full_schema, which require ``view`` / ``view_schema``.
- ``test_view_and_view_schema_not_explicit_resource_perms``: registry
guard asserting both strings stay out of RESOURCE_PERMISSIONS and
RESOURCE_SCOPED_GROUPS.
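The resolver paths exercised by these tests can be pictured with a toy model. This sketch is not permission_resolver: the `Directory` structure, its field names, and `resource_permissions` are hypothetical, and it covers only the user → group → role → resource-grant chain plus the implicit ``view`` / ``view_schema`` rule described above.

```python
from dataclasses import dataclass, field

# Per the commit message, these are implied by holding any grant on a DS.
IMPLICIT_ON_ANY_GRANT = {"view", "view_schema"}


@dataclass
class Directory:
    """Hypothetical in-memory stand-in for the RBAC tables."""
    user_groups: dict = field(default_factory=dict)      # user -> set of groups
    principal_roles: dict = field(default_factory=dict)  # (type, id) -> set of roles
    grants: dict = field(default_factory=dict)           # (type, id, ds_id) -> set of perms


def resource_permissions(directory, user, ds_id):
    """Union the grants reachable from a user directly, via groups, and via roles."""
    principals = {("user", user)}
    principals |= {("group", g) for g in directory.user_groups.get(user, set())}

    # Roles can be assigned to the user directly or to any of the user's groups.
    roles = set()
    for principal in principals:
        roles |= directory.principal_roles.get(principal, set())
    principals |= {("role", r) for r in roles}

    perms = set()
    for ptype, pid in principals:
        perms |= directory.grants.get((ptype, pid, ds_id), set())

    # Any explicit grant on the data source implies view / view_schema.
    return (perms | IMPLICIT_ON_ANY_GRANT) if perms else perms


directory = Directory(
    user_groups={"ana": {"analysts"}},
    principal_roles={("group", "analysts"): {"curator"}},
    grants={("role", "curator", "ds1"): {"manage_instructions"}},
)
via_group_role = resource_permissions(directory, "ana", "ds1")
no_grant = resource_permissions(directory, "ana", "ds2")
```

Here the user holds no direct grant on `ds1`; the permission arrives transitively through the group's role, and the two implicit permissions ride along with it.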
Existing test fixup
-------------------
``test_rbac_complex_roles.py`` arrived from the upstream merge with
three dangling ``assert in perms["permissions"]`` lines (syntax error)
left over from a stale find/replace that removed ``view_instructions``
without cleaning up the assertion sites. Removed the dangling lines so
the file parses again. The corresponding tests now skip cleanly when
no enterprise license is present (matching the pre-existing pattern).
Suite size: 39 RBAC tests (was 35 + 1 xfail), all passing under
``TESTING=true pytest -m e2e --db=sqlite tests/e2e/rbac/``. The
existing tests/e2e/test_rbac*.py + test_eval.py also pass with no
regressions (36 passed, 20 skipped for enterprise license).
https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
…skipping
Adds backend/tests/e2e/conftest.py with a session-scoped autouse
fixture that flips the license cache to "enterprise active" for the
entire e2e session. Without this, every existing RBAC test that
probed for a real license (test_rbac.py, test_rbac_complex_roles.py,
test_rbac_policies.py) was skipping the enterprise codepaths in CI.
Approach: directly mutate ee_license._cached_license and
_cache_initialized once per session — get_license_info(),
has_feature() and is_enterprise_licensed() all read from those globals,
so a single session-level swap covers every entry point without
per-test monkeypatch overhead.
Effect on the existing suite (under sqlite, no real license key):
- test_rbac.py: was 14 → now 15 passing (Reports test was already fixed in c349884)
- test_rbac_complex_roles.py: was 1 + 16 skipped → now 15 passing + 2 author-flagged "post-MVP" skips
- test_rbac_policies.py: was 0 + 15 skipped → now 9 passing + 6 author-flagged "post-MVP" skips
The remaining 8 skips across complex_roles + policies are explicit
@pytest.mark.skip(reason="post-MVP: ...") markers that the original
test author placed for future view_evals/run_evals split work — they
are not enterprise-license skips and should not be touched.
Also removed the defensive ``if status_code == 402: pytest.skip(...)``
guards from tests/e2e/rbac/test_rbac_role_principals.py — with the
session-level enterprise stub in place, a 402 from the role/group
routes is now a real failure that should fail loudly, not silently
skip.
https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
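The cache swap described in this commit can be sketched without pytest as a context manager. Everything here is a stand-in: the `ee_license` object below is a `SimpleNamespace` imitating the two module globals named above, the license dict shape is assumed, and restoring the old values on exit is an addition of this sketch rather than something the commit states.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the real ee_license module: just the two cache globals.
ee_license = SimpleNamespace(_cached_license=None, _cache_initialized=False)


def is_enterprise_licensed(module=ee_license):
    """Toy reader: like the real entry points, it only consults the cache."""
    license_info = module._cached_license
    return bool(module._cache_initialized and license_info and license_info.get("active"))


@contextmanager
def fake_enterprise_license(module=ee_license):
    """Swap the cached license for an 'enterprise active' stub, then restore it."""
    saved = (module._cached_license, module._cache_initialized)
    module._cached_license = {"tier": "enterprise", "active": True}  # assumed shape
    module._cache_initialized = True
    try:
        yield
    finally:
        module._cached_license, module._cache_initialized = saved


before = is_enterprise_licensed()
with fake_enterprise_license():
    during = is_enterprise_licensed()
after = is_enterprise_licensed()
```

In a conftest.py the same body would sit inside a generator function decorated with `@pytest.fixture(scope="session", autouse=True)`, so the swap happens once before the first e2e test and is undone after the last.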
All 12 Copilot comments were legitimate; applied every one.
1. Unasserted PUTs to flip is_public (9 sites)
Moved the "make DS private" dance into the sqlite_data_source
fixture itself and asserted the PUT returns 200 and the flipped
body reflects the requested value. Callers now pass
``is_public=False`` (the default) or ``is_public=True`` at creation
time instead of duplicating an unasserted ``test_client.put(...)``
block. Removes the "silent public DS" failure mode Copilot flagged
— if the flip ever regresses, every RBAC fixture fails loudly with
a clear message instead of quietly passing for the wrong reason.
2. Unused ``org_service`` import in rbac/conftest.py
Leftover from an earlier iteration that was going to monkey-patch
the organization service; the fake-license fixture never ended up
needing it. Removed.
3. Missing status_code assertion in test_registry_hides_reports_category
The test called ``resp.json()`` without first asserting 200. Added
``assert resp.status_code == 200, resp.text`` so an auth or route
regression surfaces as a clear status-code mismatch instead of an
opaque JSON decode error.
4. Tautological pagination assertion in test_list_builds_status_filter
Replaced
``body["total"] == sum(1 for _ in items) or body["total"] >= len(items)``
with the honest paginated invariant
``len(items) <= body["total"] and len(items) >= 1``.
The previous form collapsed to ``body["total"] >= len(items)``, since
the right-hand ``>=`` holds whenever the left-hand equality does, so
the equality added nothing and an empty page still passed.
Suite still passes clean: 39 passed, 0 skipped, 0 xfail under
``TESTING=true pytest -m e2e --db=sqlite tests/e2e/rbac/``.
https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
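The pagination assertion from item 4 can be checked mechanically. The two functions below restate the old and new expressions from the commit message as plain predicates (the names `old_check` and `new_check` and the sample cases are made up for this sketch) and compare them over a few pages.

```python
def old_check(total, items):
    """Original assertion: an equality OR-ed with a weaker inequality."""
    return total == sum(1 for _ in items) or total >= len(items)


def new_check(total, items):
    """Replacement: the page fits within the total and is not empty."""
    return len(items) <= total and len(items) >= 1


cases = [
    (0, []),          # empty page
    (3, [1, 2, 3]),   # single full page
    (10, [1, 2]),     # one page of a larger result set
    (1, [1, 2]),      # more items than the reported total
]

old_results = [old_check(total, items) for total, items in cases]
new_results = [new_check(total, items) for total, items in cases]

# The old expression is equivalent to the bare inequality on every case.
same_as_inequality = all(
    old_check(total, items) == (total >= len(items)) for total, items in cases
)
```

The only case where the two predicates disagree is the empty page, which the old form accepted and the new one rejects.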
Claude/test rbac 0.0.355 ub8s8
…tion-hxKBp Add WhatsApp Cloud API integration
No description provided.