0.0.355 by yochze · Pull Request #182 · bagofwords1/bagofwords

yochze · 2026-04-05T18:48:02Z

No description provided.

…dal detail

Adds visibility into which instructions are loaded during agent completions:

- Backend emits `instructions.context` SSE event after context build and after tools load related instructions
- Chat shows lightweight "N instructions loaded" indicator with link to trace modal
- Trace modal gets Instructions left-pane item with full table (title, category, load mode, reason, type)
- Tool detail in trace modal shows instructions loaded by that specific tool

https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn

…le everywhere

- Persist loaded_instructions in completion JSON field so indicator survives page refresh
- Add loaded_instructions to CompletionV2Schema and hydrate from completion_service
- Remove 'view' link from chat indicator, switch to cube icon
- TraceModal: replace full table with compact collapsible list with always/intelligent counts
- TraceModal: tool-loaded instructions now collapsible with chevron toggle
- DescribeTablesTool: show "N instructions loaded" with expandable list inline

https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn

- Move "N instructions" indicator from top of completion to footer row (after thumbs up/down, before debug button)
- DescribeTablesTool: move instructions inside the table list as last <li>, same expand/collapse pattern as tables, collapsed by default

https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn

- Replace static "N instructions" text with UPopover that shows list on click
- Each instruction row shows title, load mode badge (always/intelligent)
- Clicking an instruction row fetches full instruction and opens InstructionModalComponent
- Reuses existing showTrainingInstructionModal pattern

https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn

SQLAlchemy doesn't detect in-place mutations on JSON columns. Added flag_modified() call so the loaded_instructions data actually gets committed to DB when update_completion_status runs later. https://claude.ai/code/session_0181SQjfnCYLV799NeHWtsLn

…e-I2wUE Add instructions loading visibility to trace modal and reports

Replaces the post-loop SuggestInstructions LLM-call with an agentic sub-loop that runs after the main agent loop completes when trigger conditions fire. The harness can search existing instructions, verify facts via inspect_data/describe_tables, and decide whether to create or edit instructions.

- New search_instructions tool (knowledge + training modes) for discovering existing instructions before creating duplicates.
- New "knowledge" mode + _build_knowledge_prompt with reflection framing, trigger reasons injection, and tight iteration budget.
- _run_knowledge_harness sub-loop in agent_v2 (max 5 steps) that spins up a knowledge-mode planner, runs tool calls via the existing ToolRunner, and streams instructions.suggest.partial events.
- Instructions land in a draft AI build that is submitted for review (matches the previous suggestion semantics; admin still approves).
- Trigger conditions are formatted into a <trigger_conditions> block injected into the harness prompt as hints.
- create_instruction / edit_instruction allowed_modes expanded to include "knowledge".
- Removed _stream_suggestions_inline from agent_v2 (replaced).

https://claude.ai/code/session_01KnqwhvsU9FANr3tjMfRi2Q
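The bounded sub-loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent_v2: `run_knowledge_harness`, `plan_next`, and `run_tool` are hypothetical names, and only the 5-step budget and the `instructions.suggest.partial` event name come from the commit message.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

MAX_STEPS = 5  # iteration budget named in the commit message

Action = Dict[str, Any]


def run_knowledge_harness(
    plan_next: Callable[[List[str], List[Tuple[Action, Any]]], Optional[Action]],
    run_tool: Callable[[Action], Any],
    trigger_reasons: List[str],
    max_steps: int = MAX_STEPS,
) -> Iterator[Dict[str, Any]]:
    """Run a planner/tool loop for at most max_steps, yielding one event per step.

    plan_next and run_tool are hypothetical stand-ins for the knowledge-mode
    planner and the existing tool runner.
    """
    history: List[Tuple[Action, Any]] = []
    for step in range(1, max_steps + 1):
        # The planner sees the trigger reasons (as hints) plus everything done so far.
        action = plan_next(trigger_reasons, history)
        if action is None:
            # The planner decided there is nothing (more) worth recording.
            return
        result = run_tool(action)
        history.append((action, result))
        yield {
            "event": "instructions.suggest.partial",
            "step": step,
            "action": action,
            "result": result,
        }
```

Passing the accumulated history back to the planner on every step is what lets it search first and only then decide whether to create or edit; the hard step cap keeps the reflection pass cheap even if the planner never stops on its own.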

Rbac2

The new RBAC permission registry uses 'manage_llm', not the legacy 'manage_llm_settings' string. The settings layout also gated the LLM tab on a non-existent 'modify_settings' permission, hiding the tab from members entirely.

- Rename manage_llm_settings -> manage_llm in LLMsComponent so write controls (Integrate Models button, Actions column, toggle) are properly hidden from members.
- Allow the LLM settings tab to render for any authenticated user; intra-page controls remain gated by manage_llm.
- Drop the stale page-level permission gate on /settings/models.

Fixes the playwright visibility test 'member can see LLM tab (read-only)'.

This reverts commit 5bf9e1e.

Members lack modify_settings, so the LLM tab is intentionally hidden from the settings layout. Update the visibility spec to reflect that the tab should not be rendered, instead of expecting a read-only view.

…ests-VDtJj feat: rbac

Mirrors the Slack/Teams external-platform pattern: a WhatsAppAdapter subclass of PlatformAdapter, registered in the factory; WhatsAppConfig schema; create_whatsapp_platform service with Meta Graph validation; POST /settings/integrations/whatsapp config route; GET/POST /api/settings/integrations/whatsapp/webhook with X-Hub-Signature-256 verification, phone_number_id-based tenant routing, and id-based dedupe.

Adapter parses Cloud API webhooks, sends text replies with context threading, uploads media in two steps, and maps Slack-style emoji reactions to unicode for the manager's processing indicator.

Frontend adds a WhatsApp card and modal to settings/integrations.

Tests: 21 unit tests (adapter parsing, signature verify, send/reaction/media mocked via httpx MockTransport, and full webhook route behaviour incl. handshake, bad-signature, dedupe, status-only, non-text, dispatch).

Sandbox: backend/scripts/whatsapp_sandbox_debug.py boots the real adapter + route against a mock Meta Graph server and replays a full scenario (handshake, unverified->verified, threaded reply, status-only, dedupe).

https://claude.ai/code/session_01XtJkarHadpmRsGU5quSYrY
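For readers unfamiliar with X-Hub-Signature-256, a minimal sketch of the check follows. Meta signs the raw request body with HMAC-SHA256 keyed by the app secret and sends it as `sha256=<hex digest>`. The function name `verify_signature` and its parameters are illustrative assumptions, not the code in this PR.

```python
import hashlib
import hmac
from typing import Optional

SIGNATURE_PREFIX = "sha256="


def verify_signature(app_secret: str, raw_body: bytes, header_value: Optional[str]) -> bool:
    """Return True if header_value is a valid X-Hub-Signature-256 for raw_body.

    Hypothetical helper: the signature must be computed over the exact bytes
    received, before any JSON parsing or re-serialization.
    """
    if not header_value or not header_value.startswith(SIGNATURE_PREFIX):
        return False
    received = header_value[len(SIGNATURE_PREFIX):]
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Verifying against the raw bytes rather than a parsed payload matters because any whitespace or key-order change after parsing would alter the digest.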

Adds backend/tests/e2e/rbac/: seven test modules totalling 35 tests covering the registry, every CRUD/visibility path on data sources, instructions, builds, entities, evals, and the resolver paths in permission_resolver._resolve_permissions_inner.

Why
---
PR 182 introduced the new RBAC primitives (roles, role assignments, groups, resource grants, registry, resolver) but the existing e2e coverage only spot-checked individual endpoints. We needed a tight matrix that exercises real users + per-DS grants against the live FastAPI app to lock in behaviour and surface any drift quickly.

Test files
----------
- test_rbac_registry.py: pure static parity. AST-walks routes/*.py for every @requires_permission and check_resource_permissions call and asserts the literal permission strings exist in the registry. Catches the manage_tests / modify_settings class of bug instantly.
- test_rbac_data_sources.py: detail-level access matrix + list/detail-invariant in one shot.
- test_rbac_instructions.py: per-DS create_instructions matrix, owner vs admin vs other edit branches, list visibility filtering.
- test_rbac_builds.py: list endpoint is org-only, publish/submit/etc. enforce per-DS create_instructions on every touched DS.
- test_rbac_entities.py: forward list/detail invariant for entities, the recently fixed bug class.
- test_rbac_evals.py: suite endpoints are strictly org-level, case create endpoint is resource_scoped + per-DS gated.
- test_rbac_role_principals.py: walks every resolver path: user direct role, group → role, user resource grant, group resource grant, full_admin_access bypass, and assignment-removal freshness.

Findings (backend code changes)
-------------------------------
1. routes/test.py: both create_case and update_case called check_resource_permissions(..., 'create_evals'), but 'create_evals' is not in permissions_registry.RESOURCE_PERMISSIONS. The check therefore never matched anything for non-admins, locking per-DS evals authors out of any DS-scoped case creation. Changed both call sites to 'manage_evals' (the canonical resource permission, already in the registry and ORG_PERM_IMPLIES_RESOURCE). The new test_rbac_registry.py::test_check_resource_permissions_uses_known_resource_perms walks check_resource_permissions calls in routes/ and would have failed CI on the original code; the test is now the regression guard for this drift class. The existing test_rbac_complex_roles.py::test_eval_case_mixed_ds_list_denied test still passes; the role under test only holds 'run_evals' so denial still applies, just for the right reason now.
2. tests/e2e/test_rbac.py::test_permissions_registry_endpoint: was asserting "Reports" appears in the categories response, but Reports moved to HIDDEN_PERMISSION_CATEGORIES on purpose so the role-editor UI doesn't render meaningless checkboxes for it. Updated the assertion to match current behaviour. (Pre-existing failure on PR 182, unrelated to our fixture/test additions.)

Known bugs surfaced as xfail
----------------------------
- test_rbac_data_sources.py::test_data_source_grant_appears_in_list
  data_source_service.get_data_sources filters the LIST by the legacy DataSourceMembership table only; it ignores ResourceGrant rows. A user with a per-DS RBAC grant but no DataSourceMembership opens the DS in detail (resolver path) but never sees it in the list. Marked strict xfail with full repro context; lift the marker once the service-layer filter is unioned with ResourceGrant.

Out of scope
------------
- Frontend Playwright tests (this branch is backend only)
- Performance/load testing the resolver
- Migration changes
- New RBAC features

https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
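The static parity check described for test_rbac_registry.py can be sketched with the standard `ast` module. This is an illustrative reconstruction under assumptions: the `REGISTRY` contents, the inline `SOURCE` sample, the argument order of `check_resource_permissions`, and the helper names are all hypothetical, and the real test walks routes/*.py rather than a string.

```python
import ast
from typing import List, Optional, Tuple

# Hypothetical registry contents, for illustration only.
REGISTRY = {"manage_evals", "run_evals", "manage_instructions"}

# Hypothetical route module; the real test reads routes/*.py from disk.
SOURCE = '''
@requires_permission("manage_evals")
async def create_suite():
    ...

async def create_case(db, user, ds_ids):
    await check_resource_permissions(db, user, ds_ids, "create_evals")
'''

GUARDED_CALLS = {"requires_permission", "check_resource_permissions"}


def _call_name(node: ast.Call) -> Optional[str]:
    """Return the bare function name for `f(...)` or `mod.f(...)` calls."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def collect_permission_literals(source: str) -> List[Tuple[str, str]]:
    """Collect (call name, string literal) pairs from guarded calls and decorators."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _call_name(node) in GUARDED_CALLS:
            values = list(node.args) + [kw.value for kw in node.keywords]
            for value in values:
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    found.append((_call_name(node), value.value))
    return found


def unknown_permissions(source: str, registry: set) -> List[str]:
    """Permission literals used in source that the registry does not define."""
    return sorted({perm for _, perm in collect_permission_literals(source) if perm not in registry})
```

On the sample above, `unknown_permissions(SOURCE, REGISTRY)` reports the unregistered `create_evals` string, which is the drift class the commit message says the real test now guards against. Note that this sketch treats every string literal argument as a permission, which is only safe if the guarded calls take no other string arguments.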

The sandbox feedback loop doc tells contributors to set up a Python 3.12 venv at backend/.venv; add it to .gitignore so it never gets accidentally committed alongside test fixture work. https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce

…nstructions, role grants

Reflects the new RBAC semantics introduced in the latest base-branch ``rbac improvements`` commit:

- ``view`` / ``view_schema`` are now implicit on any DS grant; removed from RESOURCE_PERMISSIONS and from every test fixture's grant payload.
- ``create_instructions`` (resource-level) is renamed to ``manage_instructions``; updated all grant payloads, route assertions, and route enforcement tests.
- /api/builds GET endpoints are no longer admin-only; they apply a per-DS filter via ``get_accessible_data_source_ids``. Updated test_rbac_builds.py to expect 200 + filtered list for non-admins.
- ``data_source_service.get_data_sources`` now unions ResourceGrant with the legacy DataSourceMembership table, fixing the list/detail invariant bug captured by the previous xfail. Removed the xfail marker from ``test_data_source_grant_appears_in_list``; it now asserts forward correctness as a regression guard.

Added tests for the new resolver paths:

- ``test_role_as_principal_resource_grant``: a custom role created with inline ``resource_grants`` propagates to a directly-assigned member via ResourceGrant.principal_type='role'.
- ``test_role_as_principal_grant_via_group_assignment``: same path but transitively (user → group → role → role-attached resource grant).
- ``test_view_and_view_schema_are_implicit_on_any_grant``: holding only ``manage_instructions`` on a DS lets the user GET /data_sources/{id} and /full_schema, which require ``view`` / ``view_schema``.
- ``test_view_and_view_schema_not_explicit_resource_perms``: registry guard asserting both strings stay out of RESOURCE_PERMISSIONS and RESOURCE_SCOPED_GROUPS.

Existing test fixup
-------------------
``test_rbac_complex_roles.py`` arrived from the upstream merge with three dangling ``assert in perms["permissions"]`` lines (syntax error) left over from a stale find/replace that removed ``view_instructions`` without cleaning up the assertion sites. Removed the dangling lines so the file parses again. The corresponding tests now skip cleanly when no enterprise license is present (matching the pre-existing pattern).

Suite size: 39 RBAC tests (was 35 + 1 xfail), all passing under ``TESTING=true pytest -m e2e --db=sqlite tests/e2e/rbac/``. The existing tests/e2e/test_rbac*.py + test_eval.py also pass with no regressions (36 passed, 20 skipped for enterprise license).

https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
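The resolver paths and the implicit ``view`` / ``view_schema`` rule can be pictured with a small in-memory model. This is only a sketch under assumptions: the `Directory` structure, `resolve_ds_permissions`, and the field names are hypothetical; the real logic lives in the permission resolver and reads from the database. Only the principal types and permission strings are taken from the commit message.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set

# Per the commit message, these are implied by holding any grant on a data source.
IMPLICIT_ON_ANY_GRANT = frozenset({"view", "view_schema"})


@dataclass(frozen=True)
class ResourceGrant:
    principal_type: str  # "user", "group", or "role"
    principal_id: str
    data_source_id: str
    permissions: FrozenSet[str]


@dataclass
class Directory:
    """Hypothetical membership lookup tables."""
    user_groups: Dict[str, Set[str]] = field(default_factory=dict)
    user_roles: Dict[str, Set[str]] = field(default_factory=dict)
    group_roles: Dict[str, Set[str]] = field(default_factory=dict)


def resolve_ds_permissions(
    user_id: str, data_source_id: str, grants: Iterable[ResourceGrant], directory: Directory
) -> Set[str]:
    """Union the permissions from every grant that reaches the user on one data source."""
    groups = directory.user_groups.get(user_id, set())
    roles = set(directory.user_roles.get(user_id, set()))
    for group_id in groups:
        # Transitive path: user -> group -> role.
        roles |= directory.group_roles.get(group_id, set())

    principals = {("user", user_id)}
    principals |= {("group", g) for g in groups}
    principals |= {("role", r) for r in roles}

    resolved: Set[str] = set()
    for grant in grants:
        if grant.data_source_id != data_source_id:
            continue
        if (grant.principal_type, grant.principal_id) in principals:
            resolved |= grant.permissions | IMPLICIT_ON_ANY_GRANT
    return resolved
```

For example, a user whose only link to a data source is a group that holds a role with a ``manage_instructions`` grant would resolve to ``manage_instructions`` plus the two implicit permissions, while a user with no matching grant resolves to an empty set.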

…skipping

Adds backend/tests/e2e/conftest.py with a session-scoped autouse fixture that flips the license cache to "enterprise active" for the entire e2e session. Without this, every existing RBAC test that probed for a real license (test_rbac.py, test_rbac_complex_roles.py, test_rbac_policies.py) was skipping the enterprise codepaths in CI.

Approach: directly mutate ee_license._cached_license and _cache_initialized once per session. get_license_info(), has_feature() and is_enterprise_licensed() all read from those globals, so a single session-level swap covers every entry point without per-test monkeypatch overhead.

Effect on the existing suite (under sqlite, no real license key):

  test_rbac.py                 was 14 → now 15 passing (Reports test was already fixed in c349884)
  test_rbac_complex_roles.py   was 1 + 16 skipped → now 15 passing + 2 author-flagged "post-MVP" skips
  test_rbac_policies.py        was 0 + 15 skipped → now 9 passing + 6 author-flagged "post-MVP" skips

The remaining 8 skips across complex_roles + policies are explicit @pytest.mark.skip(reason="post-MVP: ...") markers that the original test author placed for future view_evals/run_evals split work; they are not enterprise-license skips and should not be touched.

Also removed the defensive ``if status_code == 402: pytest.skip(...)`` guards from tests/e2e/rbac/test_rbac_role_principals.py. With the session-level enterprise stub in place, a 402 from the role/group routes is now a real failure that should fail loudly, not silently skip.

https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce

All 12 Copilot comments were legitimate; applied every one.

1. Unasserted PUTs to flip is_public (9 sites)
   Moved the "make DS private" dance into the sqlite_data_source fixture itself and asserted the PUT returns 200 and the flipped body reflects the requested value. Callers now pass ``is_public=False`` (the default) or ``is_public=True`` at creation time instead of duplicating an unasserted ``test_client.put(...)`` block. Removes the "silent public DS" failure mode Copilot flagged: if the flip ever regresses, every RBAC fixture fails loudly with a clear message instead of quietly passing for the wrong reason.
2. Unused ``org_service`` import in rbac/conftest.py
   Leftover from an earlier iteration that was going to monkey-patch the organization service; the fake-license fixture never ended up needing it. Removed.
3. Missing status_code assertion in test_registry_hides_reports_category
   The test called ``resp.json()`` without first asserting 200. Added ``assert resp.status_code == 200, resp.text`` so an auth or route regression surfaces as a clear status-code mismatch instead of an opaque JSON decode error.
4. Tautological pagination assertion in test_list_builds_status_filter
   Replaced ``body["total"] == sum(1 for _ in items) or body["total"] >= len(items)`` with the honest paginated invariant ``len(items) <= body["total"] and len(items) >= 1``. The previous form was always True because the RHS covered every case the LHS might fail.

Suite still passes clean: 39 passed, 0 skipped, 0 xfail under ``TESTING=true pytest -m e2e --db=sqlite tests/e2e/rbac/``.

https://claude.ai/code/session_01LVmq3ikLQFiS6Zs1fqb8Ce
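To see why the old pagination assertion in item 4 could not catch a regression, here is a small sketch comparing the two forms. The function names `old_invariant` and `new_invariant` are hypothetical wrappers around the expressions quoted in the commit message.

```python
from typing import List


def old_invariant(total: int, items: List[object]) -> bool:
    # `total == n or total >= n` collapses to `total >= n`:
    # the equality branch adds no constraint of its own.
    return total == sum(1 for _ in items) or total >= len(items)


def new_invariant(total: int, items: List[object]) -> bool:
    # The page must be non-empty and no larger than the reported total.
    return len(items) <= total and len(items) >= 1


def old_form_is_redundant(max_n: int = 5) -> bool:
    """Check over small inputs that the old form equals `total >= len(items)`."""
    return all(
        old_invariant(total, [None] * n) == (total >= n)
        for total in range(max_n + 1)
        for n in range(max_n + 1)
    )
```

An empty page with a non-zero total (for example a status filter that silently returns nothing) satisfies `old_invariant` but fails `new_invariant`, which is the kind of regression the replacement is meant to surface.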

Claude/test rbac 0.0.355 ub8s8

…tion-hxKBp Add WhatsApp Cloud API integration

yochze and others added 30 commits March 23, 2026 23:08

feat: rbac

3fe4a3a

feat: rbac

c2c8152

feat: rbac

dbdc86b

rbac: set custom permissions on data sources

b4fdd83

rbac: set custom permissions on data sources

713f347

feat: ldap integration

9242182

ldap

7f3d566

Merge branch 'main' into rbac2

253f850

rbac wip

c589d18

permissions and rbac

4441a56

main merge

239beb5

training mode improvements

24cbe82

side bar in report

de9a2ca

side bar in report

27ea218

instructions indication in report page

8e5837f

faster instructions mgmt

20e2942

faster instructions mgmt

c4980f9

Merge pull request #181 from bagofwords1/claude/show-instruction-usag…

79a95a5

…e-I2wUE Add instructions loading visibility to trace modal and reports

side bar in report

2647830

UI improvements

a7b82b3

UI improvements

03b6672

new: knowledge harness

1123ff2

new: knowledge harness

ee2e575

improve agent knowledge harness

38501bd

yochze and others added 28 commits April 6, 2026 23:53

rbac: resolve connection and mcp tools authorization

b041f9c

Merge pull request #158 from bagofwords1/rbac2

59f78eb

Rbac2

Revert "fix(rbac): make LLM settings tab visible to members (read-only)"

d5942f9

This reverts commit 5bf9e1e.

test(rbac): assert members cannot see LLM settings tab

d24f415

Merge pull request #188 from bagofwords1/claude/fix-playwright-rbac-t…

bdabd25

…ests-VDtJj feat: rbac

bechmark: spider data

6451c21

bechmark: spider data

425a301

spider results

c765982

spider results

9358d7c

rbac ui improvements

b824acc

make policy name required

5a01fa3

rbac improvements

ab7c677

chore: gitignore backend/.venv

b0ac065

Merge branch 'pr-182' into claude/test-rbac-0.0.355-ub8s8

9fc5ce8

Merge pull request #190 from bagofwords1/claude/test-rbac-0.0.355-ub8s8

23926c5

Claude/test rbac 0.0.355 ub8s8

rbac improvements

fdb5652

Merge pull request #191 from bagofwords1/claude/plan-whatsapp-integra…

51ca3fe

…tion-hxKBp Add WhatsApp Cloud API integration

rbac improvements

5d633de

rbac improvements

4c3b250

whatsapp icon

95713b5

release notes

cb7a5ab

deprecate older gen gpt models

e2507be

yochze merged commit 719010a into main Apr 10, 2026
3 of 5 checks passed

yochze deleted the 0.0.355 branch April 11, 2026 18:36

0.0.355 #182
yochze merged 83 commits into main from 0.0.355

yochze commented Apr 5, 2026

3 participants
