52-DA-space-cleanup

EVA Ecosystem Integration

| Tool | Purpose | How to Use |
| --- | --- | --- |
| 37-data-model | Single source of truth for all project entities | GET https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io/model/projects/52-DA-space-cleanup |
| 29-foundry | Agentic capabilities (search, RAG, eval, observability) | C:\eva-foundry\eva-foundation\29-foundry |
| 48-eva-veritas | Trust score and coverage audit | MCP tool: audit_repo / get_trust_score |
| 07-foundation-layer | Copilot instructions primer + governance templates | MCP tool: apply_primer / audit_project |

Agent rule: Query the data model API before reading source files.

$base = $env:DATA_MODEL_BASE
if (-not $base -or $base -eq '') {
  $base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
}
Invoke-RestMethod "$base/model/agent-guide"
Invoke-RestMethod "$base/model/projects/52-DA-space-cleanup"

Live Client Runtime

The approved live client for this project will run against an EsDAICoESub dev instance.

Use the same Azure identity and subscription context on the next device, and keep the live client target configurable so it can point at either the EsDAICoESub or EspAICoESub dev instance without changing the code scaffold.


Domain Assistant Project Space Manager

Version: 0.3.0
Phase: 2 -- Space Admin CLI + Governed Decommission Design
Status: IN PROGRESS
Last Updated: 2026-03-24 07:02 ET

Proposal artifact for presentation: docs/20260324_060800-proposal-project-52-phased-delivery.md


Purpose

Investigate and map how the EVA Domain Assistant (InfoAssistant, instance hccld2) stores and manages user groups, roles, and project space data, then deliver two interoperable tools.

Client Commitment Baseline

This project is the implementation surface for the commitment made to the client team on 2026-03-23:

  • deliver fast provisioning and decommissioning support for EVA DA Accelerator project spaces;
  • prove the tool first in the dev subscription and let the client team decide next steps after validation;
  • make all destructive or consequential operations explainable through dry-run, logging, evidence, and reporting;
  • keep the PR destination for the client repo as a separate governance decision pending client direction.

Strategic Relationship To Project 75

Project 52-DA-space-cleanup has a distinct role relative to Project 75:

  • It is not a PoC lab experiment for 75-EVA-vNext.
  • It is a workshop-style application of the Project 75 factory and control-plane patterns to maintain an existing EVA Domain Assistant estate.
  • Project 75 provides the factory doctrine, memory, governance, and service-packaging inspiration.
  • Project 52 proves that the same chassis can be used to service an already-running EVA system, not only new greenfield products.

Tool 1 -- eva-space CLI (Python)

Governed command-line tool for DA Space Administrators:

  1. init -- create a new project space (existing Entra group + upload container + content container + AI Search index + Cosmos GroupResourceMap entry)
  2. remove -- decommission or recycle a project space safely (dry-run first, --confirm gate, member cleanup plan, DA-supported file deletion workflow, status-history purge, structured audit receipt)
  3. export -- dump all space configurations to JSON or CSV (Excel-compatible)
  4. import -- apply a CSV/JSON config snapshot back to live state (diff preview + dry-run)
  5. restore -- restore a full GroupResourceMap snapshot from a saved config file

Tool 2 -- DA Space Admin Plugin (React + FastAPI)

Embedded panel inside https://infoasst-web-hccld2.azurewebsites.net served to users in the DA Space Administrator Entra group (admin RBAC tier):

  1. Spaces list -- table of all project spaces: slot name, Entra group, role, user count, status
  2. Space detail -- slot info, users in the group (from Entra Graph), resource links
  3. Save / Restore -- export config (CSV + JSON), import with diff preview, one-click restore
  4. Clean space -- dry-run report, explicit confirm dialog, real-time progress, audit receipt
  5. Audit log viewer -- history of all past init/remove/restore operations
  6. CSV export -- Excel-compatible download for group assignment maintenance (edit offline, re-import)
  7. EN/FR i18n toggle; WCAG 2.1 AA accessible

Source Under Analysis

| Item | Value |
| --- | --- |
| Source repo (read-only) | C:\eva-foundry\EVA-JP-v1.2 |
| Live instance | https://infoasst-web-hccld2.azurewebsites.net/index.html |
| Azure instance suffix | hccld2 |
| Resource Group | infoasst-hccld2 (EsDAICoESub) |
| Subscription | EsDAICoESub (d2d4e571-e0f2-4f6c-901a-f88f7669bcba) |

CONSTRAINT: C:\eva-foundry\EVA-JP-v1.2 must never be modified. All analysis is read-only.

ADO Coordination Surface

Project 52 now carries two distinct Azure DevOps references:

  • coordination project: EVA-Jurisprudence, with repo EVA-DA on branch jp-dev;
  • read-only client/source repo: EVA-Jurisprudence-SecMode-Info-Assistant-v1.2 in EVA - Portal.

The project-local operator script scripts/sync-esdc-ado-to-data-model.ps1 is the bounded export path for the visible ESDC ADO org inventory plus the Project 52 WBS and Agile seed. It reads the PAT from Key Vault, writes evidence under evidence/, saves a debug inventory snapshot under debug/, and updates the live Data Model with idempotent create-or-update behavior.

Current packaging direction: keep the workflow project-local while the schema stabilizes, then promote the generic provider-inventory sync path into an MCP service rather than a repo-only skill because the surface spans multiple providers, accounts, and live-system writebacks.


Architecture Summary (Discovery Phase Results)

Groups and Roles

The Domain Assistant uses Entra ID (AAD) groups as the primary RBAC unit. Each project space maps to exactly one Entra group with one of three roles:

  • admin -- group name contains admin; upload + read + manage
  • contributor -- group name contains contributor; upload + read
  • reader -- group name contains reader; read-only RAG access
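The name-based convention above can be sketched as a small lookup. This is an illustration only: `role_from_group_name` is a hypothetical helper, and case-insensitive matching is an assumption, since the source only says the group name "contains" the keyword.

```python
from typing import Optional

# Role keywords in descending privilege order, as listed above.
ROLE_KEYWORDS = ("admin", "contributor", "reader")


def role_from_group_name(group_name: str) -> Optional[str]:
    """Return the first role keyword found in the group name, or None.

    Hypothetical helper. Case-insensitive matching is an assumption.
    """
    lowered = group_name.lower()
    for role in ROLE_KEYWORDS:
        if role in lowered:
            return role
    return None
```

Checking keywords in privilege order means a name that happens to contain two keywords resolves to the more privileged one; whether the Domain Assistant behaves the same way is not confirmed here.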

Group-to-resource mappings are stored in a Cosmos DB container called groupResourcesMapContainer (env var: COSMOSDB_CONTAINER_GROUP_MAP, database: COSMOSDB_DATABASE_GROUP_MAP). Each document has this shape:

{
  "group_id": "<entra-group-object-id>",
  "group_name": "<human-readable-name>",
  "upload_storage": {
    "upload_container": "<blob-upload-container-name>",
    "role": "<role-label>"
  },
  "blob_access": {
    "blob_container": "<blob-content-container-name>",
    "role_blob": "<role-label>"
  },
  "vector_index_access": {
    "index": "<ai-search-index-name>",
    "role_index": "<role-label>"
  }
}
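For tooling that reads these documents, a flat typed view can be easier to work with than the nested shape. The sketch below assumes only the field names shown above; `GroupResourceMapEntry` and `parse_entry` are hypothetical names, not identifiers from the EVA-JP-v1.2 source.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class GroupResourceMapEntry:
    """Flattened view of one GroupResourceMap document (hypothetical type)."""
    group_id: str
    group_name: str
    upload_container: str
    upload_role: str
    blob_container: str
    blob_role: str
    index: str
    index_role: str


def parse_entry(doc: Mapping[str, Any]) -> GroupResourceMapEntry:
    """Flatten one document; raises KeyError if an expected field is missing."""
    return GroupResourceMapEntry(
        group_id=doc["group_id"],
        group_name=doc["group_name"],
        upload_container=doc["upload_storage"]["upload_container"],
        upload_role=doc["upload_storage"]["role"],
        blob_container=doc["blob_access"]["blob_container"],
        blob_role=doc["blob_access"]["role_blob"],
        index=doc["vector_index_access"]["index"],
        index_role=doc["vector_index_access"]["role_index"],
    )
```

Failing loudly on a missing field is a deliberate choice for an audit tool: a partially populated entry is something an operator should see rather than have silently defaulted.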

User session preferences (the last selected DA group) are stored in a second Cosmos container used by UserProfile; its name comes from an environment variable equivalent to COSMOSDB_CONTAINER_USER_PROFILE.

RBAC resolution chain at runtime:

x-ms-client-principal header (App Service Easy Auth)
  -> decode_x_ms_client_principal()
  -> intersect user groups with groupResourcesMap group_ids
  -> priority: admin > contributor > reader
  -> returns controlling group_id
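The chain above can be approximated in a few lines. This is a sketch under stated assumptions, not the source's `decode_x_ms_client_principal()`: it assumes the header is Base64-encoded JSON with a `claims` array of `typ`/`val` pairs (the App Service Easy Auth format), that group memberships arrive as claims of type `groups`, and that each mapped group carries a single role label. The function names and the `group_roles` mapping are hypothetical.

```python
import base64
import json
from typing import Dict, List, Optional

# Highest privilege first, matching admin > contributor > reader.
ROLE_PRIORITY = ("admin", "contributor", "reader")


def groups_from_principal(header_value: str) -> List[str]:
    """Decode a Base64 JSON principal and return the values of its group claims."""
    principal = json.loads(base64.b64decode(header_value))
    return [c["val"] for c in principal.get("claims", []) if c.get("typ") == "groups"]


def controlling_group(user_groups: List[str], group_roles: Dict[str, str]) -> Optional[str]:
    """Return the mapped group_id with the highest-priority role, or None.

    group_roles maps group_id -> role label for every GroupResourceMap entry.
    """
    candidates = [gid for gid in user_groups if gid in group_roles]
    for role in ROLE_PRIORITY:
        for gid in candidates:
            if group_roles[gid] == role:
                return gid
    return None
```

A user who belongs to both a reader group and an admin group resolves to the admin group; a user with no mapped group resolves to `None`, which a caller would treat as no access.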

File Upload and Ingestion Pipeline

User uploads file (browser)
  -> POST to backend /upload endpoint
  -> Blob lands in: upload_container (per-group, e.g. "upload-<group-suffix>")
  -> Azure Function: FileUploadedEtrigger fires on BlobCreated event
      -> reads GroupResourceMap to resolve: upload_container -> content_container + index
      -> tags blob metadata (tags persisted to Cosmos status log)
      -> routes to queue based on file type:
           PDF                 -> pdf_submit_queue  -> FileFormRecSubmissionPDF
                                                    -> FileFormRecPollingPDF
           Non-PDF             -> non_pdf_submit_queue -> FileLayoutParsingOther
           Media (audio/video) -> media_submit_queue -> ImageEnrichment / TextEnrichment
           Authority doc       -> authority_doc_queue -> AuthorityDocProcessor
           EN/FR bilingual     -> en_fr_doc_processing -> EnFrTextExtractor
      -> creates Cosmos status log entry (COSMOSDB_LOG_DATABASE_NAME / COSMOSDB_LOG_CONTAINER)
  -> Enrichment functions process file, extract text, chunk, vectorize
  -> Chunks written to: content_container (blob, processed folder)
  -> Vectors + metadata indexed into: AI Search index (per-group)
  -> Status log updated each step (State: Processing -> Complete / Error)
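The routing step in the diagram can be captured as a lookup table, which is useful when verifying that a cleanup touched every surface a file may have passed through. Queue and handler names below are copied from the diagram; the `FileCategory` enum and `queue_for` helper are hypothetical, and how the trigger function actually classifies a file is not shown here.

```python
from enum import Enum
from typing import Dict, Tuple


class FileCategory(Enum):
    """Hypothetical labels for the five routing branches in the diagram."""
    PDF = "pdf"
    NON_PDF = "non_pdf"
    MEDIA = "media"
    AUTHORITY_DOC = "authority_doc"
    EN_FR_BILINGUAL = "en_fr_bilingual"


# category -> (queue name, handler functions), as listed in the diagram above.
ROUTES: Dict[FileCategory, Tuple[str, Tuple[str, ...]]] = {
    FileCategory.PDF: ("pdf_submit_queue", ("FileFormRecSubmissionPDF", "FileFormRecPollingPDF")),
    FileCategory.NON_PDF: ("non_pdf_submit_queue", ("FileLayoutParsingOther",)),
    FileCategory.MEDIA: ("media_submit_queue", ("ImageEnrichment", "TextEnrichment")),
    FileCategory.AUTHORITY_DOC: ("authority_doc_queue", ("AuthorityDocProcessor",)),
    FileCategory.EN_FR_BILINGUAL: ("en_fr_doc_processing", ("EnFrTextExtractor",)),
}


def queue_for(category: FileCategory) -> str:
    """Return the queue name a file of the given category is routed to."""
    return ROUTES[category][0]
```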

What Is Saved and Where

| Data | Storage | Container / Index | Partition Key |
| --- | --- | --- | --- |
| Raw uploaded files | Azure Blob Storage | upload_container (per group) | N/A |
| Processed/chunked content | Azure Blob Storage | content_container (per group) | N/A |
| AI Search vectors + metadata | Azure AI Search | index (per group) | N/A |
| Processing status per file | Cosmos DB | log container | document_path |
| Group -> resource map | Cosmos DB | groupResourcesMapContainer | group_id |
| User last-selected group | Cosmos DB | user profile container | principalId |

Supported Cleanup Doctrine

Client guidance is explicit that project-space cleanup must default to the supported Domain Assistant deletion path rather than direct low-level deletion of downstream assets:

  • remove or reduce project group membership first so the slot can be safely reassigned;
  • use DA Manage Content, or another validated upload-delete trigger path, to delete source files;
  • let the existing delete function clean dependent content storage and index content;
  • purge matching Cosmos file-status history only after the supported delete path completes;
  • do not directly delete content blobs or search-index content as the default operator path, because that can create phantom files and repeated delete-function errors.
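The ordering in this doctrine lends itself to a dry-run plan that lists steps without executing anything. The sketch below is illustrative: `PlanStep` and `build_cleanup_plan` are hypothetical names, and the step wording paraphrases the bullets above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlanStep:
    order: int
    action: str
    destructive: bool  # True when the step changes or removes live state


def build_cleanup_plan(member_count: int, file_count: int, purge_logs: bool) -> List[PlanStep]:
    """Return the supported cleanup steps in doctrine order; nothing is executed."""
    actions = [
        (f"remove or reduce {member_count} group member(s)", True),
        (f"delete {file_count} source file(s) through the supported DA delete path", True),
        ("wait for the existing delete function to clean content storage and index", False),
    ]
    if purge_logs:
        # Only valid after the supported delete path has completed.
        actions.append(("purge matching Cosmos file-status history", True))
    return [
        PlanStep(order, action, destructive)
        for order, (action, destructive) in enumerate(actions, start=1)
    ]
```

Note that the plan contains no step for deleting content blobs or index entries directly, which mirrors the last bullet of the doctrine.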

Naming Pattern (hccld2 instance)

Based on source analysis, storage resources follow this pattern per group:

  • Upload container: upload-{group-suffix}
  • Content container: content-{group-suffix}
  • Search index: index-{group-suffix} (custom vector index)

The exact suffix strategy is defined in the GroupResourceMap Cosmos documents. It is not derived from Entra group name alone. An Intake Manager must register the entry manually or via this tool.
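Given a suffix already read from a GroupResourceMap document, the three resource names follow mechanically. The helper below is a hypothetical illustration of the pattern only; it does not attempt to derive the suffix from an Entra group name, which the note above rules out.

```python
from typing import Dict


def resource_names(group_suffix: str) -> Dict[str, str]:
    """Expand a known group suffix into the three per-group resource names."""
    if not group_suffix:
        raise ValueError("group_suffix must be non-empty")
    return {
        "upload_container": f"upload-{group_suffix}",
        "content_container": f"content-{group_suffix}",
        "index": f"index-{group_suffix}",
    }
```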


What This Project Builds

Phase 1 -- Discovery (COMPLETE)

  • Read EVA-JP-v1.2 source (no modifications)
  • Documented: Cosmos GroupResourceMap schema, UserProfile schema, StatusLog schema
  • Documented: full ingestion pipeline (6 function triggers)
  • Documented: supported cleanup path, delete-function cascade, and status-history purge scope
  • Validated: live hccld2 resource inventory

Phase 2 -- Space Admin CLI

Python package eva_space with sub-commands:

eva-space init    --group-id <GID> --group-name <NAME> --role admin|contributor|reader [--dry-run]
eva-space remove  --group-id <GID> [--dry-run] [--confirm] [--purge-logs]
eva-space export  --format csv|json --output <FILE>
eva-space import  --file <CSV|JSON> [--dry-run]
eva-space restore --snapshot <JSON> [--dry-run] [--confirm]
  • All mutating commands require dry-run preview before --confirm
  • Structured JSON audit receipt per operation (.logs/ directory)
  • EN/FR output; DefaultAzureCredential auth; pytest >= 90% coverage
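One way to enforce the dry-run gate is to treat dry-run as the default and commit only on an explicit `--confirm`. The sketch below shows that idea for the `remove` sub-command. It uses `argparse` from the standard library to stay dependency-free, whereas the repository layout names Click for the real entrypoint; `resolve_mode` and the printed summary are assumptions, not the project's behaviour.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the remove sub-command with the flags listed above."""
    parser = argparse.ArgumentParser(prog="eva-space")
    sub = parser.add_subparsers(dest="command", required=True)
    remove = sub.add_parser("remove", help="decommission or recycle a project space")
    remove.add_argument("--group-id", required=True)
    remove.add_argument("--dry-run", action="store_true")
    remove.add_argument("--confirm", action="store_true")
    remove.add_argument("--purge-logs", action="store_true")
    return parser


def resolve_mode(dry_run: bool, confirm: bool) -> str:
    """Return 'commit' only when --confirm is given without --dry-run."""
    if confirm and not dry_run:
        return "commit"
    return "dry-run"


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    mode = resolve_mode(args.dry_run, args.confirm)
    print(f"{args.command} group={args.group_id} mode={mode} purge_logs={args.purge_logs}")
    return mode
```

With this gate, `eva-space remove --group-id <GID>` previews, and passing both `--dry-run` and `--confirm` still previews, so a destructive run never happens by accident.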

Phase 3 -- Governed Decommission And Recycle Engine

Core cleanup logic (called by both CLI remove and plugin clean action):

  • plan member removal and admin takeover steps for the target project slot;
  • invoke or support the validated DA file-deletion path so downstream cleanup remains service-native and observable;
  • verify the delete-function cascade across upload, content, index, and status surfaces;
  • optionally purge matching Cosmos status-log entries for the group or project slot;
  • keep direct GroupResourceMap retirement as an explicit follow-on mode, not the default recycle path;
  • JSON audit receipt: files_deleted, status_rows_deleted, downstream_cleanup_verified, actor, timestamp
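A receipt carrying the five fields named in the last bullet might be assembled as follows. This is a minimal sketch: `build_receipt` is a hypothetical helper, and the UTC ISO 8601 timestamp format is an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def build_receipt(
    actor: str,
    files_deleted: int,
    status_rows_deleted: int,
    downstream_cleanup_verified: bool,
) -> Dict[str, Any]:
    """Assemble the audit receipt fields listed above."""
    return {
        "files_deleted": files_deleted,
        "status_rows_deleted": status_rows_deleted,
        "downstream_cleanup_verified": downstream_cleanup_verified,
        "actor": actor,
        # Assumption: timestamps are recorded in UTC, ISO 8601.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def receipt_to_json(receipt: Dict[str, Any]) -> str:
    """Serialize with sorted keys so receipts diff cleanly."""
    return json.dumps(receipt, indent=2, sort_keys=True)
```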

Phase 4 -- Config Persistence + CSV Export/Import

SpaceConfig schema (JSON):

{
  "exported_at": "<ISO8601>",
  "instance": "hccld2",
  "spaces": [
    {
      "group_id": "<GID>",
      "group_name": "<NAME>",
      "role": "admin|contributor|reader",
      "upload_container": "<NAME>",
      "content_container": "<NAME>",
      "vector_index": "<NAME>",
      "users": ["<UPN1>", "<UPN2>"]
    }
  ]
}

CSV format (Excel-compatible, UTF-8 BOM):

group_id, group_name, role, upload_container, content_container, vector_index, users
  • users column: semicolon-separated UPNs (safe for Excel cells)
  • Import: reads CSV/JSON, diffs vs live state, shows add/update/remove per row before committing
  • Restore: applies full snapshot; rows not in snapshot are flagged (not auto-deleted)
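The CSV rules above (UTF-8 BOM, semicolon-separated users, diff before commit) can be sketched with the standard library alone. Function names are hypothetical, and the diff compares whole rows by `group_id`, which is an assumption about how the real import matches rows.

```python
import csv
import io
from typing import Dict, List

FIELDS = ["group_id", "group_name", "role", "upload_container",
          "content_container", "vector_index", "users"]


def to_csv_bytes(spaces: List[dict]) -> bytes:
    """Serialize spaces to CSV with a UTF-8 BOM so Excel detects the encoding."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for space in spaces:
        row = dict(space)
        row["users"] = ";".join(space["users"])  # one Excel cell per space
        writer.writerow(row)
    return buf.getvalue().encode("utf-8-sig")


def from_csv_bytes(data: bytes) -> List[dict]:
    """Parse CSV bytes (BOM tolerated) back into space dicts."""
    text = data.decode("utf-8-sig")
    rows = []
    for row in csv.DictReader(io.StringIO(text, newline="")):
        row["users"] = [u for u in row["users"].split(";") if u]
        rows.append(row)
    return rows


def diff_spaces(live: List[dict], incoming: List[dict]) -> Dict[str, List[str]]:
    """Classify group_ids as add, update, or absent from the incoming file."""
    live_by_id = {s["group_id"]: s for s in live}
    new_by_id = {s["group_id"]: s for s in incoming}
    return {
        "add": sorted(set(new_by_id) - set(live_by_id)),
        "update": sorted(g for g in set(new_by_id) & set(live_by_id)
                         if new_by_id[g] != live_by_id[g]),
        # Flagged for the operator only; never deleted automatically.
        "absent_from_file": sorted(set(live_by_id) - set(new_by_id)),
    }
```

Rows that exist live but not in the file are reported rather than removed, consistent with the rule that import and restore do not auto-delete.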

Phase 5 -- DA Space Admin Plugin (InfoAssistant Extension)

FastAPI router (/admin/spaces/*) + React SPA served at the /space-admin path of the hccld2 App Service. Gated to the DA Space Administrator Entra group.

Backend routes:

GET  /admin/spaces/                        list all spaces with metadata
GET  /admin/spaces/{group_id}              space detail + Entra user list
GET  /admin/spaces/{group_id}/users        live Entra group member list
POST /admin/spaces/                        init space (dry-run or commit)
DELETE /admin/spaces/{group_id}            remove space (dry-run or commit)
GET  /admin/spaces/export?format=csv|json  download config file
POST /admin/spaces/import                  upload CSV/JSON, preview diff or commit
GET  /admin/audit-log                      paginated list of past operations

Frontend pages:

  • Spaces list (table: slot, group, role, user count, status, last-modified)
  • Space detail panel (info + users + resource links + action buttons)
  • Clean dialog (dry-run report -> confirm -> progress spinner -> audit receipt)
  • Import/Restore dialog (file picker -> diff table -> confirm)
  • Audit log page (sortable, filterable table)
  • CSV export button (triggers download)
  • EN/FR toggle; WCAG 2.1 AA; Fluent UI v9

Non-Goals

  • This project does not modify C:\eva-foundry\EVA-JP-v1.2 source
  • This project does not deploy new Azure infrastructure (uses existing hccld2 resources)
  • This project does not manage the App Service or Function App deployments
  • This project does not create or delete Entra groups (DA Space Admin handles Entra separately)
  • CSV import does not auto-delete spaces that are absent from the file (requires explicit remove)

Repository Structure

52-DA-space-cleanup/
  README.md
  PLAN.md
  STATUS.md
  ACCEPTANCE.md
  .eva/
    veritas-plan.json
    trust.json
  .github/
    copilot-instructions.md
  docs/
    research/                   -- Phase 1 discovery artifacts
      data-model-map.md
      ingestion-pipeline.md
      naming-convention.md
      delete-surface.md
      hccld2-inventory.md
  src/
    eva_space/                  -- Python CLI + shared engine (Phase 2-4)
      __init__.py
      cli.py                    -- Click entrypoint (init/remove/export/import/restore)
      init_space.py
      remove_space.py
      cleanup_engine.py         -- shared cleanup logic (CLI + plugin backend)
      config_store.py           -- save/restore SpaceConfig JSON snapshots
      csv_export.py             -- GroupResourceMap -> Excel-compatible CSV
      csv_import.py             -- CSV/JSON -> diff + apply
      cosmos_client.py
      search_client.py
      blob_client.py
      entra_client.py           -- Graph API: validate group, list members
      i18n.py                   -- EN/FR string catalog
      audit_log.py              -- structured JSON writer + reader
    tests/
      test_init_space.py
      test_remove_space.py
      test_cleanup_engine.py
      test_config_store.py
      test_csv_export.py
      test_csv_import.py
  plugin/                       -- Phase 5: DA Space Admin Plugin
    backend/
      router_spaces.py          -- FastAPI router: /admin/spaces/*
      router_audit.py           -- FastAPI router: /admin/audit-log
      auth_guard.py             -- Entra group check for DA Space Admin role
    frontend/                   -- React + Vite + Fluent UI v9
      src/
        pages/
          SpacesList.tsx
          SpaceDetail.tsx
          AuditLog.tsx
        components/
          CleanDialog.tsx
          ImportDialog.tsx
          ExportButton.tsx
        hooks/
          useSpaces.ts
          useAuditLog.ts
        i18n/
          en.json
          fr.json
        App.tsx
        main.tsx
      vite.config.ts
      vitest.config.ts
  requirements.txt
  requirements-dev.txt
  .gitignore

Quick Start (Phase 1 -- Analysis Only)

# Prerequisites: az login (EsDAICoESub), Python venv active
C:\eva-foundry\.venv\Scripts\Activate.ps1

# Query live Cosmos GroupResourceMap (read-only)
# See docs/research/cosmos-query-examples.ps1 for full query set

# Query AI Search indexes
az search service list --resource-group infoasst-hccld2 --subscription EsDAICoESub

# Query blob containers (replace the <...> placeholder with the storage account name)
az storage container list --account-name <infoasststorehccld2> --auth-mode login

EVA Data Model Integration

This project uses the 37-data-model ACA API as its source of truth for:

  • service and endpoint registration (eva-space-cli service + all CLI/API endpoints)
  • Cosmos container schema documentation (groupResourcesMapContainer, log container, user profile)
  • AI Search index documentation (per-group vector indexes)
  • screen registration (Phase 4 intake manager portal)

Bootstrap:

$base = $env:DATA_MODEL_BASE
if (-not $base -or $base -eq '') {
  $base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
}
Invoke-RestMethod "$base/health"
Invoke-RestMethod "$base/model/projects/52-DA-space-cleanup"
Invoke-RestMethod "$base/model/project_work/?project_id=52-DA-space-cleanup"

Entity registration sequence:

  1. Phase 1 complete -> register infoasst-hccld2 service + Cosmos/Blob/Search containers
  2. Phase 2 start -> register eva-space-cli service + POST /spaces/init endpoint
  3. Phase 3 start -> register DELETE /spaces/{group_id} endpoint
  4. Phase 4 start -> register IntakeManagerPortal screen + GET /spaces endpoint

All discoveries from the hccld2 live audit (DA-02-001 to DA-02-007) are registered as data model entities under the infoasst-hccld2 service scope before Phase 2 begins.


Related Projects

| Project | Relationship |
| --- | --- |
| C:\eva-foundry\EVA-JP-v1.2 | Source under analysis (read-only) |
| 37-data-model | EVA data model API (registers this project's entities) |
| 31-eva-faces | Frontend -- potential host for Phase 4 intake manager portal |
| 33-eva-brain-v2 | Backend -- potential API proxy for Phase 4 |
| 44-eva-jp-spark | Bilingual GC AI assistant (shares design language) |
| 28-rbac | RBAC reference implementation |
