Azure FinOps workspace for the marcosandbox implementation.
- Source of truth is the Project 37 data model record for
14-az-finops. - Veritas audit and trust score remain the quality gate for repository health.
- Phase 0, Phase 1, and Phase 2 are complete.
- Phase 3 is the active packet: APIM attribution and telemetry.
- The project is now treated as MarcoSub-only validation and development.
- Generated SharePoint and export noise is archived under
docs/finops/_archive/20260320_064339-generated-exports/. - Live documentation stays in
docs/finops/and the root governance files.
- Daily cost exports are configured.
- Storage foundation is in place.
- ADX and ADF ingestion are in place.
- Normalized cost logic is deployed.
- The remaining bounded work is APIM attribution, telemetry ingestion, and toolkit wiring.
- Phase 3: finish APIM base policy, App Insights diagnostics, and telemetry ingestion.
- Phase 3 follow-on: wire FinOps Toolkit Power BI templates to
normalized_costs. - Phase 4: tag FinOps, Azure Optimization Engine, and governance hardening.
- 2026-02-16: Phase 0 data collection foundation established.
- 2026-02-25: Phase 1 storage foundation completed.
- 2026-02-26: Phase 2 analytics and ingestion completed.
- 2026-03-03: project primed and governance docs normalized.
- 2026-03-20: semantic housekeeping archived generated export noise and reset the root docs to current truth.
$base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
Invoke-RestMethod "$base/model/projects/14-az-finops" | Select-Object id, maturity, phase, pbi_total, pbi_donedocs/finops/holds the project docs and operational evidence.tools/finops/holds inventory and support scripts.evidence/,debug/,logs/, andsessions/hold generated artifacts.
| Tool | Purpose | How to Use |
|---|---|---|
| 37-data-model | Single source of truth for all project entities | GET https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io/model/projects/14-az-finops |
| 48-eva-veritas | Trust score and coverage audit | MCP tool: audit_repo / get_trust_score |
| 07-foundation-layer | Copilot instructions primer + governance templates | Scripts: Invoke-PrimeWorkspace.ps1, Apply-Project07-Artifacts.ps1 |
Agent rule: Query the data model API before reading source files.
$base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
Invoke-RestMethod "$base/model/agent-guide" # complete protocol
Invoke-RestMethod "$base/model/agent-summary" # all layer countsLast Updated: March 9, 2026 (Fractal DPDCA Alignment - Governance Updated to Reflect Phases 0-2 Complete) Status: [PASS] Phase 1 COMPLETE ? containers, lifecycle policy, 28 blobs migrated | [PASS] Phase 2 COMPLETE ? ADX cluster, ADF pipeline, schema deployed, backfill triggered | [IN PROGRESS] Phase 3 ? NormalizedCosts() v2 deployed (tag bug fixed, dual-schema SSC+Legacy, CanonicalEnvironment, 6 ESDC dimensions, 460K rows clean); APIM policy + telemetry pipeline pending | [NEW] FinOps Toolkit Discovery ? 5 components mapped, 3 integration gaps identified Scope: Originating FinOps use case for EsDAICoESub (d2d4e571-e0f2-4f6c-901a-f88f7669bcba) and EsPAICoESub (802d84ab-3189-4221-8453-fcc30c8dc8ea), using shared landing and ingestion assets in the ESDC sandbox, including
marcosandboxfinopshubandmarco-sandbox-finops-adf; MarcoSub references are validation-only where explicitly called out Tenant: bfb12ca1-7f37-47d5-9cf5-8aa52214a0d8 (ESDC tenant - MarcoSub subscription) User: marco.presta@hrsdc-rhdcc.gc.ca (Owner on MarcoSub)
Scope: Deep discovery of Microsoft FinOps Toolkit components against your current project state.
Status: ? Already Building
Cost: $120/mo base + $10/mo per $1M monitored (~$130/mo for your $567K spend)
What You Have: ADX cluster, ADF pipeline, ADLS Gen2 containers, lifecycle policy (90d Cool, 180d Archive)
What's New: Toolkit includes 5 pre-built Power BI reports (charge breakdown, summary, recommendations, governance, rate-optimization) ready to wire to your ADX schema
Phase 3 Action: Wire toolkit Power BI templates ? your ADX normalized_costs table
Status: ? Missing ? High Priority
What It Does: Weekly custom recommendations framework; extends Azure Advisor with 25+ audit rules
Setup: 5 min deployment; outputs to Log Analytics + SQL Server + Power BI
Capabilities:
- Cost: VM right-sizing (with guest OS metrics), orphaned disks/IPs, idle compute, forgotten VMs
- HA: Availability zones, managed disks, VM Scale Sets distribution
- Security: Expired service principal certs, NSG rules with orphaned NICs
- OpsExc: Identity governance, policy compliance, resource group limits
- Workbooks: Reservations potential, cost anomalies, identity governance, policy compliance
Your Overlap: You have manual "idle resource detection" in Advanced-Capabilities-Showcase ? AOE automates this weekly
Phase 4 Action: Deploy AOE into sandbox; wire to your ADX for anomaly correlation
Status: ?? Partial ? You have docs, need Monitor workbooks
What It Includes: 2 pre-built workbooks (Optimization + Governance)
Integration: Works with ADX, ADLS, or both
Phase 4 Action: Deploy toolkit workbooks alongside your ADX dashboards
Status: ?? Not Using ? Uses custom scripts instead
Available Commands:
- Cost Management: Get/New/Remove/Start Cost Exports
- FinOps Hubs: Deploy/Get/Initialize/Register/Remove hub lifecycle
- Open Data: Pricing units, regions, resource types, services lookups
Your Approach: Custom az-inventory-finops.ps1 works well; toolkit module offers pre-built integration
Phase 4 Action: Optional ? keep custom scripts or migrate to module for consistency
Status: ?? Partial ? You built custom; toolkit has pre-built
Bicep: Scheduled actions module (Cost Management alerts)
Open Data: Reference CSVs (PricingUnits, Regions, ResourceTypes, Services, FOCUS examples)
Your Coverage: 99.997% tag coverage; open data would validate normalization further
Phase 4 Action: Validate your normalized_costs schema against FOCUS 1.0 spec
| Gap | Your Project | Toolkit Capability | Phase 3/4 Action |
|---|---|---|---|
| Power BI Starter Kits | Advanced-Capabilities manual doc | 5 pre-built reports for ADX | Wire toolkit reports to your ADX schema |
| Weekly Recommendations | Manual analysis in Advanced-Capabilities | AOE weekly automation + 10 workbooks | Deploy AOE; correlate anomalies with your data |
| FOCUS Alignment | Custom normalized_costs schema | Toolkit FOCUS 1.0 examples + metadata | Validate schema against official spec |
Client Follow-Up: After the success of saving-opportunities.md ($80K+ quick wins), this comprehensive capabilities document answers "What else can we do with the data we've collected?"
ADVANCED-CAPABILITIES-SHOWCASE.md ? Complete Operations Playbook
12 Advanced Analytics + 8 Operational Use Cases ready to deploy:
- Real-Time Anomaly Detection ? Prevent $150K+ incidents; detect cost spikes in <4 hours (vs. 16 days detected)
- Cost Attribution by Team/Department ? Chargeback visibility for all cost centers; $300K+ in owner accountability
- Idle & Underutilized Resource Detection ? $48-84K/yr from compute scheduling + resource cleanup
- Cost Driver Analysis & Service Benchmarking ? $18-24K/yr from Search consolidation insights
- Chargeback & Finance Integration ? Automated monthly invoicing via SAP integration; 99.997% tag compliance
- APIM Cost Attribution ? Per-API chargeback on 50+ EVA-JP APIs (Phase 3 blocker; pending telemetry)
- Sustainability & Carbon Footprint ? ESG reporting, carbon credit monetization, emissions tracking
- Budget Forecasting & Planning ? ARIMA forecasting with seasonality; what-if scenario modeling
- Multi-Region & HA Cost Trade-Off Analysis ? Business case validation for failover design
- Compliance & Audit Trail Reporting ? SOX/PCI-ready audit evidence; cost immutability proof
- Technology Stack Optimization & TCO ? Reserved Instance savings ($80K/yr); architecture optimization roadmap
- Organizational Change & Cost Culture ? 5-15% cost reduction through behavior change; engineering cost awareness
Key Metrics (460K+ cost records collected):
- Cost ingested: $567K total ($318K/yr steady-state after anomalies)
- Tag coverage: 99.997% (460,594/460,609 rows)
- Development vs Production ratio: 4.7:1 (shows optimization opportunity)
- Quick wins identified: $103K/yr potential (32% of current spend)
Navigation: Read the document by role:
- Finance/CFO: Executive summary + chargeback automation (Part 4)
- Engineers: Idle resource detection + anomaly detection (Part 1-2)
- Architects: Technology stack optimization + multi-region analysis (Part 3)
- Operational: Compliance + cost culture + forecasting (Part 4)
8 Complete Deliverables (4,000+ lines, 1,800+ lines of embedded code):
-
00-current-state-inventory.md (268 lines)
- Evidence-based snapshot of existing Azure resources
- Verified inventory: 6 storage containers, 2 APIM APIs, 6 Event Grid topics
- EVA-JP API context: 50+ APIs (38+ active endpoints) requiring cost attribution
- Evidence sources: Baseline captured 2026-02-17 09:09:32 AM ET (12 JSON artifacts)
-
01-gap-analysis-finops-hubs.md (398 lines)
- 7 Critical Gaps identified vs Microsoft FinOps Hubs reference architecture
- Priority matrix: P0 (ADX, ADF, APIM), P1 (Storage, Power BI), P2 (Governance)
- 4-phase remediation sequence (9 weeks, 68 story points)
- Cost estimate: $350/month incremental infrastructure
-
02-target-architecture.md (560 lines)
- Mermaid architecture diagram (full data flow: Cost Management ? ADLS ? ADF ? ADX ? Power BI)
- 7 component specifications with embedded code (900+ lines: Bicep, KQL, JSON, XML)
- ADX schema: 3 tables (raw_costs, apim_usage, normalized_costs) + allocation function
- APIM policy: Cost attribution headers (x-eva-costcenter, x-eva-caller-app, x-eva-environment)
-
03-deployment-plan.md (480 lines)
- 4 Phases, 9 Weeks: Storage Foundation (Weeks 1-2) ? Analytics (3-4) ? APIM Attribution (5-7) ? Governance (8-9)
- Bicep modules for each phase (500+ lines embedded IaC)
- Acceptance criteria per phase with validation commands
- Rollback procedures for deployment safety
-
PHASE1-DEPLOYMENT-CHECKLIST.md (480 lines) ? NEW
- 6 Deployment Tasks (13 story points): Containers ? Lifecycle Policy ? Migration ? Export Config ? Event Grid
- Step-by-step operational guide with copy-paste ready commands
- Embedded migration script:
migrate-costs-to-raw.ps1(60+ lines) with dry-run mode - Bicep module: Storage lifecycle policy (90d Cool, 180d Archive)
- Per-task acceptance criteria with evidence collection
- Rollback procedures (original costs/ container preserved until validation)
-
04-backlog.md (340 lines)
- 5 Epics, 68 Story Points, 9 Weeks: Epic 1 (Foundation, 13pts) ? Epic 2 (Analytics, 21pts) ? Epic 3 (APIM, 13pts) ? Epic 4 (Reporting, 8pts) ? Epic 5 (Governance, 13pts)
- ADO-ready work items with T-shirt sizing (XS=1pt to XL=8pt)
- Success metrics (KPIs): Ingestion latency <30min, APIM attribution >80%, Cost accuracy ?5%
-
05-evidence-pack.md (340 lines)
- 9 Evidence Categories: Storage, Event Grid, ADF, ADX, APIM, Cost Exports, RBAC, Monitoring, Summary
- Executable validation commands (Azure CLI + KQL queries)
- Expected outputs for audit trail
-
az-inventory-finops.ps1 (650 lines) ? TESTED
- Automated inventory collection (15+ Azure resource types across 2 subscriptions)
- Successfully executed 2026-02-17 09:09:32 AM ET
- Generated 12 JSON baseline artifacts (storage, APIM, Event Grid, RBAC)
Navigation: See docs/finops/README.md for complete index with reading order by role (architect, PM, engineer, QA)
Business Value:
- ? Cost Visibility: Centralized analytics across 2 subscriptions (EsDAICoESub + EsPAICoESub)
- ? API Attribution: Chargeback for 50+ EVA-JP APIs (38+ active endpoints from EVA-JP-reference-0113) via APIM telemetry
- ? Application Mapping: Integration with EVA-JP comprehensive API inventory (FastAPI backend, 2144 lines)
- ? Compliance: Tag enforcement, 90%+ resource compliance target
- ? Optimization: Power BI dashboards for trend analysis, cost allocation accuracy ?5%
| Method | Status | Use Case | Limitation |
|---|---|---|---|
| Azure Portal native exports | ? [ACTIVE] Configured | Daily cost data to Blob storage | Manual Portal setup required |
| Python SDK (extract_costs_sdk.py) | ? [WORKS] | Ad-hoc queries <5,000 rows | Pagination broken >5,000 rows |
| REST API via az rest | ? [TESTED] | Historical backfill with pagination | Daily aggregated totals only (no resource details) |
| ADF Pipeline (ingest-costs-to-adx) | ? [DEPLOYED] | Blob CSV ? ADX raw_costs via Event Grid trigger | marco-sandbox-finops-adf deployed Feb 26 |
| ADX (marcofinopsadx) | ? [DEPLOYED] | KQL analytics ? NormalizedCosts() v2 (tag bug-fix, dual-schema, CanonicalEnvironment, 6 ESDC dims), AllocateCostByApp() v2 | Dev(No SLA) SKU, canadacentral, finopsdb ? 460,609 rows, 99.997% tagged |
| Method | Blocker | Attempted |
|---|---|---|
| SDK pagination (>5,000 rows) | TypeError: Session.request() got unexpected kwarg 'skip_token' (azure-mgmt-costmanagement 4.0.1, Python 3.13) | 10+ attempts with chunking strategies |
| CLI export create (az costmanagement export) | Extension install fails (Windows registry bug: FileNotFoundError CSIDL_COMMON_APPDATA) | 3 attempts |
| REST API export config | Microsoft.CostManagementExports provider not registered (needs Owner/Contributor) | 2 attempts |
| FinOps Hub deployment | AuthorizationFailed - no Contributor role on subscription | 2 attempts |
| Programmatic storage access outside approved path | Azure Policy or network controls can block CLI/PowerShell from unmanaged paths even when the storage account is reachable from an approved ESDC network or Dev Box session | Multiple attempts |
14-az-finopsis the originating FinOps use case for the EsDAICoESub and EsPAICoESub estates.marcosandboxfinopshubis the shared landing surface used to centralize the subscription exports and feed the downstream analytics path.51-ACAconsumes this FinOps evidence as part of the ACA product behavior.75-EVA-vNext/products/acais the current productized continuation of that same lineage.
| Export | Subscription | Status | Storage | Configured Via | Evidence |
|---|---|---|---|---|---|
| EsDAICoESub-Daily | d2d4e571-e0f2-4f6c-901a-f88f7669bcba | [PASS] ACTIVE -- 10 consecutive runs Feb 16-25 | marcosandboxfinopshub/costs/EsDAICoESub/ | Azure Portal | Portal run history verified 2026-02-25 ~3:28 PM UTC daily |
| EsPAICoESub-Daily | 802d84ab-3189-4221-8453-fcc30c8dc8ea | [WARN] NOT YET VERIFIED -- Portal check pending | marcosandboxfinopshub/costs/EsPAICoESub/ | Azure Portal | CLI SA query: EmailAlert only (disabled) -- see portal |
| Subscription | ID | Resources | Daily Records | Notes |
|---|---|---|---|---|
| EsDAICoESub | d2d4e571-e0f2-4f6c-901a-f88f7669bcba | 1,200 resources | ~7,000+/day | Primary AI/ML (dev/stage), 21 Azure OpenAI instances |
| EsPAICoESub | 802d84ab-3189-4221-8453-fcc30c8dc8ea | 203 resources | ~2,500/month | Production (prd), 3 Azure OpenAI instances |
Note: Cost figures intentionally omitted pending comprehensive FinOps Hub deployment. Focus: 50+ EVA-JP APIs requiring APIM cost attribution across environments.
What we have:
- Cost Management Contributor (read cost data, configure exports via Portal)
- Cost Management Reader (view costs)
- Azure CLI authentication (az login --tenant 9ed55846-...)
What we lack (and why it matters):
- Contributor/Owner on subscription: Cannot register resource providers, deploy resources, or configure exports programmatically via REST API
- Storage Data Contributor: Cannot write to storage via CLI/SDK (Azure Policy firewall blocks non-Portal access)
Workaround: All configuration done via Azure Portal GUI, which bypasses both permission and firewall restrictions.
14-az-finops/
README.md # This file
requirements.txt # Python deps (azure-mgmt-costmanagement, pandas)
docs/
finops/ # ? NEW: Phase 1 FinOps Enterprise Roadmap
README.md # Navigation index (390 lines)
00-current-state-inventory.md # Evidence-based snapshot (268 lines)
01-gap-analysis-finops-hubs.md # 7 gaps vs Microsoft reference (398 lines)
02-target-architecture.md # Mermaid diagrams + code samples (560 lines)
03-deployment-plan.md # 4 phases, 9 weeks, Bicep modules (480 lines)
PHASE1-DEPLOYMENT-CHECKLIST.md # ? Operational guide (480 lines, copy-paste ready)
04-backlog.md # 5 epics, 68 story points (340 lines)
05-evidence-pack.md # Validation commands (340 lines)
EXPORTATION-BEST-PRACTICES.md # Export strategy and lessons learned
RATE-LIMITING-BEST-PRACTICES.md # API rate limit handling
tools/
finops/
az-inventory-finops.ps1 # ? Automated inventory (650 lines, TESTED)
out/ # Baseline artifacts (12 JSON files, 2026-02-17 09:09:32)
storage-containers-20260217-090932.json
apim-apis-20260217-090932.json
eventgrid-subscriptions-20260217-090932.json
storage-rbac-20260217-090932.json
# ... 8 more JSON files
scripts/
extract_costs_sdk.py # Python SDK - ad-hoc queries (<5K rows)
Backfill-Costs-REST.ps1 # REST API historical backfill (1-month tested)
migrate-costs-to-raw.ps1 # [DONE P1] 28 blobs migrated to raw/costs/
backfill-historical.ps1 # [DONE P2] Trigger ADF for all 28 historical blobs
adx-cluster.bicep # [DONE P2] marcofinopsadx Dev SKU, canadacentral
managed-identity.bicep # [DONE P2] mi-finops-adf + RBAC
assign-rbac-roles.ps1 # [DONE P2] Storage Blob Data Contributor on mi-finops-adf
create-eventgrid-subscription.ps1 # [DONE P2] Event Grid -> ADF webhook wiring
create-schema.kql # [DONE P2] Combined schema (all tables + mappings)
run-schema.ps1 # [DONE P2] Individual KQL file runner (REST API)
exec-kql.ps1 # [DONE P2] Generic KQL query runner
deploy-adf-artefacts.ps1 # [DONE P2] Deploy ADF linked services/datasets/pipelines
redeploy-adf-artefacts.ps1 # [DONE P2] Redeploy ADF artefacts
deploy-functions.ps1 # [DONE P3] Deploy NormalizedCosts + AllocateCostByApp
lifecycle-policy.json # [DONE P1] 90d Cool, 180d Archive
kql/
01-raw-costs-table.kql # [DONE P2] raw_costs schema
02-ingestion-mapping.kql # [DONE P2] CostExportMapping (ordinals verified)
03-raw-costs-retention.kql # [DONE P2] 365d retention
06-apim-usage-table.kql # [DONE P2] apim_usage schema
07-apim-usage-retention.kql # [DONE P2] 365d retention
08-normalized-costs-function.kql # [DONE P3] NormalizedCosts() + attribution fallbacks
09-allocate-cost-function.kql # [DONE P3] AllocateCostByApp() 3-tier attribution
adf/
linked-services/ # [DONE P2] ls_marcosandbox_blob, ls_marcofinops_adx
datasets/ # [DONE P2] ds_blob_cost_csv, ds_adx_raw_costs
pipelines/ # [DONE P2] ingest-costs-to-adx (RFC4180 fixed)
Configure-EsDAICoESub-CostExport.ps1 # Export config (blocked by provider)
Configure-EsPAICoESub-CostExport.ps1 # Export config (blocked by provider)
azure_inventory.py # Resource inventory
offline-packages/ # Offline .whl files for ESDC workstation
portal-exports/ # Manual Portal downloads (Jan 27, 2026)
output/
historical/ # Target for backfill output
EsDAICoESub/ # REST API backfill (1-month tested)
EsPAICoESub/ # REST API backfill (1-month tested)
archive/
20260216-audit-cleanup/ # 96+ archived files from 3-week investigation
| Task | Status | Notes |
|---|---|---|
| Configure EsDAICoESub daily export (Portal) | [PASS] CONFIRMED | 10 consecutive runs Feb 16-25, Portal run history verified 2026-02-25 |
| Configure EsPAICoESub daily export (Portal) | [PASS] DONE | Active, succeeded (Feb 16, 11:36 AM) -- Portal verification pending |
| Test Backfill-Costs-REST.ps1 for historical data | [PASS] DONE | 1-month test successful (January 2026) |
| Verify export data landing in Blob storage | [PASS] DONE | Confirmed 2.3MB + 371KB exports present |
| Capture baseline inventory | [PASS] DONE | 12 JSON artifacts, 2026-02-17 09:09:32 AM ET |
| Create comprehensive roadmap | [PASS] DONE | 8 deliverables, 4,000+ lines, 1,800+ lines code |
| Data model updated | [PASS] DONE 2026-02-25 | maturity=active phase=Phase 1 - Storage Foundation rv=10 |
Status: [PASS] ALL 6 TASKS COMPLETE ? See PHASE1-DEPLOYMENT-CHECKLIST.md for evidence
Storage Audit (2026-02-25 ? all containers confirmed):
| Container | Status | Required by Phase 1 | Result |
|---|---|---|---|
costs |
[PASS] (since Feb 3) | Preserved (original) | Retained as source |
processed |
[PASS] (since Feb 6) | Partial match | Retained |
config |
[PASS] (since Feb 11) | Not in spec | Retained |
ingestion |
[PASS] (since Feb 11) | Not in spec | Retained |
msexports |
[PASS] (since Feb 11) | Not in spec | Retained |
raw |
[PASS] Created Feb 25 | YES | Task 1.1.1 DONE |
archive |
[PASS] Created Feb 25 | YES | Task 1.1.1 DONE |
checkpoint |
[PASS] Created Feb 25 | YES | Task 1.1.1 DONE |
6 Deployment Tasks ? All Complete:
| Task | Size | Description | Result |
|---|---|---|---|
| 1.1.1 | XS (1pt) | Create storage containers (raw, archive, checkpoint) | [PASS] Feb 25 |
| 1.1.2 | S (2pt) | Lifecycle policy (90d Cool, 180d Archive) | [PASS] lifecycle-policy.json deployed Feb 25 |
| 1.1.3 | M (3pt) | Migrate exports to raw/costs/ hierarchy |
[PASS] 28 blobs migrated ? migration-log-20260225-*.txt |
| 1.2.1 | S (2pt) | Update export destinations (Portal) | [PASS] EsDAICoESub + EsPAICoESub pointing to raw/ |
| 1.3.1 | S (2pt) | Verify Event Grid system topic | [PASS] Active |
| 1.3.2 | M (3pt) | Event subscription ? ADF webhook | [PASS] create-eventgrid-subscription.ps1 deployed Feb 26 |
Evidence: scripts/migration-log-20260225-17*.txt (28 blobs confirmed), scripts/lifecycle-policy.json
Status: [PASS] ALL TASKS COMPLETE ? marcofinopsadx live, ADF pipeline deployed, 28 historical blobs backfilled
Completed Tasks:
| Task | Description | Artifact | Result |
|---|---|---|---|
| 2.1.1 | ADX cluster marcofinopsadx Dev(No SLA)_Standard_D11_v2 |
scripts/adx-cluster.bicep |
[PASS] canadacentral, finopsdb |
| 2.1.2 | ADX schema: raw_costs + CostExportMapping + apim_usage |
scripts/kql/01-07-*.kql |
[PASS] Ordinals verified from live CSV |
| 2.1.2 | KQL functions: NormalizedCosts() + AllocateCostByApp() |
scripts/kql/08-09-*.kql |
[PASS] 3-tier attribution fallbacks |
| 2.2.1 | Managed identity mi-finops-adf + RBAC (Storage Blob Data Contributor) |
scripts/managed-identity.bicep, assign-rbac-roles.ps1 |
[PASS] Feb 26 |
| 2.2.2 | ADF linked services (blob + ADX) | scripts/adf/linked-services/ |
[PASS] ls_marcosandbox_blob, ls_marcofinops_adx |
| 2.2.3 | ADF datasets (blob CSV + ADX raw_costs) | scripts/adf/datasets/ |
[PASS] ds_blob_cost_csv, ds_adx_raw_costs |
| 2.2.4 | ADF pipeline ingest-costs-to-adx (Blob ? ADX + checkpoint) |
scripts/adf/pipelines/ingest-costs-to-adx.json |
[PASS] RFC4180 escaping fixed |
| 2.3 | Backfill: triggered pipeline for all 28 historical blobs | scripts/backfill-historical.ps1 |
[PASS] 28 runs triggered Feb 26 |
ADX Database: finopsdb on marcofinopsadx.canadacentral.kusto.windows.net
ADF Factory: marco-sandbox-finops-adf (EsDAICoE-Sandbox RG)
Status: [IN PROGRESS] ? Attribution KQL layer v2 complete (tag bug fixed, ESDC dimensions live); APIM policy + telemetry pipeline pending
Completed (Feb 26):
| Task | Description | Artifact | Result |
|---|---|---|---|
| 3.3a | Critical bug-fix: Tags exported WITHOUT {} ? parse_json() returned null for 100% of rows. Fixed with iif(? !startswith "{", strcat("{",?,"}"), ?). |
08-normalized-costs-function.kql |
[PASS] 460,594 rows now parsed |
| 3.3b | NormalizedCosts() v2 ? dual-schema (SSC Standard 77.8% + Legacy 22.2%), CanonicalEnvironment (Dev/Stage/Prod), 6 new ESDC dimensions: SscBillingCode, FinancialAuthority, OwnerManager, ClientBu, ProjectDisplayName, IsSharedCost |
scripts/kql/08-normalized-costs-function.kql |
[PASS] Deployed |
| 3.3c | AllocateCostByApp() v2 ? 3-tier: header (APIM telemetry) ? tag ? pre-apim; outputs SscBillingCode + CanonicalEnvironment |
scripts/kql/09-allocate-cost-function.kql |
[PASS] Deployed |
| 3.3d | Deploy + smoke-test (6 validation queries) ? 460,609 rows: Dev 72.5% / Prod 15.4% / Stage 12% | scripts/deploy-functions.ps1, smoke-test-v2.ps1 |
[PASS] Evidence saved |
Remaining Tasks:
| Task | Description | Blocker / Notes |
|---|---|---|
| 3.1 | APIM base policy ? inject x-caller-app, x-costcenter, x-eva-environment headers |
Needs APIM policy deployment on 50+ EVA-JP APIs |
| 3.2 | App Insights diagnostics ? 100% sampling, log all APIM requests | Needs App Insights resource linked to APIM |
| 3.4 | Telemetry ingestion pipeline ? App Insights ? apim_usage table in ADX |
Needs ADF pipeline or Logic App trigger |
Business Value: Once APIM policy live, Tier 1 attribution activates automatically (pre-apim rows in ADX flip to header-attributed)
Business Value: 50+ EVA-JP APIs (38+ active endpoints) get granular per-app cost visibility
Dependencies: Phase 2 complete [DONE]
Status: ? [PLANNED] - See 03-deployment-plan.md ? Phase 4
Key Tasks:
- Deploy 3 Power BI reports (cost trend, allocation, tag compliance)
- Implement Azure Policy (tag enforcement: CostCenter required)
- Configure VNet + private endpoints (storage, ADX)
- Deploy GitHub Actions CI/CD workflow
- Final validation (all commands from 05-evidence-pack.md)
Success Metrics:
- Cost Data Completeness: 100% (12 months backfilled)
- Ingestion Latency: <30 minutes
- APIM Attribution Coverage: >80% requests tagged
- Tag Compliance: >90% resources
- Cost Allocation Accuracy: ?5%
- Power BI Query Performance: <3 seconds
| Phase | Duration | Story Points | Status | Key Deliverable |
|---|---|---|---|---|
| Phase 0 | 3 weeks | N/A | ? [COMPLETE] | Daily exports + Comprehensive roadmap |
| Phase 1 | 2 weeks | 13 | ? [COMPLETE] Feb 25 | Storage foundation + 28 blobs migrated to raw/ |
| Phase 2 | 2 weeks | 21 | ? [COMPLETE] Feb 26 | ADX + ADF + Schema + 28-blob backfill triggered |
| Phase 3 | 3 weeks | 13 | ?? [IN PROGRESS] Feb 26 | APIM attribution ? KQL layer done, APIM policy pending |
| Phase 4 | 2 weeks | 21 | ? [PLANNED] | Power BI + Governance |
| TOTAL | 9 weeks | 68 points | 15 story points/sprint | Enterprise FinOps capability |
# Authenticate
az login --tenant 9ed55846-8a81-4246-acd8-b1a01abfc0d1
# Query costs for a few days (SDK works for <5000 rows)
python scripts/extract_costs_sdk.py --subscription d2d4e571-e0f2-4f6c-901a-f88f7669bcba --start-date 2026-02-10 --end-date 2026-02-16- Go to Azure Portal > Cost Management > Exports
- Select subscription scope
- Create new export: Type=ActualCost, Schedule=Daily, Format=CSV
- Storage: marcosandboxfinopshub, container=costs, directory={SubscriptionName}
- Set date range and save
# Test with 1 month
.\scripts\Backfill-Costs-REST.ps1 -MonthsToBackfill 1
# Full 12 months
.\scripts\Backfill-Costs-REST.ps1 -MonthsToBackfill 12This project went through 10 extraction attempts over 3 weeks (Jan 27 - Feb 16, 2026) cycling between SDK, CLI, REST API, and Portal before identifying two fundamental blockers:
-
Python SDK pagination is broken: azure-mgmt-costmanagement 4.0.1 on Python 3.13 throws
TypeError: Session.request() got an unexpected keyword argument 'skip_token'after page 1-2. All chunking strategies (monthly, weekly, 10-day, daily) still hit this ceiling at 5,000-10,000 rows. Thedata/historical/files that appeared complete were silently truncated at exactly 5,001 rows (1 header + 5,000 data = single page). -
No Contributor permissions: Cannot register resource providers, deploy infrastructure, or configure exports programmatically. Azure Policy firewall blocks CLI/PowerShell storage access.
What actually works: Azure Portal for export configuration (bypasses both blockers), and az rest for direct REST API calls with manual nextLink pagination.
| # | Date | Method | Result | Blocker |
|---|---|---|---|---|
| 1 | Jan 27 | Manual Portal download | Partial success | Manual, not automated |
| 2 | Jan 29 | FinOps Toolkit (Deploy-FinOpsHub) | Failed | Region restrictions + no permissions |
| 3 | Jan 30 | CLI (az costmanagement export create) | Failed | Extension install fails (Windows) |
| 4 | Jan 31 | REST API export config | Failed | CostManagementExports provider not registered |
| 5 | Feb 1 | Python SDK (small query) | Success (28 rows) | Worked but only because <5,000 rows |
| 6 | Feb 6 | SDK 12-month extraction | Silently truncated | All 11 CSVs capped at 5,001 rows |
| 7 | Feb 10 | Deploy-ESDC-FinOps.ps1 | Failed | AuthorizationFailed (no Contributor) |
| 8 | Feb 15 | Scope change + deploy retry | Failed | Same permission blocker |
| 9 | Feb 16 | SDK weekly/10-day chunking | Still truncated | SDK pagination broken regardless of chunk size |
| 10 | Feb 16 | Portal native export config | Success | Bypasses all blockers |
See archive/20260216-audit-cleanup/ for the full history of attempts (96+ archived files).
- Start Here: docs/finops/README.md - Master index with reading order by role
- Deploy Phase 1: PHASE1-DEPLOYMENT-CHECKLIST.md - Step-by-step operational guide
- Understand Gaps: 01-gap-analysis-finops-hubs.md - 7 critical gaps vs Microsoft reference
- Architecture: 02-target-architecture.md - Mermaid diagrams + code samples
- Sprint Planning: 04-backlog.md - 5 epics, 68 story points, ADO-ready
- Validation: 05-evidence-pack.md - Executable commands for audit
- Executive Summary: docs/finops/README.md ? Business Value section
- Investment: $350/month incremental (ADX Dev SKU + storage/networking)
- Timeline: 9 weeks (4.5 sprints at 15 points/sprint)
- ROI: Cost visibility for 1,403 resources + 50+ EVA-JP APIs, tag compliance >90%, chargeback capability
- Evidence Baseline: tools/finops/out/ - 12 JSON artifacts (2026-02-17 09:09:32 AM ET)
- Validation Commands: 05-evidence-pack.md - 9 evidence categories
- Inventory Script: az-inventory-finops.ps1 - Automated capture
- Current State: 00-current-state-inventory.md - Evidence-based assessment
Project Owner: Marco Presta (marco.presta@hrsdc-rhdcc.gc.ca)
Last Updated: February 26, 2026
Current Phase: Phase 3 ? APIM Attribution & Telemetry [IN PROGRESS]
ADX: marcofinopsadx.canadacentral.kusto.windows.net / finopsdb | ADF: marco-sandbox-finops-adf