Skip to content

eva-foundry/14-az-finops

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

14-az-finops

Azure FinOps workspace for the marcosandbox implementation.

Current Truth

  • Source of truth is the Project 37 data model record for 14-az-finops.
  • Veritas audit and trust score remain the quality gate for repository health.
  • Phase 0, Phase 1, and Phase 2 are complete.
  • Phase 3 is the active packet: APIM attribution and telemetry.
  • The project is now treated as MarcoSub-only validation and development.
  • Generated SharePoint and export noise is archived under docs/finops/_archive/20260320_064339-generated-exports/.
  • Live documentation stays in docs/finops/ and the root governance files.

What Exists Now

  • Daily cost exports are configured.
  • Storage foundation is in place.
  • ADX and ADF ingestion are in place.
  • Normalized cost logic is deployed.
  • The remaining bounded work is APIM attribution, telemetry ingestion, and toolkit wiring.

Current Packet

  1. Phase 3: finish APIM base policy, App Insights diagnostics, and telemetry ingestion.
  2. Phase 3 follow-on: wire FinOps Toolkit Power BI templates to normalized_costs.
  3. Phase 4: tag FinOps, Azure Optimization Engine, and governance hardening.

History

  • 2026-02-16: Phase 0 data collection foundation established.
  • 2026-02-25: Phase 1 storage foundation completed.
  • 2026-02-26: Phase 2 analytics and ingestion completed.
  • 2026-03-03: project primed and governance docs normalized.
  • 2026-03-20: semantic housekeeping archived generated export noise and reset the root docs to current truth.

Key Files

Live Query

$base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
Invoke-RestMethod "$base/model/projects/14-az-finops" | Select-Object id, maturity, phase, pbi_total, pbi_done

Working Set

  • docs/finops/ holds the project docs and operational evidence.
  • tools/finops/ holds inventory and support scripts.
  • evidence/, debug/, logs/, and sessions/ hold generated artifacts.

Azure FinOps - ESDC Cost Management

EVA Ecosystem Integration

Tool Purpose How to Use
37-data-model Single source of truth for all project entities GET https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io/model/projects/14-az-finops
48-eva-veritas Trust score and coverage audit MCP tool: audit_repo / get_trust_score
07-foundation-layer Copilot instructions primer + governance templates Scripts: Invoke-PrimeWorkspace.ps1, Apply-Project07-Artifacts.ps1

Agent rule: Query the data model API before reading source files.

$base = "https://msub-eva-data-model.victoriousgrass-30debbd3.canadacentral.azurecontainerapps.io"
Invoke-RestMethod "$base/model/agent-guide"   # complete protocol
Invoke-RestMethod "$base/model/agent-summary" # all layer counts

Last Updated: March 9, 2026 (Fractal DPDCA Alignment - Governance Updated to Reflect Phases 0-2 Complete) Status: [PASS] Phase 1 COMPLETE ? containers, lifecycle policy, 28 blobs migrated | [PASS] Phase 2 COMPLETE ? ADX cluster, ADF pipeline, schema deployed, backfill triggered | [IN PROGRESS] Phase 3 ? NormalizedCosts() v2 deployed (tag bug fixed, dual-schema SSC+Legacy, CanonicalEnvironment, 6 ESDC dimensions, 460K rows clean); APIM policy + telemetry pipeline pending | [NEW] FinOps Toolkit Discovery ? 5 components mapped, 3 integration gaps identified Scope: Originating FinOps use case for EsDAICoESub (d2d4e571-e0f2-4f6c-901a-f88f7669bcba) and EsPAICoESub (802d84ab-3189-4221-8453-fcc30c8dc8ea), using shared landing and ingestion assets in the ESDC sandbox, including marcosandboxfinopshub and marco-sandbox-finops-adf; MarcoSub references are validation-only where explicitly called out Tenant: bfb12ca1-7f37-47d5-9cf5-8aa52214a0d8 (ESDC tenant - MarcoSub subscription) User: marco.presta@hrsdc-rhdcc.gc.ca (Owner on MarcoSub)


NEW: FinOps Toolkit Discovery (March 2, 2026, 5:38 PM ET)

Scope: Deep discovery of Microsoft FinOps Toolkit components against your current project state.

Components Discovered

1. FinOps Hubs ? Core Data Platform

Status: ? Already Building
Cost: $120/mo base + $10/mo per $1M monitored (~$130/mo for your $567K spend)
What You Have: ADX cluster, ADF pipeline, ADLS Gen2 containers, lifecycle policy (90d Cool, 180d Archive)
What's New: Toolkit includes 5 pre-built Power BI reports (charge breakdown, summary, recommendations, governance, rate-optimization) ready to wire to your ADX schema
Phase 3 Action: Wire toolkit Power BI templates ? your ADX normalized_costs table

2. Azure Optimization Engine (AOE) ? Extensible Recommendations

Status: ? Missing ? High Priority
What It Does: Weekly custom recommendations framework; extends Azure Advisor with 25+ audit rules
Setup: 5 min deployment; outputs to Log Analytics + SQL Server + Power BI
Capabilities:

  • Cost: VM right-sizing (with guest OS metrics), orphaned disks/IPs, idle compute, forgotten VMs
  • HA: Availability zones, managed disks, VM Scale Sets distribution
  • Security: Expired service principal certs, NSG rules with orphaned NICs
  • OpsExc: Identity governance, policy compliance, resource group limits
  • Workbooks: Reservations potential, cost anomalies, identity governance, policy compliance

Your Overlap: You have manual "idle resource detection" in Advanced-Capabilities-Showcase ? AOE automates this weekly
Phase 4 Action: Deploy AOE into sandbox; wire to your ADX for anomaly correlation

3. FinOps Workbooks ? Azure Monitor Native

Status: ?? Partial ? You have docs, need Monitor workbooks
What It Includes: 2 pre-built workbooks (Optimization + Governance)
Integration: Works with ADX, ADLS, or both
Phase 4 Action: Deploy toolkit workbooks alongside your ADX dashboards

4. PowerShell Module (FinOpsToolkit)

Status: ?? Not Using ? Uses custom scripts instead
Available Commands:

  • Cost Management: Get/New/Remove/Start Cost Exports
  • FinOps Hubs: Deploy/Get/Initialize/Register/Remove hub lifecycle
  • Open Data: Pricing units, regions, resource types, services lookups

Your Approach: Custom az-inventory-finops.ps1 works well; toolkit module offers pre-built integration
Phase 4 Action: Optional ? keep custom scripts or migrate to module for consistency

5. Bicep Registry Modules + Open Data

Status: ?? Partial ? You built custom; toolkit has pre-built
Bicep: Scheduled actions module (Cost Management alerts)
Open Data: Reference CSVs (PricingUnits, Regions, ResourceTypes, Services, FOCUS examples)
Your Coverage: 99.997% tag coverage; open data would validate normalization further
Phase 4 Action: Validate your normalized_costs schema against FOCUS 1.0 spec

Integration Gaps (3 Critical)

Gap Your Project Toolkit Capability Phase 3/4 Action
Power BI Starter Kits Advanced-Capabilities manual doc 5 pre-built reports for ADX Wire toolkit reports to your ADX schema
Weekly Recommendations Manual analysis in Advanced-Capabilities AOE weekly automation + 10 workbooks Deploy AOE; correlate anomalies with your data
FOCUS Alignment Custom normalized_costs schema Toolkit FOCUS 1.0 examples + metadata Validate schema against official spec

NEW: Advanced Capabilities Showcase (March 1, 2026)

Client Follow-Up: After the success of saving-opportunities.md ($80K+ quick wins), this comprehensive capabilities document answers "What else can we do with the data we've collected?"

ADVANCED-CAPABILITIES-SHOWCASE.md ? Complete Operations Playbook

12 Advanced Analytics + 8 Operational Use Cases ready to deploy:

Analytics Capabilities (Now Ready)

  1. Real-Time Anomaly Detection ? Prevent $150K+ incidents; detect cost spikes in <4 hours (vs. 16 days detected)
  2. Cost Attribution by Team/Department ? Chargeback visibility for all cost centers; $300K+ in owner accountability
  3. Idle & Underutilized Resource Detection ? $48-84K/yr from compute scheduling + resource cleanup
  4. Cost Driver Analysis & Service Benchmarking ? $18-24K/yr from Search consolidation insights
  5. Chargeback & Finance Integration ? Automated monthly invoicing via SAP integration; 99.997% tag compliance
  6. APIM Cost Attribution ? Per-API chargeback on 50+ EVA-JP APIs (Phase 3 blocker; pending telemetry)
  7. Sustainability & Carbon Footprint ? ESG reporting, carbon credit monetization, emissions tracking
  8. Budget Forecasting & Planning ? ARIMA forecasting with seasonality; what-if scenario modeling

Operational Use Cases (Ready for Implementation)

  1. Multi-Region & HA Cost Trade-Off Analysis ? Business case validation for failover design
  2. Compliance & Audit Trail Reporting ? SOX/PCI-ready audit evidence; cost immutability proof
  3. Technology Stack Optimization & TCO ? Reserved Instance savings ($80K/yr); architecture optimization roadmap
  4. Organizational Change & Cost Culture ? 5-15% cost reduction through behavior change; engineering cost awareness

Key Metrics (460K+ cost records collected):

  • Cost ingested: $567K total ($318K/yr steady-state after anomalies)
  • Tag coverage: 99.997% (460,594/460,609 rows)
  • Development vs Production ratio: 4.7:1 (shows optimization opportunity)
  • Quick wins identified: $103K/yr potential (32% of current spend)

Navigation: Read the document by role:

  • Finance/CFO: Executive summary + chargeback automation (Part 4)
  • Engineers: Idle resource detection + anomaly detection (Part 1-2)
  • Architects: Technology stack optimization + multi-region analysis (Part 3)
  • Operational: Compliance + cost culture + forecasting (Part 4)

?? Phase 1 FinOps Enterprise Roadmap (February 17, 2026)

Comprehensive Documentation Delivered

8 Complete Deliverables (4,000+ lines, 1,800+ lines of embedded code):

  1. 00-current-state-inventory.md (268 lines)

    • Evidence-based snapshot of existing Azure resources
    • Verified inventory: 6 storage containers, 2 APIM APIs, 6 Event Grid topics
    • EVA-JP API context: 50+ APIs (38+ active endpoints) requiring cost attribution
    • Evidence sources: Baseline captured 2026-02-17 09:09:32 AM ET (12 JSON artifacts)
  2. 01-gap-analysis-finops-hubs.md (398 lines)

    • 7 Critical Gaps identified vs Microsoft FinOps Hubs reference architecture
    • Priority matrix: P0 (ADX, ADF, APIM), P1 (Storage, Power BI), P2 (Governance)
    • 4-phase remediation sequence (9 weeks, 68 story points)
    • Cost estimate: $350/month incremental infrastructure
  3. 02-target-architecture.md (560 lines)

    • Mermaid architecture diagram (full data flow: Cost Management ? ADLS ? ADF ? ADX ? Power BI)
    • 7 component specifications with embedded code (900+ lines: Bicep, KQL, JSON, XML)
    • ADX schema: 3 tables (raw_costs, apim_usage, normalized_costs) + allocation function
    • APIM policy: Cost attribution headers (x-eva-costcenter, x-eva-caller-app, x-eva-environment)
  4. 03-deployment-plan.md (480 lines)

    • 4 Phases, 9 Weeks: Storage Foundation (Weeks 1-2) ? Analytics (3-4) ? APIM Attribution (5-7) ? Governance (8-9)
    • Bicep modules for each phase (500+ lines embedded IaC)
    • Acceptance criteria per phase with validation commands
    • Rollback procedures for deployment safety
  5. PHASE1-DEPLOYMENT-CHECKLIST.md (480 lines) ? NEW

    • 6 Deployment Tasks (13 story points): Containers ? Lifecycle Policy ? Migration ? Export Config ? Event Grid
    • Step-by-step operational guide with copy-paste ready commands
    • Embedded migration script: migrate-costs-to-raw.ps1 (60+ lines) with dry-run mode
    • Bicep module: Storage lifecycle policy (90d Cool, 180d Archive)
    • Per-task acceptance criteria with evidence collection
    • Rollback procedures (original costs/ container preserved until validation)
  6. 04-backlog.md (340 lines)

    • 5 Epics, 68 Story Points, 9 Weeks: Epic 1 (Foundation, 13pts) ? Epic 2 (Analytics, 21pts) ? Epic 3 (APIM, 13pts) ? Epic 4 (Reporting, 8pts) ? Epic 5 (Governance, 13pts)
    • ADO-ready work items with T-shirt sizing (XS=1pt to XL=8pt)
    • Success metrics (KPIs): Ingestion latency <30min, APIM attribution >80%, Cost accuracy ?5%
  7. 05-evidence-pack.md (340 lines)

    • 9 Evidence Categories: Storage, Event Grid, ADF, ADX, APIM, Cost Exports, RBAC, Monitoring, Summary
    • Executable validation commands (Azure CLI + KQL queries)
    • Expected outputs for audit trail
  8. az-inventory-finops.ps1 (650 lines) ? TESTED

    • Automated inventory collection (15+ Azure resource types across 2 subscriptions)
    • Successfully executed 2026-02-17 09:09:32 AM ET
    • Generated 12 JSON baseline artifacts (storage, APIM, Event Grid, RBAC)

Navigation: See docs/finops/README.md for complete index with reading order by role (architect, PM, engineer, QA)

Business Value:

  • ? Cost Visibility: Centralized analytics across 2 subscriptions (EsDAICoESub + EsPAICoESub)
  • ? API Attribution: Chargeback for 50+ EVA-JP APIs (38+ active endpoints from EVA-JP-reference-0113) via APIM telemetry
  • ? Application Mapping: Integration with EVA-JP comprehensive API inventory (FastAPI backend, 2144 lines)
  • ? Compliance: Tag enforcement, 90%+ resource compliance target
  • ? Optimization: Power BI dashboards for trend analysis, cost allocation accuracy ?5%

Current Status (February 17, 2026)

What Works

Method Status Use Case Limitation
Azure Portal native exports ? [ACTIVE] Configured Daily cost data to Blob storage Manual Portal setup required
Python SDK (extract_costs_sdk.py) ? [WORKS] Ad-hoc queries <5,000 rows Pagination broken >5,000 rows
REST API via az rest ? [TESTED] Historical backfill with pagination Daily aggregated totals only (no resource details)
ADF Pipeline (ingest-costs-to-adx) ? [DEPLOYED] Blob CSV ? ADX raw_costs via Event Grid trigger marco-sandbox-finops-adf deployed Feb 26
ADX (marcofinopsadx) ? [DEPLOYED] KQL analytics ? NormalizedCosts() v2 (tag bug-fix, dual-schema, CanonicalEnvironment, 6 ESDC dims), AllocateCostByApp() v2 Dev(No SLA) SKU, canadacentral, finopsdb ? 460,609 rows, 99.997% tagged

What Does NOT Work

Method Blocker Attempted
SDK pagination (>5,000 rows) TypeError: Session.request() got unexpected kwarg 'skip_token' (azure-mgmt-costmanagement 4.0.1, Python 3.13) 10+ attempts with chunking strategies
CLI export create (az costmanagement export) Extension install fails (Windows registry bug: FileNotFoundError CSIDL_COMMON_APPDATA) 3 attempts
REST API export config Microsoft.CostManagementExports provider not registered (needs Owner/Contributor) 2 attempts
FinOps Hub deployment AuthorizationFailed - no Contributor role on subscription 2 attempts
Programmatic storage access outside approved path Azure Policy or network controls can block CLI/PowerShell from unmanaged paths even when the storage account is reachable from an approved ESDC network or Dev Box session Multiple attempts

Product Lineage

  • 14-az-finops is the originating FinOps use case for the EsDAICoESub and EsPAICoESub estates.
  • marcosandboxfinopshub is the shared landing surface used to centralize the subscription exports and feed the downstream analytics path.
  • 51-ACA consumes this FinOps evidence as part of the ACA product behavior.
  • 75-EVA-vNext/products/aca is the current productized continuation of that same lineage.

Active Exports

Export Subscription Status Storage Configured Via Evidence
EsDAICoESub-Daily d2d4e571-e0f2-4f6c-901a-f88f7669bcba [PASS] ACTIVE -- 10 consecutive runs Feb 16-25 marcosandboxfinopshub/costs/EsDAICoESub/ Azure Portal Portal run history verified 2026-02-25 ~3:28 PM UTC daily
EsPAICoESub-Daily 802d84ab-3189-4221-8453-fcc30c8dc8ea [WARN] NOT YET VERIFIED -- Portal check pending marcosandboxfinopshub/costs/EsPAICoESub/ Azure Portal CLI SA query: EmailAlert only (disabled) -- see portal

Subscriptions

Subscription ID Resources Daily Records Notes
EsDAICoESub d2d4e571-e0f2-4f6c-901a-f88f7669bcba 1,200 resources ~7,000+/day Primary AI/ML (dev/stage), 21 Azure OpenAI instances
EsPAICoESub 802d84ab-3189-4221-8453-fcc30c8dc8ea 203 resources ~2,500/month Production (prd), 3 Azure OpenAI instances

Note: Cost figures intentionally omitted pending comprehensive FinOps Hub deployment. Focus: 50+ EVA-JP APIs requiring APIM cost attribution across environments.


Permission Model

What we have:

  • Cost Management Contributor (read cost data, configure exports via Portal)
  • Cost Management Reader (view costs)
  • Azure CLI authentication (az login --tenant 9ed55846-...)

What we lack (and why it matters):

  • Contributor/Owner on subscription: Cannot register resource providers, deploy resources, or configure exports programmatically via REST API
  • Storage Data Contributor: Cannot write to storage via CLI/SDK (Azure Policy firewall blocks non-Portal access)

Workaround: All configuration done via Azure Portal GUI, which bypasses both permission and firewall restrictions.


Project Structure

14-az-finops/
  README.md                                    # This file
  requirements.txt                             # Python deps (azure-mgmt-costmanagement, pandas)
  
  docs/
    finops/                                    # ? NEW: Phase 1 FinOps Enterprise Roadmap
      README.md                                # Navigation index (390 lines)
      00-current-state-inventory.md            # Evidence-based snapshot (268 lines)
      01-gap-analysis-finops-hubs.md           # 7 gaps vs Microsoft reference (398 lines)
      02-target-architecture.md                # Mermaid diagrams + code samples (560 lines)
      03-deployment-plan.md                    # 4 phases, 9 weeks, Bicep modules (480 lines)
      PHASE1-DEPLOYMENT-CHECKLIST.md           # ? Operational guide (480 lines, copy-paste ready)
      04-backlog.md                            # 5 epics, 68 story points (340 lines)
      05-evidence-pack.md                      # Validation commands (340 lines)
    EXPORTATION-BEST-PRACTICES.md              # Export strategy and lessons learned
    RATE-LIMITING-BEST-PRACTICES.md            # API rate limit handling
  
  tools/
    finops/
      az-inventory-finops.ps1                  # ? Automated inventory (650 lines, TESTED)
      out/                                     # Baseline artifacts (12 JSON files, 2026-02-17 09:09:32)
        storage-containers-20260217-090932.json
        apim-apis-20260217-090932.json
        eventgrid-subscriptions-20260217-090932.json
        storage-rbac-20260217-090932.json
        # ... 8 more JSON files
  
  scripts/
    extract_costs_sdk.py                       # Python SDK - ad-hoc queries (<5K rows)
    Backfill-Costs-REST.ps1                    # REST API historical backfill (1-month tested)
    migrate-costs-to-raw.ps1                   # [DONE P1] 28 blobs migrated to raw/costs/
    backfill-historical.ps1                    # [DONE P2] Trigger ADF for all 28 historical blobs
    adx-cluster.bicep                          # [DONE P2] marcofinopsadx Dev SKU, canadacentral
    managed-identity.bicep                     # [DONE P2] mi-finops-adf + RBAC
    assign-rbac-roles.ps1                      # [DONE P2] Storage Blob Data Contributor on mi-finops-adf
    create-eventgrid-subscription.ps1          # [DONE P2] Event Grid -> ADF webhook wiring
    create-schema.kql                          # [DONE P2] Combined schema (all tables + mappings)
    run-schema.ps1                             # [DONE P2] Individual KQL file runner (REST API)
    exec-kql.ps1                               # [DONE P2] Generic KQL query runner
    deploy-adf-artefacts.ps1                   # [DONE P2] Deploy ADF linked services/datasets/pipelines
    redeploy-adf-artefacts.ps1                 # [DONE P2] Redeploy ADF artefacts
    deploy-functions.ps1                       # [DONE P3] Deploy NormalizedCosts + AllocateCostByApp
    lifecycle-policy.json                      # [DONE P1] 90d Cool, 180d Archive
    kql/
      01-raw-costs-table.kql                   # [DONE P2] raw_costs schema
      02-ingestion-mapping.kql                 # [DONE P2] CostExportMapping (ordinals verified)
      03-raw-costs-retention.kql               # [DONE P2] 365d retention
      06-apim-usage-table.kql                  # [DONE P2] apim_usage schema
      07-apim-usage-retention.kql              # [DONE P2] 365d retention
      08-normalized-costs-function.kql         # [DONE P3] NormalizedCosts() + attribution fallbacks
      09-allocate-cost-function.kql            # [DONE P3] AllocateCostByApp() 3-tier attribution
    adf/
      linked-services/                         # [DONE P2] ls_marcosandbox_blob, ls_marcofinops_adx
      datasets/                                # [DONE P2] ds_blob_cost_csv, ds_adx_raw_costs
      pipelines/                               # [DONE P2] ingest-costs-to-adx (RFC4180 fixed)
    Configure-EsDAICoESub-CostExport.ps1       # Export config (blocked by provider)
    Configure-EsPAICoESub-CostExport.ps1       # Export config (blocked by provider)
    azure_inventory.py                         # Resource inventory
    offline-packages/                          # Offline .whl files for ESDC workstation
  
  portal-exports/                              # Manual Portal downloads (Jan 27, 2026)
  output/
    historical/                                # Target for backfill output
      EsDAICoESub/                             # REST API backfill (1-month tested)
      EsPAICoESub/                             # REST API backfill (1-month tested)
  archive/
    20260216-audit-cleanup/                    # 96+ archived files from 3-week investigation

Deployment Roadmap

[PASS] Phase 0: Data Collection Foundation (COMPLETE - Feb 16, 2026 -- Re-verified Feb 25, 2026)

Task Status Notes
Configure EsDAICoESub daily export (Portal) [PASS] CONFIRMED 10 consecutive runs Feb 16-25, Portal run history verified 2026-02-25
Configure EsPAICoESub daily export (Portal) [PASS] DONE Active, succeeded (Feb 16, 11:36 AM) -- Portal verification pending
Test Backfill-Costs-REST.ps1 for historical data [PASS] DONE 1-month test successful (January 2026)
Verify export data landing in Blob storage [PASS] DONE Confirmed 2.3MB + 371KB exports present
Capture baseline inventory [PASS] DONE 12 JSON artifacts, 2026-02-17 09:09:32 AM ET
Create comprehensive roadmap [PASS] DONE 8 deliverables, 4,000+ lines, 1,800+ lines code
Data model updated [PASS] DONE 2026-02-25 maturity=active phase=Phase 1 - Storage Foundation rv=10

[PASS] Phase 1: Storage Foundation ? COMPLETE (Feb 25, 2026)

Status: [PASS] ALL 6 TASKS COMPLETE ? See PHASE1-DEPLOYMENT-CHECKLIST.md for evidence

Storage Audit (2026-02-25 ? all containers confirmed):

Container Status Required by Phase 1 Result
costs [PASS] (since Feb 3) Preserved (original) Retained as source
processed [PASS] (since Feb 6) Partial match Retained
config [PASS] (since Feb 11) Not in spec Retained
ingestion [PASS] (since Feb 11) Not in spec Retained
msexports [PASS] (since Feb 11) Not in spec Retained
raw [PASS] Created Feb 25 YES Task 1.1.1 DONE
archive [PASS] Created Feb 25 YES Task 1.1.1 DONE
checkpoint [PASS] Created Feb 25 YES Task 1.1.1 DONE

6 Deployment Tasks ? All Complete:

Task Size Description Result
1.1.1 XS (1pt) Create storage containers (raw, archive, checkpoint) [PASS] Feb 25
1.1.2 S (2pt) Lifecycle policy (90d Cool, 180d Archive) [PASS] lifecycle-policy.json deployed Feb 25
1.1.3 M (3pt) Migrate exports to raw/costs/ hierarchy [PASS] 28 blobs migrated ? migration-log-20260225-*.txt
1.2.1 S (2pt) Update export destinations (Portal) [PASS] EsDAICoESub + EsPAICoESub pointing to raw/
1.3.1 S (2pt) Verify Event Grid system topic [PASS] Active
1.3.2 M (3pt) Event subscription ? ADF webhook [PASS] create-eventgrid-subscription.ps1 deployed Feb 26

Evidence: scripts/migration-log-20260225-17*.txt (28 blobs confirmed), scripts/lifecycle-policy.json


[PASS] Phase 2: Analytics & Ingestion ? COMPLETE (Feb 26, 2026)

Status: [PASS] ALL TASKS COMPLETE ? marcofinopsadx live, ADF pipeline deployed, 28 historical blobs backfilled

Completed Tasks:

Task Description Artifact Result
2.1.1 ADX cluster marcofinopsadx Dev(No SLA)_Standard_D11_v2 scripts/adx-cluster.bicep [PASS] canadacentral, finopsdb
2.1.2 ADX schema: raw_costs + CostExportMapping + apim_usage scripts/kql/01-07-*.kql [PASS] Ordinals verified from live CSV
2.1.2 KQL functions: NormalizedCosts() + AllocateCostByApp() scripts/kql/08-09-*.kql [PASS] 3-tier attribution fallbacks
2.2.1 Managed identity mi-finops-adf + RBAC (Storage Blob Data Contributor) scripts/managed-identity.bicep, assign-rbac-roles.ps1 [PASS] Feb 26
2.2.2 ADF linked services (blob + ADX) scripts/adf/linked-services/ [PASS] ls_marcosandbox_blob, ls_marcofinops_adx
2.2.3 ADF datasets (blob CSV + ADX raw_costs) scripts/adf/datasets/ [PASS] ds_blob_cost_csv, ds_adx_raw_costs
2.2.4 ADF pipeline ingest-costs-to-adx (Blob ? ADX + checkpoint) scripts/adf/pipelines/ingest-costs-to-adx.json [PASS] RFC4180 escaping fixed
2.3 Backfill: triggered pipeline for all 28 historical blobs scripts/backfill-historical.ps1 [PASS] 28 runs triggered Feb 26

ADX Database: finopsdb on marcofinopsadx.canadacentral.kusto.windows.net
ADF Factory: marco-sandbox-finops-adf (EsDAICoE-Sandbox RG)


?? Phase 3: APIM Attribution & Telemetry ? IN PROGRESS (Feb 26, 2026)

Status: [IN PROGRESS] ? Attribution KQL layer v2 complete (tag bug fixed, ESDC dimensions live); APIM policy + telemetry pipeline pending

Completed (Feb 26):

Task Description Artifact Result
3.3a Critical bug-fix: Tags exported WITHOUT {} ? parse_json() returned null for 100% of rows. Fixed with iif(? !startswith "{", strcat("{",?,"}"), ?). 08-normalized-costs-function.kql [PASS] 460,594 rows now parsed
3.3b NormalizedCosts() v2 ? dual-schema (SSC Standard 77.8% + Legacy 22.2%), CanonicalEnvironment (Dev/Stage/Prod), 6 new ESDC dimensions: SscBillingCode, FinancialAuthority, OwnerManager, ClientBu, ProjectDisplayName, IsSharedCost scripts/kql/08-normalized-costs-function.kql [PASS] Deployed
3.3c AllocateCostByApp() v2 ? 3-tier: header (APIM telemetry) ? tag ? pre-apim; outputs SscBillingCode + CanonicalEnvironment scripts/kql/09-allocate-cost-function.kql [PASS] Deployed
3.3d Deploy + smoke-test (6 validation queries) ? 460,609 rows: Dev 72.5% / Prod 15.4% / Stage 12% scripts/deploy-functions.ps1, smoke-test-v2.ps1 [PASS] Evidence saved

Remaining Tasks:

Task Description Blocker / Notes
3.1 APIM base policy ? inject x-caller-app, x-costcenter, x-eva-environment headers Needs APIM policy deployment on 50+ EVA-JP APIs
3.2 App Insights diagnostics ? 100% sampling, log all APIM requests Needs App Insights resource linked to APIM
3.4 Telemetry ingestion pipeline ? App Insights ? apim_usage table in ADX Needs ADF pipeline or Logic App trigger

Business Value: Once APIM policy live, Tier 1 attribution activates automatically (pre-apim rows in ADX flip to header-attributed)
Business Value: 50+ EVA-JP APIs (38+ active endpoints) get granular per-app cost visibility

Dependencies: Phase 2 complete [DONE]


?? Phase 4: Reporting & Governance (Weeks 8-9, 21 Story Points)

Status: ? [PLANNED] - See 03-deployment-plan.md ? Phase 4

Key Tasks:

  • Deploy 3 Power BI reports (cost trend, allocation, tag compliance)
  • Implement Azure Policy (tag enforcement: CostCenter required)
  • Configure VNet + private endpoints (storage, ADX)
  • Deploy GitHub Actions CI/CD workflow
  • Final validation (all commands from 05-evidence-pack.md)

Success Metrics:

  • Cost Data Completeness: 100% (12 months backfilled)
  • Ingestion Latency: <30 minutes
  • APIM Attribution Coverage: >80% requests tagged
  • Tag Compliance: >90% resources
  • Cost Allocation Accuracy: ?5%
  • Power BI Query Performance: <3 seconds

?? Deployment Timeline Summary

Phase Duration Story Points Status Key Deliverable
Phase 0 3 weeks N/A ? [COMPLETE] Daily exports + Comprehensive roadmap
Phase 1 2 weeks 13 ? [COMPLETE] Feb 25 Storage foundation + 28 blobs migrated to raw/
Phase 2 2 weeks 21 ? [COMPLETE] Feb 26 ADX + ADF + Schema + 28-blob backfill triggered
Phase 3 3 weeks 13 ?? [IN PROGRESS] Feb 26 APIM attribution ? KQL layer done, APIM policy pending
Phase 4 2 weeks 21 ? [PLANNED] Power BI + Governance
TOTAL 9 weeks 68 points 15 story points/sprint Enterprise FinOps capability

Quick Start

Ad-Hoc Cost Query (Small Date Ranges)

# Authenticate
az login --tenant 9ed55846-8a81-4246-acd8-b1a01abfc0d1

# Query costs for a few days (SDK works for <5000 rows)
python scripts/extract_costs_sdk.py --subscription d2d4e571-e0f2-4f6c-901a-f88f7669bcba --start-date 2026-02-10 --end-date 2026-02-16

Configure Daily Export (Portal)

  1. Go to Azure Portal > Cost Management > Exports
  2. Select subscription scope
  3. Create new export: Type=ActualCost, Schedule=Daily, Format=CSV
  4. Storage: marcosandboxfinopshub, container=costs, directory={SubscriptionName}
  5. Set date range and save

Historical Backfill (REST API)

# Test with 1 month
.\scripts\Backfill-Costs-REST.ps1 -MonthsToBackfill 1

# Full 12 months
.\scripts\Backfill-Costs-REST.ps1 -MonthsToBackfill 12

Lessons Learned (Audit Summary)

This project went through 10 extraction attempts over 3 weeks (Jan 27 - Feb 16, 2026) cycling between SDK, CLI, REST API, and Portal before identifying two fundamental blockers:

  1. Python SDK pagination is broken: azure-mgmt-costmanagement 4.0.1 on Python 3.13 throws TypeError: Session.request() got an unexpected keyword argument 'skip_token' after page 1-2. All chunking strategies (monthly, weekly, 10-day, daily) still hit this ceiling at 5,000-10,000 rows. The data/historical/ files that appeared complete were silently truncated at exactly 5,001 rows (1 header + 5,000 data = single page).

  2. No Contributor permissions: Cannot register resource providers, deploy infrastructure, or configure exports programmatically. Azure Policy firewall blocks CLI/PowerShell storage access.

What actually works: Azure Portal for export configuration (bypasses both blockers), and az rest for direct REST API calls with manual nextLink pagination.

Extraction Attempt Timeline

# Date Method Result Blocker
1 Jan 27 Manual Portal download Partial success Manual, not automated
2 Jan 29 FinOps Toolkit (Deploy-FinOpsHub) Failed Region restrictions + no permissions
3 Jan 30 CLI (az costmanagement export create) Failed Extension install fails (Windows)
4 Jan 31 REST API export config Failed CostManagementExports provider not registered
5 Feb 1 Python SDK (small query) Success (28 rows) Worked but only because <5,000 rows
6 Feb 6 SDK 12-month extraction Silently truncated All 11 CSVs capped at 5,001 rows
7 Feb 10 Deploy-ESDC-FinOps.ps1 Failed AuthorizationFailed (no Contributor)
8 Feb 15 Scope change + deploy retry Failed Same permission blocker
9 Feb 16 SDK weekly/10-day chunking Still truncated SDK pagination broken regardless of chunk size
10 Feb 16 Portal native export config Success Bypasses all blockers

See archive/20260216-audit-cleanup/ for the full history of attempts (96+ archived files).


?? Documentation Quick Links

For Implementation Teams

For Executives

  • Executive Summary: docs/finops/README.md ? Business Value section
  • Investment: $350/month incremental (ADX Dev SKU + storage/networking)
  • Timeline: 9 weeks (4.5 sprints at 15 points/sprint)
  • ROI: Cost visibility for 1,403 resources + 50+ EVA-JP APIs, tag compliance >90%, chargeback capability

For Auditors


Project Owner: Marco Presta (marco.presta@hrsdc-rhdcc.gc.ca)
Last Updated: February 26, 2026
Current Phase: Phase 3 ? APIM Attribution & Telemetry [IN PROGRESS]
ADX: marcofinopsadx.canadacentral.kusto.windows.net / finopsdb | ADF: marco-sandbox-finops-adf

About

Azure FinOps cost management hub with daily exports

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors