
Runbook and follow-up from PD incident Q27IUX3O3X3EKP: IndexOutOfRangeException in DashboardController.ExtractCriticalSegment #131

@mrsharm

Incident reference: Q27IUX3O3X3EKP
Incident link: https://microsoft-sre-agent-test.pagerduty.com/incidents/Q27IUX3O3X3EKP
Service: Test Service
Priority: P1 (high)
Status at time of capture: triggered (created 2026-01-17T23:05:23Z)

Summary

  • Title: orchardcorecmsweb2 - IndexOutOfRangeException resulting in Admin Page Breaking
  • Symptom: Admin dashboard inaccessible due to unhandled IndexOutOfRangeException
  • Component: DashboardController.ExtractCriticalSegment()
  • Source: Security audit system - User-Agent parsing failure
  • Message: "Index was outside the bounds of the array."

Suspected/Documented Root Cause

  • AI analysis: Unvalidated input parsing
  • Description: Parsing logic assumes a fixed structure and indexes into arrays or lists without bounds checks. Unexpected or malformed input causes out-of-range access and unhandled exceptions during request processing.
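
A minimal sketch of this failure mode, written in Python purely for illustration. The real ExtractCriticalSegment is C# and its logic is not shown in this issue, so the function name `extract_critical_segment_naive`, the split on "/", and the index used are all assumptions. Python's IndexError plays the role of .NET's IndexOutOfRangeException here.

```python
def extract_critical_segment_naive(user_agent: str) -> str:
    """Hypothetical stand-in for the unguarded parsing pattern (not the real code)."""
    # Assumes every header looks like "Product/Version ..." and indexes blindly.
    tokens = user_agent.split("/")
    return tokens[1].split(" ")[0]  # raises IndexError when there is no "/"


def try_extract(user_agent: str) -> str:
    """Report the outcome for one input instead of letting the exception propagate."""
    try:
        return "ok: " + extract_critical_segment_naive(user_agent)
    except IndexError as exc:
        return "failed: " + type(exc).__name__
```

A well-formed header such as "Mozilla/5.0 (Windows NT 10.0)" parses, while a bare token such as "curl" or an empty string has no second element to index and fails.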

Resolution (as per incident notes/summary)

  • Mitigation: Temporarily bypassed User-Agent parsing for affected requests to restore access
  • Fix: Added bounds checks and robust parsing in ExtractCriticalSegment to handle malformed User-Agent strings; redeployed
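
The mitigation above can be sketched as a flag-guarded bypass. This is an illustrative Python sketch, not the deployed change: the flag name `bypass_ua_parsing_for_admin`, the "/admin" path prefix, and the "unknown" default are assumptions, and in the real service the toggle would come from configuration or a feature-flag provider.

```python
from dataclasses import dataclass
from typing import Callable

DEFAULT_SEGMENT = "unknown"  # assumed safe default when parsing is skipped


@dataclass
class OpsFlags:
    """Hypothetical ops toggle, normally loaded from config or a flag service."""
    bypass_ua_parsing_for_admin: bool = False


def segment_for_request(
    path: str,
    user_agent: str,
    flags: OpsFlags,
    parser: Callable[[str], str],
) -> str:
    """Skip User-Agent parsing for admin paths while the bypass flag is on."""
    if flags.bypass_ua_parsing_for_admin and path.lower().startswith("/admin"):
        return DEFAULT_SEGMENT
    # All other requests still go through the (possibly fragile) parser.
    return parser(user_agent)
```

Scoping the bypass to admin paths restores access without silently disabling the audit parsing everywhere else.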

Actionable Runbook (Draft)

  1. Detection
  • Alerts: Exception spikes (IndexOutOfRangeException), 5xx errors, Admin page load failures
  • Dashboards/APM: Monitor error rate and p95 latency for orchardcorecmsweb2
  • Logs: Filter by ExceptionType=IndexOutOfRangeException and path containing Admin
  2. Immediate Mitigation
  • If admin is blocked, deploy a feature flag or config toggle to bypass User-Agent parsing for admin endpoints only
  • If no feature flag is available, ship a hotfix that wraps the ExtractCriticalSegment call in try/catch and falls back to safe defaults instead of failing the request
  3. Triage and Diagnosis
  • Collect a sample of failing User-Agent strings from logs covering the last 15–30 minutes
  • Reproduce the failure locally or in a unit test by calling ExtractCriticalSegment with those inputs
  • Inspect every array/list index access; add length and null checks
  4. Remediation
  • Implement defensive parsing:
    • Validate the token count before indexing
    • Use TryParse-style methods with fallback defaults
    • Centralize parsing in a tolerant parser that returns a Result type
  • Add unit tests covering malformed/edge UA patterns
  • Roll out via the standard CI/CD pipeline
  5. Verification
  • Post-deploy: Verify zero IndexOutOfRangeException entries in logs for 30–60 minutes
  • Confirm the Admin dashboard is accessible and functional
  • Confirm error rate stays below baseline + 1% and p95 latency stays within SLO
  6. Prevention/Hardening
  • Adopt an input validation library for UA parsing
  • Configure a circuit breaker or feature flag to disable UA parsing on an anomaly spike
  • Add synthetic tests for the Admin page
  7. Rollback Plan
  • If errors persist, roll back to the last known-good build and temporarily re-enable the UA parsing bypass
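
The defensive parsing listed under Remediation could look roughly like the sketch below. It is written in Python for illustration only; the production fix lives in C# and is not shown in this issue. The `ParseResult` type, the `parse_critical_segment` name, the "Product/Version" token layout, and the "unknown" default are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_SEGMENT = "unknown"  # assumed fallback value


@dataclass(frozen=True)
class ParseResult:
    """Hypothetical Result type: callers check `ok` instead of catching exceptions."""
    ok: bool
    segment: str
    error: Optional[str] = None


def parse_critical_segment(user_agent: Optional[str]) -> ParseResult:
    """Tolerant parser: validates length before every index access."""
    if user_agent is None or not user_agent.strip():
        return ParseResult(False, DEFAULT_SEGMENT, "empty User-Agent")
    tokens = user_agent.strip().split("/")
    if len(tokens) < 2:  # bounds check before tokens[1]
        return ParseResult(False, DEFAULT_SEGMENT, "no '/' separator")
    parts = tokens[1].split()
    if not parts:  # nothing after the separator
        return ParseResult(False, DEFAULT_SEGMENT, "missing version after '/'")
    return ParseResult(True, parts[0])
```

Malformed inputs such as None, "curl", or "Mozilla/" return a failed result carrying the default segment, which makes them straightforward to cover in unit tests alongside real samples collected from the logs.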

Key Commands/Queries (examples)

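  • Logs (Kusto/ADX/Azure Monitor; table and column names are examples, adjust to your schema. In workspace-based Application Insights, exceptions are normally in AppExceptions and the time column is TimeGenerated):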
    AppTraces
    | where ExceptionType == "IndexOutOfRangeException" and Controller == "Dashboard" and Method == "ExtractCriticalSegment"
    | summarize count() by bin(Timestamp, 5m), Message

  • Error Rate and Latency checks (replace with your metric names)
    requests
    | summarize err_rate = avg(toint(status >= 500)), p95_latency = percentile(durationMs, 95) by bin(Timestamp, 5m)

Follow-ups / To-Dos

  • Confirm bounds checks merged in DashboardController.ExtractCriticalSegment
  • Add unit tests for malformed UA strings (attach examples from logs)
  • Add feature flag for UA parsing and document ops toggle
  • Create synthetic monitor for Admin page accessibility
  • Post-incident review to document precise patterns that caused failures

Context Artifacts

  • Incident Notes: Resolution Note: DONE by Dheeraj Bandaru
  • Impacted Component: orchardcorecmsweb2 Admin dashboard

Please use this issue to track the remaining hardening and tests. Once complete, link PRs and close the incident follow-up.

This issue was created by mrsharm-sri1111--3f136ed8
Tracked by the SRE agent here
