Add diagnostic endpoints and fix critical issues for Azure Web App cpu-app investigation #101

Draft
Copilot wants to merge 2 commits into main from copilot/fix-100

Copilot AI commented Aug 19, 2025

This PR addresses performance issues in the Azure Web App cpu-app by adding diagnostic tools and fixing critical problems that could cause application crashes or resource exhaustion.

Root Cause Analysis

Investigation revealed several problematic endpoints that could cause high CPU usage, memory leaks, and application crashes:

  1. Infinite loop in /crash endpoint - The condition while (true || bytesSize < 1_000_000) is always true, so the loop never exits and keeps allocating 10MB chunks until the application runs out of memory
  2. Memory leaks - Multiple endpoints create objects that are never released, because static references and event subscriptions that are never removed keep them reachable
  3. High CPU consumption - The /work endpoint spawns multiple threads performing intensive mathematical calculations
  4. Lack of monitoring - No visibility into resource usage or endpoint behavior

Changes Made

Fixed Critical Issues

  • Fixed infinite loop in crash endpoint: Added allocation limits (100 max allocations, 1GB total limit) to prevent application termination
  • Added comprehensive logging: All problematic endpoints now log warnings when called and track resource usage
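The allocation limits described above can be sketched as follows. This is a language-neutral illustration in Python, not the code from this PR: the names AllocationTracker and allocate_bounded are hypothetical, and only the 10MB chunk size, the 100-allocation cap, and the 1GB total cap come from the description.

```python
from dataclasses import dataclass, field

CHUNK_BYTES = 10 * 1024 * 1024   # 10MB per allocation, as in the PR description
MAX_ALLOCATIONS = 100            # cap on the number of chunks held
MAX_TOTAL_BYTES = 1024 ** 3      # 1GB cap on the total bytes held


@dataclass
class AllocationTracker:
    """Hypothetical holder for the chunks the endpoint keeps alive."""
    chunks: list = field(default_factory=list)
    total_bytes: int = 0

    def can_allocate(self, size, max_allocations, max_total_bytes):
        # Both limits must hold before another chunk is allowed.
        return (len(self.chunks) < max_allocations
                and self.total_bytes + size <= max_total_bytes)


def allocate_bounded(tracker,
                     chunk_bytes=CHUNK_BYTES,
                     max_allocations=MAX_ALLOCATIONS,
                     max_total_bytes=MAX_TOTAL_BYTES):
    """Allocate chunks until either limit is reached; return how many were added."""
    added = 0
    # Unlike `while (true || ...)`, this condition can become false.
    while tracker.can_allocate(chunk_bytes, max_allocations, max_total_bytes):
        tracker.chunks.append(bytearray(chunk_bytes))
        tracker.total_bytes += chunk_bytes
        added += 1
    return added
```

With the default values the loop stops at the allocation cap first, since 100 chunks of 10MB stay just under 1GB. Passing smaller values for chunk_bytes and the limits exercises the same logic without reserving real memory.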

Added Diagnostic Endpoints

  1. GET /api/app/health - Returns quick health status with memory warnings
  2. GET /api/app/diagnostics - Provides detailed system metrics, process information, and GC statistics
  3. POST /api/app/cleanup - Emergency endpoint to clear accumulated memory allocations for recovery

Added SRE Documentation

Created SRE_INVESTIGATION_GUIDE.md with:

  • Comprehensive endpoint documentation
  • Investigation procedures
  • Emergency recovery steps
  • Monitoring recommendations

Example Usage

# Check application health
curl http://localhost:5000/api/app/health

# Get detailed diagnostics
curl http://localhost:5000/api/app/diagnostics

# Emergency memory cleanup
curl -X POST http://localhost:5000/api/app/cleanup
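
For scripted checks, the same three calls could be wrapped in a small Python helper. This is a sketch under stated assumptions: the helper names build_request and call are hypothetical, the base URL is the local one used in the curl examples, and the response body is returned as plain text because only the health response format is shown in this PR.

```python
import urllib.request

BASE_URL = "http://localhost:5000"  # local address from the curl examples

# Method and path for each endpoint listed in this PR.
ENDPOINTS = {
    "health": ("GET", "/api/app/health"),
    "diagnostics": ("GET", "/api/app/diagnostics"),
    "cleanup": ("POST", "/api/app/cleanup"),
}


def build_request(name, base_url=BASE_URL):
    """Build the HTTP request for one of the named endpoints."""
    method, path = ENDPOINTS[name]
    # An empty body on POST makes urllib send Content-Length: 0.
    data = b"" if method == "POST" else None
    return urllib.request.Request(base_url + path, data=data, method=method)


def call(name, base_url=BASE_URL, timeout=5.0):
    """Send the request and return the response body as text."""
    request = build_request(name, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A caller would run call("health") first and reach for call("cleanup") only when the health response reports warnings.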

The health endpoint will show warnings when memory usage is high:

{
  "status": "warning",
  "warnings": ["High memory usage: 960.0MB", "Memory hog active: 96 allocations"]
}
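
A response of this shape could be assembled roughly as in the sketch below. The function name build_health, both threshold values, and the "healthy" status string are assumptions made for illustration; only the "warning" status and the two warning message formats are taken from the example above.

```python
def build_health(memory_bytes, allocation_count,
                 memory_warn_mb=500.0, allocation_warn_count=50):
    """Return a health payload; the thresholds are illustrative, not from the PR."""
    warnings = []
    memory_mb = memory_bytes / (1024 * 1024)
    if memory_mb >= memory_warn_mb:
        warnings.append(f"High memory usage: {memory_mb:.1f}MB")
    if allocation_count >= allocation_warn_count:
        warnings.append(f"Memory hog active: {allocation_count} allocations")
    # "healthy" is an assumed label for the no-warning case.
    status = "warning" if warnings else "healthy"
    return {"status": status, "warnings": warnings}
```

Calling build_health(960 * 1024 * 1024, 96) reproduces the two warnings shown in the example response.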

Impact

  • Prevents crashes: The infinite loop fix prevents out-of-memory application termination
  • Enables monitoring: New diagnostic endpoints provide visibility into resource usage
  • Facilitates recovery: Cleanup endpoint allows the SRE team to recover from high-memory situations
  • Improves observability: Comprehensive logging tracks problematic endpoint usage

All changes are narrowly scoped: existing functionality is preserved, and the additions are limited to investigation and recovery capabilities.

Fixes #100.



…vestigation

Co-authored-by: mrsharm <68247673+mrsharm@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Automated Issue: Investigation Required for Azure Web App cpu-app" to "Add diagnostic endpoints and fix critical issues for Azure Web App cpu-app investigation" on Aug 19, 2025
Copilot AI requested a review from mrsharm August 19, 2025 17:13
