feat(mcp): Add per-tool-call metrics to `logs` response #24372


## Problem

The `logs` tool reports only aggregate metrics per episode: `total_tokens`,
`total_estimated_cost`, and `mcp_failure_count`. There is no breakdown by
individual tool call.

This means consumers cannot answer:
- "Which MCP tool consumes the most tokens?"
- "Which tool call failed and why?"
- "What is the latency distribution per tool?"

## Current behavior

The response contains only aggregates, for example `total_tokens: 12840` and `mcp_failure_count: 1`.

## Expected behavior

Include a `tool_calls` array per episode, with one entry per tool call:

```json
{
  "tool_calls": [
    {
      "tool": "get_file_contents",
      "server": "github",
      "tokens": 2400,
      "duration_ms": 350,
      "status": "success"
    },
    {
      "tool": "search_code",
      "server": "github",
      "tokens": 5200,
      "duration_ms": 1200,
      "status": "success"
    },
    {
      "tool": "create_pull_request",
      "server": "github",
      "tokens": 800,
      "duration_ms": 600,
      "status": "error",
      "error": "403 Resource not accessible by integration"
    }
  ]
}
```
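
To illustrate how a consumer might use this shape, here is a minimal Python sketch that answers the three questions from the Problem section. It assumes the `tool_calls` array looks exactly like the example above. The `summarize_tool_calls` helper and the summary field names are hypothetical, not part of the `logs` tool.

```python
from collections import defaultdict
from statistics import median

# Sample episode using the proposed `tool_calls` shape from this issue.
episode = {
    "tool_calls": [
        {"tool": "get_file_contents", "server": "github",
         "tokens": 2400, "duration_ms": 350, "status": "success"},
        {"tool": "search_code", "server": "github",
         "tokens": 5200, "duration_ms": 1200, "status": "success"},
        {"tool": "create_pull_request", "server": "github",
         "tokens": 800, "duration_ms": 600, "status": "error",
         "error": "403 Resource not accessible by integration"},
    ]
}


def summarize_tool_calls(tool_calls):
    """Hypothetical helper: group calls by (server, tool) and aggregate."""
    grouped = defaultdict(list)
    for call in tool_calls:
        grouped[(call["server"], call["tool"])].append(call)

    summary = {}
    for key, calls in grouped.items():
        durations = [c["duration_ms"] for c in calls]
        summary[key] = {
            "calls": len(calls),
            "tokens": sum(c["tokens"] for c in calls),
            "median_duration_ms": median(durations),
            "max_duration_ms": max(durations),
            # Assumption: `error` is only present when status == "error".
            "errors": [c.get("error", "unknown")
                       for c in calls if c["status"] == "error"],
        }
    return summary


summary = summarize_tool_calls(episode["tool_calls"])

# "Which MCP tool consumes the most tokens?"
top_tool = max(summary, key=lambda k: summary[k]["tokens"])

# "Which tool call failed and why?"
failures = {k: v["errors"] for k, v in summary.items() if v["errors"]}

print(top_tool)
print(failures)
```

Grouping by `(server, tool)` rather than by `tool` alone is a design choice in this sketch, so that tools with the same name on different servers stay separate.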
