feat: Pass traces of failures to dashboard

When a workflow or task fails today, the dashboard surfaces only the **error message** that bubbled up from the runtime. Full exception *traces* are streamed to the backend logs but are **not accessible from the UI**. This forces developers to leave the dashboard and dig through log aggregators to diagnose problems.

## Goal

Expose the complete trace associated with a failed run directly in the dashboard so that developers can download it with a single click.

## Current Behaviour

1. The backend returns a JSON payload for failed runs that contains only the `error_message` string.
2. The dashboard lists each run with a red status chip and the error message (truncated after ~120 chars).
3. There is no UI affordance to retrieve the trace.

## Proposed Solution

### Backend

- Augment failure payloads with a field that resolves to a stored trace (e.g. S3 object, database blob).
- Expose **`GET /api/runs/{run_id}/trace`** that returns
  - `Content-Disposition: attachment; filename="{run_id}_trace.txt"`
  - Raw text stack-trace in the body.
- Keep response size reasonable (<10 MB) via gzip or truncation of middle frames.
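The response headers and the "truncation of middle frames" option could be sketched roughly as follows. This is a framework-agnostic illustration, not the project's implementation: the helper names `trace_download_headers` and `truncate_middle`, the `Content-Type` value, and the wording of the omission marker are all assumptions.

```python
MAX_TRACE_BYTES = 10 * 1024 * 1024  # the <10 MB budget mentioned above


def trace_download_headers(run_id: str) -> dict[str, str]:
    """Headers for GET /api/runs/{run_id}/trace (Content-Type is an assumption)."""
    return {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Disposition": f'attachment; filename="{run_id}_trace.txt"',
    }


def truncate_middle(trace: str, max_bytes: int = MAX_TRACE_BYTES) -> str:
    """Keep the first and last lines of a trace and drop lines from the middle.

    Lines are taken alternately from the top and the bottom until the byte
    budget is used up, so the root call and the failing frame both survive.
    """
    if len(trace.encode("utf-8")) <= max_bytes:
        return trace

    lines = trace.splitlines()
    marker = "... [{n} lines omitted] ..."
    # Reserve room for the marker (sized for the worst case) plus its newline.
    budget = max_bytes - (len(marker.format(n=len(lines)).encode("utf-8")) + 1)

    head: list[str] = []
    tail: list[str] = []
    i, j = 0, len(lines) - 1
    take_head = True
    while i <= j:
        line = lines[i] if take_head else lines[j]
        cost = len(line.encode("utf-8")) + 1  # +1 for the joining newline
        if cost > budget:
            break
        budget -= cost
        if take_head:
            head.append(line)
            i += 1
        else:
            tail.append(line)
            j -= 1
        take_head = not take_head

    omitted = j - i + 1
    return "\n".join(head + [marker.format(n=omitted)] + tail[::-1])
```

Truncating on line boundaries keeps individual frames readable. If gzip is chosen instead, the same headers would also need a `Content-Encoding: gzip` entry.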

### Dashboard UI

1. Replace the static error-message cell with an **interactive chip**.
2. On hover / click, open a popover showing the full error message and a **"View Trace"** button.
3. Clicking the button downloads the trace as `{run_id}_trace.txt` (the filename set by the endpoint's `Content-Disposition` header) using the native browser download flow.
4. No in-browser rendering required; developers can open in their IDE of choice.

### Acceptance Criteria

- [ ] Failed run rows display a clickable element that opens the popover.
- [ ] Popover contains the full error message without truncation.
- [ ] "View Trace" button triggers a file download of the trace.
- [ ] If a trace is unavailable, the button is disabled and a tooltip explains why.
- [ ] API endpoint is authenticated and honours RBAC (same as runs endpoint).
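One possible reading of the last two criteria, expressed as the status code the trace endpoint might return. The function name `trace_response_status`, its parameters, and the choice of `404` for a missing trace are assumptions for illustration only. The issue does not specify them.

```python
from http import HTTPStatus
from typing import Optional


def trace_response_status(
    authenticated: bool,
    can_view_run: bool,
    trace: Optional[str],
) -> HTTPStatus:
    """Map the caller's access and the trace's availability to a status code."""
    if not authenticated:
        return HTTPStatus.UNAUTHORIZED  # no valid credentials
    if not can_view_run:
        return HTTPStatus.FORBIDDEN  # same RBAC rule as the runs endpoint
    if trace is None:
        return HTTPStatus.NOT_FOUND  # UI would disable the button in this case
    return HTTPStatus.OK
```

Under this sketch the dashboard could decide whether to disable the button from the failure payload itself (for example, when the trace field is absent) rather than by probing the endpoint.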

## Open Questions

1. **Storage location** – Persist traces in S3 vs database? Expected retention?
2. **Large traces** – Impose max file size or stream?
3. **Security** – Do we need additional masking (e.g. secrets) before exposing traces?
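To make the masking question concrete, here is a minimal sketch of what redaction before persistence could look like. The key names in the pattern and the `mask_secrets` helper are hypothetical. A real deployment would need a reviewed list of patterns, and this version does not handle quoted values that contain spaces.

```python
import re

# Hypothetical list of key names whose values should never reach the UI.
_SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|token|api[_-]?key)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def mask_secrets(trace: str) -> str:
    """Replace the value of any `key=value` / `key: value` pair with ***."""
    return _SECRET_PATTERN.sub(
        lambda m: f"{m.group(1)}{m.group(2)}***",
        trace,
    )
```

For example, `mask_secrets("DB_PASSWORD=hunter2")` yields `DB_PASSWORD=***`, while lines without a matching key pass through unchanged.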

## Definition of Done

- Backend PR merged exposing the new endpoint and trace persistence logic.
- Dashboard PR merged implementing the UI changes.
- E2E test: trigger a synthetic failure and assert that the downloaded trace matches backend logs.

feat: Pass traces of failures to dashboard #634
