[FEATURE] Interface improvements to execute simple agent invocation #135

@poshinchen

Description

Problem Statement

Simplify the evaluation execution interface to reduce boilerplate when evaluating agents, making it easier to trigger agent calls without always requiring a custom task function wrapper.

Current State

Today, running evaluations with experiment.run_evaluations() requires users to define a task function that:

  • Takes a Case object as input
  • Manually instantiates and configures agents
  • Handles telemetry setup and span collection for trace-based evaluators
  • Maps spans to sessions using mappers
  • Returns either raw output or a dictionary with output, trajectory, and interactions

This pattern is repetitive across examples and creates friction for users who just want to evaluate an agent quickly.

Proposed Solution

Provide convenience methods that allow users to pass agents or agent factories directly to run_evaluations, with automatic handling of common patterns like:

  • Telemetry setup and span collection
  • Session mapping for trace-based evaluation
  • Output formatting
  • Tool trajectory extraction
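One possible shape of such a convenience layer is a helper that wraps an agent factory into the task signature `run_evaluations` expects. All names below (`default_task`, `agent_factory`, the `Case` stub, and the toy agent) are illustrative assumptions, not the proposed API itself:

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for the framework's Case type.
@dataclass
class Case:
    input: str

def default_task(agent_factory: Callable[[], Any]) -> Callable[[Case], dict]:
    """Build a task function from an agent factory, handling output
    formatting and tool-trajectory extraction automatically. Telemetry
    setup and session mapping would hook in here as well."""
    def task(case: Case) -> dict:
        agent = agent_factory()  # fresh agent per case
        result = agent.invoke(case.input)
        return {
            "output": result["text"],
            "trajectory": result.get("tool_calls", []),
        }
    return task

class EchoAgent:
    """Toy agent used only to make this sketch runnable."""
    def invoke(self, prompt: str) -> dict[str, Any]:
        return {"text": f"echo: {prompt}", "tool_calls": []}

# A convenience overload like run_evaluations(agent_factory=EchoAgent)
# could then build the task internally, so users never write one by hand.
task = default_task(EchoAgent)
```

Accepting a factory rather than an agent instance keeps each case isolated, which matters when agents hold conversation state.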

Use Case

  • Users can evaluate simple agents in 3-5 lines of code instead of 15-20
  • All existing examples continue to work without modification

Alternative Solutions

No response

Additional Context

No response

Metadata

Labels: enhancement (New feature or request)