Context
PinchBench currently has a set of tasks for evaluating coding agents. We should expand coverage to better represent real-world coding scenarios.
Questions
- What types of tasks are missing from current benchmarks?
- Are there specific failure modes we want to test for?
- Should we include multi-file refactoring, debugging, or documentation tasks? (A rough task sketch follows this list.)
- Any language/framework gaps?
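
To make the discussion concrete, here is a minimal sketch of how a multi-file refactoring task could be described. This does not reflect PinchBench's actual task format; the field names (`task_id`, `repo`, `verify_cmd`, etc.) and the example repository URL are purely illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class BenchTask:
    """Hypothetical task record; PinchBench's real schema may differ."""
    task_id: str
    category: str                  # e.g. "refactor", "debug", "docs"
    repo: str                      # repository the agent starts from
    prompt: str                    # instruction given to the agent
    files_in_scope: list[str] = field(default_factory=list)
    verify_cmd: str = "pytest -q"  # command whose exit code scores the attempt


# Example: a multi-file refactoring task graded by the project's test suite.
example = BenchTask(
    task_id="refactor-001",
    category="refactor",
    repo="https://example.com/sample-project.git",
    prompt=(
        "Extract the duplicated retry logic in client.py and worker.py "
        "into a shared helper module without changing behavior."
    ),
    files_in_scope=["client.py", "worker.py"],
)

if __name__ == "__main__":
    print(example)
```

The point of the sketch is the shape of the question: what categories, scoping information, and verification signals would a new task need to carry for the failure modes we care about?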
Input Wanted
Looking for ideas from anyone running benchmarks or building coding agents. What would be most useful to measure?