Hi Dan, hope all is well
Would love an addition to benchy that compares small, fast models at @2 or @3 passes.
If a model is 4x faster, you can give it more passes and see if & how accuracy improves. I'm thinking R1 1.5B @3 passes compared to 72B @1 pass
What do you think?
Even a "Generic" 2nd or 3rd step such as:
{chat_history}
Would you change anything about your previous answer? Are there any corrections, fixes, or improvements you can make? Completing the user's task accurately and correctly is incredibly important.
Or something like that might actually improve accuracy
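A minimal sketch of what that @N-pass loop could look like — `call_model` here is just a stand-in for whatever client benchy already uses, and the toy model below exists only so the sketch runs:

```python
# Hypothetical @N-pass self-refinement loop. The refine prompt is the
# "generic" 2nd/3rd step from above; call_model is an assumed interface.
REFINE_PROMPT = (
    "{chat_history}\n"
    "Would you change anything about your previous answer? Are there any "
    "corrections, fixes, or improvements you can make? Completing the "
    "user's task accurately and correctly is incredibly important."
)

def run_passes(call_model, task: str, passes: int = 3) -> str:
    """Run the task once, then feed the transcript back passes - 1 times."""
    history = f"User: {task}"
    answer = call_model(history)
    for _ in range(passes - 1):
        history += f"\nAssistant: {answer}"
        answer = call_model(REFINE_PROMPT.format(chat_history=history))
    return answer

# Toy stand-in model so the sketch is runnable: counts how often it's called.
calls = []
def toy_model(prompt: str) -> str:
    calls.append(prompt)
    return f"answer v{len(calls)}"

print(run_passes(toy_model, "2 + 2 = ?", passes=3))  # → answer v3
```

So R1 1.5B @3 is just `run_passes(r1_client, task, passes=3)` scored against the 72B @1 baseline.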
And I bet you can use benchy to test which "generic" 2nd-pass prompt works best, or even build a router to different 2nd or 3rd passes that are domain- or context-specific, and benchmark the router flows
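The router could start as dumb as keyword matching — everything below (the prompt table, the keywords) is made up for illustration, since benchy would supply the real routes:

```python
# Hypothetical router: pick a domain-specific 2nd-pass prompt by keyword,
# so benchy can score each route separately. Names/keywords are assumptions.
SECOND_PASS_PROMPTS = {
    "code": "{chat_history}\nRe-check your code for bugs, edge cases, and "
            "off-by-one errors. Fix anything you find.",
    "math": "{chat_history}\nRe-derive your result step by step and verify "
            "the arithmetic before giving a final answer.",
    "generic": "{chat_history}\nWould you change anything about your "
               "previous answer? Accuracy is incredibly important.",
}

def route(task: str) -> str:
    text = task.lower()
    if any(k in text for k in ("def ", "function", "bug", "compile")):
        return "code"
    if any(k in text for k in ("integral", "solve", "equation", "sum")):
        return "math"
    return "generic"

print(route("Solve the equation x^2 = 4"))   # → math
print(route("Fix the bug in this function")) # → code
```

Then "which generic prompt is best" and "does routing beat generic" become two benchy runs over the same task set.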
Also with @2+ passes you can make N async calls to different providers simultaneously, trigger the 2nd layer after n/2 (or some threshold) have replied, and have the 2nd layer choose the best solution or return a hybrid/fixed final response to benchy.
... Like a MoE approach with an auction/race: use super fast and cheap models, and you might get the job done cheaper/faster with accuracy matching or exceeding the big models
Have never seen that being benched
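The race idea above could be sketched like this — the providers and the judge are simulated stand-ins (real ones would be benchy's API clients and a 2nd-layer model), but the fire-N, wait-for-thresh, cancel-the-rest mechanics are real asyncio:

```python
import asyncio

# Hypothetical race: fire N provider calls at once, move on after `thresh`
# have replied, cancel the stragglers, and let a 2nd-layer judge pick.
async def race(providers, prompt, thresh, judge):
    tasks = [asyncio.create_task(p(prompt)) for p in providers]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    while len(done) < thresh:  # keep waiting until thresh calls finished
        more, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        done |= more
    for t in pending:
        t.cancel()  # drop the slow/expensive stragglers
    return judge([t.result() for t in done])

def make_provider(name, delay):
    async def call(prompt):
        await asyncio.sleep(delay)  # simulated provider latency
        return f"{name}: {prompt}"
    return call

providers = [make_provider(f"p{i}", 0.01 * i) for i in range(4)]
best = asyncio.run(race(providers, "task", thresh=2,
                        judge=lambda answers: sorted(answers)[0]))
print(best)  # the two fastest providers finish; judge picks "p0: task"
```

Benchy could then log latency and accuracy per (thresh, judge) combo to see where the cheap-model swarm crosses the big-model baseline.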
Lmk what you think
All the best!