Test Fast Small Models @2 or @3 passes #7

@fire17

Hi Dan, hope all is well
I'd love an addition to benchy that compares small, fast models at @2 or @3 passes.

If your model is 4x faster, you can give it more passes and see whether, and by how much, the accuracy improves. I'm thinking R1 1.5B @3 passes compared to 72B @1 pass.
What do you think?

Even a "Generic" 2nd or 3rd step such as:

{chat_history}
Would you change anything about your previous answer? Are there any corrections, fixes, or improvements you can make? Completing the user's task accurately and correctly is incredibly important.

Or something like that might actually improve accuracy
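
To make the idea concrete, here is a minimal sketch of an @N-pass loop built around the generic prompt above. It is not benchy code: `run_passes`, `REFINE_PROMPT`, and the `model` callable (any function that takes a prompt string and returns a completion string) are hypothetical names, and the way the chat history is formatted is an assumption.

```python
from typing import Callable, List

# Generic follow-up prompt from this issue; {chat_history} is filled in per pass.
REFINE_PROMPT = (
    "{chat_history}\n"
    "Would you change anything about your previous answer? "
    "Are there any corrections, fixes, or improvements you can make? "
    "Completing the user's task accurately and correctly is incredibly important."
)


def run_passes(
    model: Callable[[str], str],  # hypothetical: prompt in, completion out
    task: str,
    passes: int = 2,
    refine_prompt: str = REFINE_PROMPT,
) -> List[str]:
    """Return the answer produced at each pass, so every pass can be scored."""
    if passes < 1:
        raise ValueError("passes must be >= 1")

    # Pass 1: the plain task, exactly what a normal @1 benchmark would send.
    answers = [model(task)]
    history = f"User: {task}\nAssistant: {answers[0]}"

    # Passes 2..N: show the model its own history and ask it to reconsider.
    for _ in range(passes - 1):
        prompt = refine_prompt.format(chat_history=history)
        revised = model(prompt)
        answers.append(revised)
        history = f"{prompt}\nAssistant: {revised}"

    return answers
```

Returning every intermediate answer rather than only the last one would let the benchmark report accuracy at @1, @2, and @3 from a single run, and swapping `refine_prompt` is one way to compare candidate "generic" 2nd-pass prompts.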

And I bet you can use benchy to test which is the best "generic" 2nd pass prompt, or even make a router to different 2nd or 3rd passes which are domain or context specific, and benchmark the router flows

Also, with @2+ passes you can make N async calls to different providers simultaneously, trigger the 2nd layer once n/2 of them (or some other threshold) have replied, and have the 2nd layer choose the best solution or return a hybrid/fixed final response to benchy.
... Like an MoE approach with an auction/race: use super fast, cheap models, which might get the job done cheaper and faster, with accuracy comparable to or exceeding that of the big models.
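
A rough sketch of that race, using only Python's standard `asyncio`. Everything named here is an assumption for illustration: `race_then_judge`, the `fast_models` list of async callables (one per provider), the `judge` callable that plays the 2nd layer, and the wording of the judge prompt. The threshold defaults to n/2 as described above.

```python
import asyncio
from typing import Any, Callable, Coroutine, List, Optional, Sequence

# Hypothetical provider interface: async function, prompt in, completion out.
AsyncModel = Callable[[str], Coroutine[Any, Any, str]]


async def race_then_judge(
    task: str,
    fast_models: Sequence[AsyncModel],
    judge: AsyncModel,
    threshold: Optional[int] = None,
) -> str:
    """Fan the task out, stop waiting once enough drafts arrive, then judge."""
    if not fast_models:
        raise ValueError("need at least one fast model")

    # Default threshold: half of the providers (at least one).
    need = threshold if threshold is not None else max(1, len(fast_models) // 2)
    need = min(need, len(fast_models))

    pending = {asyncio.create_task(m(task)) for m in fast_models}
    drafts: List[str] = []

    # Collect successful replies until at least `need` drafts are in.
    while pending and len(drafts) < need:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for finished in done:
            if finished.exception() is None:  # skip providers that failed
                drafts.append(finished.result())

    # The race is over: drop the slow providers.
    for straggler in pending:
        straggler.cancel()

    if not drafts:
        raise RuntimeError("every fast model failed")

    candidates = "\n\n".join(
        f"Candidate {i + 1}:\n{d}" for i, d in enumerate(drafts)
    )
    # 2nd layer: pick the best draft or return a merged/fixed answer.
    return await judge(
        f"Task:\n{task}\n\n{candidates}\n\n"
        "Pick the best candidate or merge them into one corrected final answer."
    )
```

It would be driven with something like `asyncio.run(race_then_judge(task, models, judge))`. Cancelling the stragglers is what keeps latency tied to the fastest half of the providers rather than the slowest one.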

I have never seen that benchmarked.
Lmk what you think
All the best!
