Commit dde0146

Authored by Wenyueh and a co-author
Update blog post (#45)
* Update blog post:

* Correct model versions in technical deep dive

Updated model references in the discussion on LLM routing systems.

---------

Co-authored-by: Wenyueh <norahua1996@outlook.com>
1 parent a43b0d0 commit dde0146

1 file changed

Lines changed: 2 additions & 2 deletions

docs/blog/posts/technical-deep-dive.md

@@ -19,7 +19,7 @@ categories:

 *\* Equal contribution*

-Most teams pick a model, usually the latest frontier release, and run every step of their agent on it. Planner? GPT-4o. Solver? GPT-4o. Critic? GPT-4o. It works, so nobody questions it.
+Most teams pick a model, usually the latest frontier release, and run every step of their agent on it. Planner? GPT-5.4. Solver? GPT-5.4. Critic? GPT-5.4. It works, so nobody questions it.

 But "it works" is not "it's optimal." What if the same accuracy costs 20x less with a different combination? What if a *weaker* model actually performs *better* at one of those steps? These aren't hypotheticals. We ran the experiments.

@@ -67,7 +67,7 @@ These are real numbers from real benchmarks. Same accuracy band, 20-100x cost di

 ## Agent Routing Is Not LLM Routing

-If you've seen LLM routing systems (the ones that pick GPT-4 for hard questions and GPT-3.5 for easy ones), you might think: "Can't I just do that for each step of my agent?"
+If you've seen LLM routing systems (the ones that pick GPT-5.4 for hard questions and GPT-4o for easy ones), you might think: "Can't I just do that for each step of my agent?"

 No. And here's why.
