Conversation

@Techbert08 (Contributor):
This does three things (I love threes):

  1. Updates the default config to explicitly use 4.1 mini for scoring, making it clearer what's in use.
  2. Extends the LLM interface with a `get_completions` method alongside `get_completion`. The default providers simply route `get_completions` to parallel `get_completion` calls, but the Pi provider instead uses its first-class support for parallel calls. There's some yuckiness around lists of kwargs vs. a single kwargs dict.
  3. Adds `ask_llm_parallel` to expose this functionality to the ranker.

Now it works with a local Pi modelserver and with 4.1 mini. My plan is to merge this (assuming it looks good) and then finish deploying the updated Pi modelserver to Azure.

@chelseacarter29 chelseacarter29 merged commit 62de835 into nlweb-ai:main Jan 5, 2026
2 checks passed
