Immediate focus points #8
Replies: 3 comments 1 reply
- Finetuning! The model now struggles on some cases since I stopped using langextract.
- I wrote eval scripts for two of my implementation branches. Both use the same input prompts but produce different response structures; a rough sketch of the comparison is below.
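For illustration, a minimal sketch of what such a harness could look like, assuming a llama.cpp server on `localhost:8080` and a hypothetical `normalize()` step that maps each branch's response structure onto one comparable shape. The cases, schemas, and endpoint are placeholders, not the actual scripts:

```python
import json
import urllib.request

# Placeholder eval cases: (prompt, expected values). The real sets live in the branches.
CASES = [
    ("Extract all dates: the meeting is on 2024-05-01.", ["2024-05-01"]),
]

# Two response structures for the same input prompts (illustrative shapes).
SCHEMAS = {
    "flat": {
        "type": "object",
        "properties": {"dates": {"type": "array", "items": {"type": "string"}}},
        "required": ["dates"],
    },
    "chunked": {
        "type": "object",
        "properties": {"chunks": {"type": "array", "items": {
            "type": "object",
            "properties": {"dates": {"type": "array", "items": {"type": "string"}}},
        }}},
        "required": ["chunks"],
    },
}

def complete(prompt: str, schema: dict) -> dict:
    """One call to a local llama.cpp server with schema-constrained JSON output."""
    body = json.dumps({"prompt": prompt, "json_schema": schema, "n_predict": 256}).encode()
    req = urllib.request.Request(
        "http://localhost:8080/completion",  # assumed server address
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(json.loads(resp.read())["content"])

def normalize(branch: str, obj: dict) -> list:
    """Flatten each branch's response structure so scores are comparable."""
    if branch == "flat":
        return sorted(obj.get("dates", []))
    return sorted(d for chunk in obj.get("chunks", []) for d in chunk.get("dates", []))

for branch, schema in SCHEMAS.items():
    correct = sum(normalize(branch, complete(p, schema)) == sorted(gold) for p, gold in CASES)
    print(f"{branch}: {correct}/{len(CASES)}")
```

Normalizing before scoring is what keeps the comparison fair: the two branches are judged on the same extracted values, not on how they happen to nest them.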
- Testing with the legacy non-chunking implementation (llama.cpp JSON schema): doubling the context length and adding many worked examples to the system prompt gives amazing results. The score jumped from 13/30 to 24/30, but throughput halved from 500 tokens/s to 250 tokens/s (local inference on an M4 MacBook Air). The sketch below shows the two changes.
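A minimal sketch of those two changes using the llama-cpp-python bindings; the model path, context sizes, and example prompts are assumptions, not the actual setup:

```python
from llama_cpp import Llama

# Assumed model path; n_ctx doubled (e.g. 4096 -> 8192) to make room for the
# longer few-shot system prompt. Actual sizes in the branch may differ.
llm = Llama(model_path="./model.gguf", n_ctx=8192)

# Many worked examples packed into the system prompt (abbreviated placeholder).
SYSTEM = (
    "Extract the requested fields and reply with JSON only.\n"
    "Example input: 'Invoice 42, due 2024-05-01'\n"
    'Example output: {"invoice": 42, "due": "2024-05-01"}\n'
    # ...more worked examples; the longer this gets, the more context it consumes.
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Invoice 7, due 2024-06-15"},
    ],
    # llama-cpp-python also accepts {"type": "json_object", "schema": {...}}
    # to enforce a specific schema via grammar, matching the llama.cpp setup.
    response_format={"type": "json_object"},
)
print(out["choices"][0]["message"]["content"])
```

The throughput drop is consistent with the change itself: a system prompt stuffed with examples means far more prompt tokens to process on every request.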