feat(venice): add additional models to Venice.ai provider #112
georgeglarson wants to merge 1 commit into charmbracelet:main
Conversation
- Add Llama 3.1 models (405B, 70B, 8B)
- Add Deepseek Coder V2 for coding tasks
- Add Qwen 32B and 72B models
- Add Mistral Nemo
- Add Hermes 3 405B

Expands Venice.ai model selection from 5 to 13 models, providing more options for different use cases including coding, reasoning, and cost-effective inference.
Pull request overview
This PR expands the Venice.ai provider model selection from 5 to 13 models, offering users more options for different use cases including coding, reasoning, and cost-effective inference.
- Adds 8 new models: Llama 3.1 variants (405B, 70B, 8B), Deepseek Coder V2, Qwen 32B and 72B, Mistral Nemo, and Hermes 3 405B
- Updates default_small_model_id from mistral-31-24b to llama-3.2-3b
- Adjusts Llama 3.2 3B pricing to be more cost-effective
```json
"cost_per_1m_in": 0.05,
"cost_per_1m_out": 0.05,
```
The pricing for Llama 3.2 3B has been reduced from 0.15/0.6 to 0.05/0.05, a 3x reduction in input cost and a 12x reduction in output cost. Verify this change against Venice.ai's current pricing, as such a substantial decrease could affect cost calculations for users.
Suggested change:
```diff
-"cost_per_1m_in": 0.05,
-"cost_per_1m_out": 0.05,
+"cost_per_1m_in": 0.15,
+"cost_per_1m_out": 0.6,
```
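As a quick sanity check on the ratios discussed above, here is a small Python sketch using the values from the diff; the `cost` helper is hypothetical, not part of Catwalk:

```python
# Values from the diff: old (0.15/0.6) vs proposed (0.05/0.05), $ per 1M tokens.
old_in, old_out = 0.15, 0.6
new_in, new_out = 0.05, 0.05

# round() avoids float noise in the ratio checks
assert round(old_in / new_in) == 3    # 3x cheaper input
assert round(old_out / new_out) == 12  # 12x cheaper output

def cost(tokens_in, tokens_out, per_in, per_out):
    """Dollar cost of one request given per-1M-token prices."""
    return (tokens_in * per_in + tokens_out * per_out) / 1_000_000

# Example: a request with 10k input and 2k output tokens under the new pricing
print(cost(10_000, 2_000, new_in, new_out))
```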
```json
"id": "llama-3.1-8b",
"name": "Llama 3.1 8B",
"cost_per_1m_in": 0.1,
"cost_per_1m_out": 0.1,
"cost_per_1m_in_cached": 0,
"cost_per_1m_out_cached": 0,
"context_window": 128000,
"default_max_tokens": 4096,
"can_reason": true,
```
All newly added Llama 3.1 models (8B, 70B, 405B) have 'can_reason' set to true, but the existing Llama 3.2 3B and Llama 3.3 70B models have it set to false. The basis for this inconsistency is unclear; verify that the reasoning-capability designation is correct across all Llama versions, as Llama 3.1 and 3.2 are closely related model families.
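A lint like the following could flag the mixed flags automatically; this is a hypothetical sketch whose inline model list mirrors only the values visible in this diff, not the full provider config:

```python
# Minimal reproduction of the can_reason values visible in this PR.
models = [
    {"id": "llama-3.1-8b", "can_reason": True},
    {"id": "llama-3.1-70b", "can_reason": True},
    {"id": "llama-3.1-405b", "can_reason": True},
    {"id": "llama-3.2-3b", "can_reason": False},
    {"id": "llama-3.3-70b", "can_reason": False},
]

def inconsistent_family(models, prefix="llama"):
    """Return True when models sharing an id prefix mix can_reason flags."""
    flags = {m["can_reason"] for m in models if m["id"].startswith(prefix)}
    return len(flags) > 1

print(inconsistent_family(models))  # prints True for the list above
```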
Will see if we can get #137 in so it automatically updates all the models daily.
Summary
Expands Venice.ai model selection from 5 to 13 models, providing more options for different use cases.
Changes
- Adds 8 new models: Llama 3.1 (405B, 70B, 8B), Deepseek Coder V2, Qwen 32B and 72B, Mistral Nemo, and Hermes 3 405B
- Updates default_small_model_id from mistral-31-24b to llama-3.2-3b
- Adjusts Llama 3.2 3B pricing to be more cost-effective
Motivation
Venice.ai offers a wide range of models beyond the initial 5 included in Catwalk. This PR adds 8 additional popular models that are commonly used for coding, reasoning, and cost-effective inference.
Testing
Tested with VeniceCode (a Venice.ai-optimized fork of Crush) to ensure all models work correctly with the OpenAI-compatible API.
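A smoke test along these lines could exercise each model against the OpenAI-compatible endpoint. This sketch only builds the request payload; the base URL is an assumption and sending the request would require a Venice.ai API key:

```python
import json

BASE_URL = "https://api.venice.ai/api/v1"  # assumed Venice.ai base URL

def build_request(model_id, prompt, max_tokens=16):
    """Build an OpenAI-style chat completion payload for one model."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# One short request per model id would be enough for a smoke test.
payload = build_request("llama-3.1-8b", "Say hello")
print(json.dumps(payload, indent=2))
```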
Related