@hkd987 hkd987 commented Jan 6, 2026

Add LlamaGate Model Provider

Adds LlamaGate as a new model provider plugin for Dify.

Provider Details

- API: OpenAI-compatible (https://api.llamagate.dev/v1)
- Auth: Bearer token via the `api_key` credential
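
Since the API is OpenAI-compatible with bearer-token auth, a request can be assembled as below. This is an illustrative sketch only: the helper name is hypothetical, and only the base URL and auth scheme come from this PR.

```python
def build_chat_request(api_key: str, model: str, messages: list) -> dict:
    """Assemble the pieces of a chat-completion request against
    LlamaGate's OpenAI-compatible endpoint (helper is illustrative)."""
    return {
        "url": "https://api.llamagate.dev/v1/chat/completions",
        "headers": {
            # Bearer token auth via the api_key credential, per the PR.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": messages},
    }
```

Any OpenAI-style client pointed at that base URL with the same header should work equivalently.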

Models Included

LLM Models (12):

- Llama 3.1 8B Instruct, Llama 3.2 3B
- DeepSeek R1 8B, DeepSeek R1 Distill Qwen 7B
- Qwen 3 8B, Mistral 7B v0.3
- Qwen 2.5 Coder 7B, CodeLlama 7B, DeepSeek Coder 6.7B
- Qwen 3 VL 8B (Vision), OpenThinker 7B, Dolphin 3 8B

Embedding Models (2):

- Nomic Embed Text
- Qwen 3 Embedding 8B

Features

- Competitive pricing ($0.02-$0.55 per 1M tokens)
- All models are open-weights (Apache 2.0, MIT, Llama, and Gemma licenses)
- Extends the `OAICompatLargeLanguageModel` and `OAICompatEmbeddingModel` base classes
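
The delegation pattern behind that last bullet can be sketched with stand-in classes. Note the base class below is a stub, not the real `dify_plugin` API (whose import paths and signatures may differ); it only illustrates inheriting the OpenAI-compatible behavior and pinning the provider endpoint before calling `super()`.

```python
class OAICompatLargeLanguageModel:
    """Stand-in for the dify_plugin OpenAI-compatible LLM base class."""

    def _invoke(self, model, credentials, prompt_messages, **kwargs):
        # The real base class issues an HTTP request; this stub just
        # echoes the resolved endpoint so the delegation is visible.
        return f"POST {credentials['endpoint_url']}/chat/completions ({model})"


class LlamaGateLargeLanguageModel(OAICompatLargeLanguageModel):
    def _invoke(self, model, credentials, prompt_messages, **kwargs):
        creds = dict(credentials)  # avoid mutating the caller's dict
        creds.setdefault("endpoint_url", "https://api.llamagate.dev/v1")
        return super()._invoke(model, creds, prompt_messages, **kwargs)
```

Because all request handling lives in the base class, the provider subclass stays small: it mostly supplies the endpoint and credentials.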

Checklist

- [x] Other Changes (Add New Models, Fix Model Parameters, etc.)
- [x] I have ensured that `dify_plugin` is in `requirements.txt`

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 6, 2026
@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @hkd987, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Dify's model offerings by introducing LlamaGate as a new, fully integrated provider. This addition broadens the selection of available open-source Large Language Models and embedding models, all accessible through a familiar OpenAI-compatible API. The integration aims to provide Dify users with more diverse, performant, and cost-effective options for their AI applications, backed by LlamaGate's competitive pricing and open-weight model philosophy.

Highlights

  • New Model Provider: Integrates LlamaGate as a new model provider plugin for Dify, offering an OpenAI-compatible API for a wide range of open-source LLMs and embedding models.
  • Extensive Model Support: Adds support for 12 Large Language Models, including Llama 3.1/3.2, DeepSeek R1, Qwen, Mistral, CodeLlama, and OpenThinker, along with 2 embedding models: Nomic Embed Text and Qwen 3 Embedding 8B.
  • API Compatibility and Authentication: Leverages an OpenAI-compatible API (`https://api.llamagate.dev/v1`) and uses bearer token authentication via an `api_key` credential for secure access.
  • Cost-Effective and Open-Weight: Features competitive pricing ranging from $0.02 to $0.55 per 1 million tokens, with all integrated models being open-weight, providing flexible and economical options.
  • Dify Plugin Integration: Extends Dify's `OAICompatLargeLanguageModel` and `OAICompatEmbeddingModel` base classes, ensuring seamless integration and functionality within the Dify ecosystem.


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the enhancement New feature or request label Jan 6, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces LlamaGate as a new model provider, which is a great addition. The implementation correctly leverages the OpenAI-compatible base classes, simplifying the integration. However, I've identified a few critical issues where method parameters are not correctly passed to the superclass methods, which could lead to incorrect behavior or loss of functionality. Additionally, there are some opportunities for improvement in configuration and exception handling to enhance maintainability and correctness. Please see the detailed comments below.

- Add missing `user` parameter to LLM _invoke() call
- Add missing `input_type` parameter to text embedding _invoke() call
- Add 4 missing models to _position.yaml (codellama-7b, deepseek-coder-6.7b, openthinker-7b, dolphin3-8b)
- Replace `raise ex` with bare `raise` to preserve stack traces

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
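
The three code fixes listed above can be sketched with a stand-in base class (assumption: these are not the real `dify_plugin` signatures, only the shape of the fix): forward `user` and `input_type` to the superclass instead of dropping them, and re-raise with a bare `raise` so the original traceback survives.

```python
class _OAICompatEmbeddingBase:
    """Stand-in for the OpenAI-compatible embedding base class."""

    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        return {"model": model, "user": user, "input_type": input_type}


class LlamaGateTextEmbeddingModel(_OAICompatEmbeddingBase):
    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        try:
            # Before the fix, `user` and `input_type` were omitted here,
            # so the superclass silently lost them.
            return super()._invoke(model, credentials, texts,
                                   user=user, input_type=input_type)
        except Exception:
            # `raise ex` would reset the traceback origin; a bare `raise`
            # re-raises the active exception with its stack intact.
            raise
```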
@@ -0,0 +1,37 @@
author: llamagate
A repository Member commented:
Author should be langgenius if you want to submit plugin to this repo.

@crazywoola (Member) left a comment:

See comments
