@hkd987 hkd987 commented Jan 6, 2026

Add LlamaGate Model Provider

Adds LlamaGate as a new model provider plugin for Dify.

Provider Details

- API: OpenAI-compatible (https://api.llamagate.dev/v1)
- Auth: Bearer token via the `api_key` credential
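
Since the API is OpenAI-compatible with bearer-token auth, a request can be assembled as below. This is an illustrative sketch only: the helper name is hypothetical, and only the base URL and auth scheme come from this PR.

```python
def build_chat_request(api_key: str, model: str, messages: list) -> dict:
    """Assemble the pieces of a chat-completion request against
    LlamaGate's OpenAI-compatible endpoint (helper is illustrative)."""
    return {
        "url": "https://api.llamagate.dev/v1/chat/completions",
        "headers": {
            # Bearer token auth via the api_key credential, per the PR.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": messages},
    }
```

Any OpenAI-style client pointed at that base URL with the same header should work equivalently.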

Models Included

LLM Models (12):

- Llama 3.1 8B Instruct, Llama 3.2 3B
- DeepSeek R1 8B, DeepSeek R1 Distill Qwen 7B
- Qwen 3 8B, Mistral 7B v0.3
- Qwen 2.5 Coder 7B, CodeLlama 7B, DeepSeek Coder 6.7B
- Qwen 3 VL 8B (Vision), OpenThinker 7B, Dolphin 3 8B

Embedding Models (2):

- Nomic Embed Text
- Qwen 3 Embedding 8B

Features

- Competitive pricing ($0.02-$0.55 per 1M tokens)
- All models are open-weights (Apache 2.0, MIT, Llama, and Gemma licenses)
- Extends the `OAICompatLargeLanguageModel` and `OAICompatEmbeddingModel` base classes
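
The delegation pattern behind that last bullet can be sketched with stand-in classes. Note the base class below is a stub, not the real `dify_plugin` API (whose import paths and signatures may differ); it only illustrates inheriting the OpenAI-compatible behavior and pinning the provider endpoint before calling `super()`.

```python
class OAICompatLargeLanguageModel:
    """Stand-in for the dify_plugin OpenAI-compatible LLM base class."""

    def _invoke(self, model, credentials, prompt_messages, **kwargs):
        # The real base class issues an HTTP request; this stub just
        # echoes the resolved endpoint so the delegation is visible.
        return f"POST {credentials['endpoint_url']}/chat/completions ({model})"


class LlamaGateLargeLanguageModel(OAICompatLargeLanguageModel):
    def _invoke(self, model, credentials, prompt_messages, **kwargs):
        creds = dict(credentials)  # avoid mutating the caller's dict
        creds.setdefault("endpoint_url", "https://api.llamagate.dev/v1")
        return super()._invoke(model, creds, prompt_messages, **kwargs)
```

Because all request handling lives in the base class, the provider subclass stays small: it mostly supplies the endpoint and credentials.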

Checklist

- [x] Other Changes (Add New Models, Fix Model Parameters, etc.)
- [x] I have ensured that `dify_plugin` is in `requirements.txt`

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 6, 2026
@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @hkd987, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Dify's model offerings by introducing LlamaGate as a new, fully integrated provider. This addition broadens the selection of available open-source Large Language Models and embedding models, all accessible through a familiar OpenAI-compatible API. The integration aims to provide Dify users with more diverse, performant, and cost-effective options for their AI applications, backed by LlamaGate's competitive pricing and open-weight model philosophy.

Highlights

  • New Model Provider: Integrates LlamaGate as a new model provider plugin for Dify, offering an OpenAI-compatible API for a wide range of open-source LLMs and embedding models.
  • Extensive Model Support: Adds support for 12 Large Language Models, including Llama 3.1/3.2, DeepSeek R1, Qwen, Mistral, CodeLlama, and OpenThinker, along with 2 embedding models: Nomic Embed Text and Qwen 3 Embedding 8B.
  • API Compatibility and Authentication: Leverages an OpenAI-compatible API (`https://api.llamagate.dev/v1`) and uses bearer token authentication via an `api_key` credential for secure access.
  • Cost-Effective and Open-Weight: Features competitive pricing ranging from $0.02 to $0.55 per 1 million tokens, with all integrated models being open-weight, providing flexible and economical options.
  • Dify Plugin Integration: Extends Dify's `OAICompatLargeLanguageModel` and `OAICompatEmbeddingModel` base classes, ensuring seamless integration and functionality within the Dify ecosystem.


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the enhancement New feature or request label Jan 6, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces LlamaGate as a new model provider, which is a great addition. The implementation correctly leverages the OpenAI-compatible base classes, simplifying the integration. However, I've identified a few critical issues where method parameters are not correctly passed to the superclass methods, which could lead to incorrect behavior or loss of functionality. Additionally, there are some opportunities for improvement in configuration and exception handling to enhance maintainability and correctness. Please see the detailed comments below.

- Add missing `user` parameter to LLM _invoke() call
- Add missing `input_type` parameter to text embedding _invoke() call
- Add 4 missing models to _position.yaml (codellama-7b, deepseek-coder-6.7b, openthinker-7b, dolphin3-8b)
- Replace `raise ex` with bare `raise` to preserve stack traces

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
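
The three code fixes listed above can be sketched with a stand-in base class (assumption: these are not the real `dify_plugin` signatures, only the shape of the fix): forward `user` and `input_type` to the superclass instead of dropping them, and re-raise with a bare `raise` so the original traceback survives.

```python
class _OAICompatEmbeddingBase:
    """Stand-in for the OpenAI-compatible embedding base class."""

    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        return {"model": model, "user": user, "input_type": input_type}


class LlamaGateTextEmbeddingModel(_OAICompatEmbeddingBase):
    def _invoke(self, model, credentials, texts, user=None, input_type=None):
        try:
            # Before the fix, `user` and `input_type` were omitted here,
            # so the superclass silently lost them.
            return super()._invoke(model, credentials, texts,
                                   user=user, input_type=input_type)
        except Exception:
            # `raise ex` would reset the traceback origin; a bare `raise`
            # re-raises the active exception with its stack intact.
            raise
```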
@@ -0,0 +1,37 @@
author: llamagate
A repository Member commented:
Author should be langgenius if you want to submit plugin to this repo.

@crazywoola (Member) left a comment:

See comments
