[Model Request] Add support for Kimi-K2 model family #219

@KarlXMagnusson

Description

As per the recommendation from @neilmehta24 in issue #29, I'm creating this new issue to formally request and track support for the Kimi-K2 model family. The model of particular interest to me is the MLX conversion:

mlx-community/Kimi-K2-Instruct-0905-mlx-3bit
Text Generation • 1T parameters

When attempting to load models from this family in LM Studio, the load fails with a ValueError asking for `trust_remote_code=True`. This is a highly useful model, and adding native support would be a valuable addition to mlx-engine.

2025-09-10 19:21:12 [DEBUG]
 ValueError: The repository /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit contains custom code which must be executed to correctly load the model. You can inspect the repository content at /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
 You can inspect the repository content at https://hf.co//Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

At:
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/framework-unified-mac-arm64@19/lib/python3.11/site-packages/transformers/dynamic_module_utils.py(742): resolve_trust_remote_code
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/framework-unified-mac-arm64@19/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py(1091): from_pretrained
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_lm/tokenizer_utils.py(451): load_tokenizer
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_lm/utils.py(270): load
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(90): _full_model_init
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(122): __init__
  /Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/generate.py(98): load_model
2025-09-10 19:21:12 [DEBUG]
 lmstudio-llama-cpp: failed to load model. Error: Error when loading model: ValueError: The repository /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit contains custom code which must be executed to correctly load the model. You can inspect the repository content at /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
 You can inspect the repository content at https://hf.co//Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
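For context on the failing call, mlx_lm's `load()` accepts a `tokenizer_config` dict that is forwarded to the transformers tokenizer, which is exactly where the ValueError above wants `trust_remote_code=True`. A minimal sketch of loading this model directly with mlx_lm (the local path is taken from the log above; that `apply_chat_template` works with the Kimi custom tokenizer is an assumption based on the server run below):

```python
from mlx_lm import load, generate

MODEL_PATH = "/Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit"

# tokenizer_config is passed through to the transformers tokenizer,
# so trust_remote_code reaches the call that currently raises.
model, tokenizer = load(MODEL_PATH, tokenizer_config={"trust_remote_code": True})

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How many Rs in strawberry."},
]
# Assumes the repo ships a chat template the tokenizer can apply.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=64))
```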

For reference, the model works fine with mlx_lm.

SERVER:

tage@Mac-Studio ~ % mlx_lm.server --model kimi-model --host 0.0.0.0 --port 8080 --max-tokens 262144 --use-default-chat-template --trust-remote-code
2025-09-10 19:34:46,286 - INFO - Reloaded tiktoken model from kimi-model/tiktoken.model
2025-09-10 19:34:46,287 - INFO - #words: 163842 - BOS ID: 163584 - EOS ID: 163585
/Users/tage/Library/Python/3.13/lib/python/site-packages/mlx_lm/server.py:986: UserWarning: mlx_lm.server is not recommended for production as it only implements basic security checks.
  warnings.warn(
2025-09-10 19:34:46,647 - INFO - Starting httpd at 0.0.0.0 on port 8080...
192.168.0.1 - - [10/Sep/2025 19:37:41] "POST /v1/chat/completions HTTP/1.1" 200 -
[WARNING] Generating with a model that requires 446707 MB which is close to the maximum recommended size of 475136 MB. This can be slow. See the documentation for possible work-arounds: https://github.com/ml-explore/mlx-lm/tree/main#large-models
2025-09-10 19:37:52,960 - INFO - Prompt processing progress: 0/24
2025-09-10 19:37:54,148 - INFO - Prompt processing progress: 23/24
2025-09-10 19:37:54,404 - INFO - Prompt processing progress: 24/24

CLIENT:

tage@MacBookProM4Max ~ % curl -N http://Mac-Studio.local:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "kimi-model",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How many Rs in strawberry."}
  ],
  "stream": false
}'
{"id": "chatcmpl-0ac355bc-7699-4e94-a5d4-604025e45c1b", "system_fingerprint": "0.27.1-0.29.0-macOS-15.6.1-arm64-arm-64bit-Mach-O-applegpu_g15d", "object": "chat.completion", "model": "kimi-model", "created": 1757525861, "choices": [{"index": 0, "finish_reason": "stop", "logprobs": {"token_logprobs": [-0.125, 0.0, 0.0, -0.625, -0.25, -1.0, -0.5, 0.0, -0.5, -0.25, 0.0, -0.25, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], "top_logprobs": [], "tokens": [5472, 554, 3465, 53890, 381, 10812, 3465, 49, 381, 82, 306, 276, 6268, 3465, 1, 731, 1039, 18038, 5046, 381, 163586]}, "message": {"role": "assistant", "content": "There are **three** letter **R**s in the word **\"strawberry.\"**", "tool_calls": []}}], "usage": {"prompt_tokens": 24, "completion_tokens": 21, "total_tokens": 45}}%        

SYSTEM:

LM Studio 0.3.25 (Build 2)

Mac Studio 2025
Chip Apple M3 Ultra
Memory 512 GB
macOS 15.5 (24F74)

Hardware Overview:
  Model Name:	Mac Studio
  Model Identifier:	Mac15,14
  Chip:	Apple M3 Ultra
  Total Number of Cores:	32 (24 performance and 8 efficiency)
  Memory:	512 GB
  System Firmware Version:	11881.121.1
  OS Loader Version:	11881.121.1

System Software Overview:
  System Version:	macOS 15.5 (24F74)
  Kernel Version:	Darwin 24.5.0
  Boot Volume:	Macintosh HD
  Boot Mode:	Normal
  Secure Virtual Memory:	Enabled
  System Integrity Protection:	Enabled

Similar Models:
The issue has been observed with the following models:

  • mlx-community/moonshotai_Kimi-K2-Instruct-mlx
  • inferencerlabs/Kimi-K2-Instruct-MLX-3.985bit
  • Kimi-K2-Instruct-0905-mlx-DQ3_K_M (as reported by user xain in the previous thread)

Edit, quoting @Dayleyfocus:

"Please make it so there is a way to override it. If I want to do an advanced override and I pick an unsafe model, that's on me."
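That kind of opt-in could be as small as threading one flag through the load path. A rough, hypothetical sketch; the function name and parameter below are illustrative only, not the actual mlx_engine API:

```python
# Hypothetical sketch of an explicit opt-in, not real mlx_engine code.
from mlx_lm import load


def load_with_optional_remote_code(model_path: str, allow_remote_code: bool = False):
    """Load a model, forwarding trust_remote_code only when the user opts in."""
    if not allow_remote_code:
        # Default path: behave exactly as today and refuse custom code.
        return load(model_path)
    # Advanced override: the user explicitly accepts running the
    # repository's custom tokenizer/model code.
    return load(model_path, tokenizer_config={"trust_remote_code": True})
```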
