Open
Description
As recommended by @neilmehta24 in issue #29, I'm creating this new issue to formally request and track support for the Kimi-K2 model family. In my case, the MLX-supported model I am trying to run is:
mlx-community/Kimi-K2-Instruct-0905-mlx-3bit
Text Generation • 1T parameters
When attempting to load models from this family in LM Studio, loading fails with a ValueError requiring trust_remote_code=True. According to the traceback below, the error is raised while mlx_lm loads the tokenizer through transformers. This is a highly useful model, and adding native support would be a valuable addition to mlx-engine.
2025-09-10 19:21:12 [DEBUG]
ValueError: The repository /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit contains custom code which must be executed to correctly load the model. You can inspect the repository content at /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
You can inspect the repository content at https://hf.co//Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
At:
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/framework-unified-mac-arm64@19/lib/python3.11/site-packages/transformers/dynamic_module_utils.py(742): resolve_trust_remote_code
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/framework-unified-mac-arm64@19/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py(1091): from_pretrained
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_lm/tokenizer_utils.py(451): load_tokenizer
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_lm/utils.py(270): load
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(90): _full_model_init
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(122): __init__
/Users/tage/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac-arm64@62/lib/python3.11/site-packages/mlx_engine/generate.py(98): load_model
2025-09-10 19:21:12 [DEBUG]
lmstudio-llama-cpp: failed to load model. Error: Error when loading model: ValueError: The repository /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit contains custom code which must be executed to correctly load the model. You can inspect the repository content at /Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
You can inspect the repository content at https://hf.co//Users/tage/.lmstudio/models/mlx-community/Kimi-K2-Instruct-0905-mlx-3bit.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
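To illustrate the kind of opt-in override being requested, here is a minimal sketch in Python. It assumes the `tokenizer_config` argument of `mlx_lm.load`, which is forwarded to the tokenizer loader. The helper names `build_tokenizer_config` and `load_with_override` are hypothetical, and this is not how mlx-engine currently loads models.

```python
from typing import Any


def build_tokenizer_config(allow_remote_code: bool) -> dict[str, Any]:
    """Return tokenizer kwargs, opting in to custom code only on request.

    Hypothetical helper: the default stays safe (no remote code).
    """
    return {"trust_remote_code": True} if allow_remote_code else {}


def load_with_override(model_path: str, allow_remote_code: bool = False):
    """Load a model and tokenizer, optionally trusting repository code.

    Hypothetical wrapper around mlx_lm.load. The import is deferred so
    that build_tokenizer_config can be used without mlx_lm installed.
    """
    from mlx_lm import load

    return load(
        model_path,
        tokenizer_config=build_tokenizer_config(allow_remote_code),
    )
```

With a wrapper like this, the user-facing setting would only need to flip `allow_remote_code`, and the default behaviour would remain unchanged for every other model.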
For reference, the model loads and generates correctly with mlx_lm when --trust-remote-code is passed.
SERVER
tage@Mac-Studio ~ % mlx_lm.server --model kimi-model --host 0.0.0.0 --port 8080 --max-tokens 262144 --use-default-chat-template --trust-remote-code
2025-09-10 19:34:46,286 - INFO - Reloaded tiktoken model from kimi-model/tiktoken.model
2025-09-10 19:34:46,287 - INFO - #words: 163842 - BOS ID: 163584 - EOS ID: 163585
/Users/tage/Library/Python/3.13/lib/python/site-packages/mlx_lm/server.py:986: UserWarning: mlx_lm.server is not recommended for production as it only implements basic security checks.
warnings.warn(
2025-09-10 19:34:46,647 - INFO - Starting httpd at 0.0.0.0 on port 8080...
192.168.0.1 - - [10/Sep/2025 19:37:41] "POST /v1/chat/completions HTTP/1.1" 200 -
[WARNING] Generating with a model that requires 446707 MB which is close to the maximum recommended size of 475136 MB. This can be slow. See the documentation for possible work-arounds: https://github.com/ml-explore/mlx-lm/tree/main#large-models
2025-09-10 19:37:52,960 - INFO - Prompt processing progress: 0/24
2025-09-10 19:37:54,148 - INFO - Prompt processing progress: 23/24
2025-09-10 19:37:54,404 - INFO - Prompt processing progress: 24/24
CLIENT
tage@MacBookProM4Max ~ % curl -N http://Mac-Studio.local:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-model",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "How many Rs in strawberry."}
],
"stream": false
}'
{"id": "chatcmpl-0ac355bc-7699-4e94-a5d4-604025e45c1b", "system_fingerprint": "0.27.1-0.29.0-macOS-15.6.1-arm64-arm-64bit-Mach-O-applegpu_g15d", "object": "chat.completion", "model": "kimi-model", "created": 1757525861, "choices": [{"index": 0, "finish_reason": "stop", "logprobs": {"token_logprobs": [-0.125, 0.0, 0.0, -0.625, -0.25, -1.0, -0.5, 0.0, -0.5, -0.25, 0.0, -0.25, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], "top_logprobs": [], "tokens": [5472, 554, 3465, 53890, 381, 10812, 3465, 49, 381, 82, 306, 276, 6268, 3465, 1, 731, 1039, 18038, 5046, 381, 163586]}, "message": {"role": "assistant", "content": "There are **three** letter **R**s in the word **\"strawberry.\"**", "tool_calls": []}}], "usage": {"prompt_tokens": 24, "completion_tokens": 21, "total_tokens": 45}}%
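For anyone who wants to reproduce the client side without curl, the same request can be sketched in Python using only the standard library. The host, port, and model name are taken from the commands above. The function names `build_chat_request` and `ask` are hypothetical, and the long timeout is an assumption for a model of this size.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, model: str, user_prompt: str, timeout: float = 600.0) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, model, user_prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires the mlx_lm.server instance shown above to be running.
    print(ask("http://Mac-Studio.local:8080", "kimi-model", "How many Rs in strawberry."))
```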
SYSTEM
LM Studio 0.3.25 (Build 2)
Mac Studio 2025
Chip Apple M3 Ultra
Memory 512 GB
macOS 15.5 (24F74)
Hardware Overview:
Model Name: Mac Studio
Model Identifier: Mac15,14
Chip: Apple M3 Ultra
Total Number of Cores: 32 (24 performance and 8 efficiency)
Memory: 512 GB
System Firmware Version: 11881.121.1
OS Loader Version: 11881.121.1
System Software Overview:
System Version: macOS 15.5 (24F74)
Kernel Version: Darwin 24.5.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Secure Virtual Memory: Enabled
System Integrity Protection: Enabled
Similar Models:
The issue has been observed with the following models:
- mlx-community/moonshotai_Kimi-K2-Instruct-mlx
- inferencerlabs/Kimi-K2-Instruct-MLX-3.985bit
- Kimi-K2-Instruct-0905-mlx-DQ3_K_M (as reported by user xain in the previous thread)
Edit: Quoting @Dayleyfocus:
Please make it so there is a way to override it. If I want to do an advanced override and I pick an unsafe model, that's on me.