A high-performance Go runtime for DeepSeek-V3 models.
Disclaimer: This project is an independent implementation and is not affiliated with, endorsed by, or sponsored by DeepSeek. DeepSeek is a trademark of its respective owner. This project is a Go rewrite of software related to DeepSeek models.
Original work: Copyright (c) 2023 DeepSeek
Modifications: Rewritten from Python to Go, 2026.
Forked from: DeepSeek-V3 / README.md
GoSeek-V3 is a Go reimplementation of the inference engine for DeepSeek-V3, a powerful Mixture-of-Experts (MoE) language model with 671B total parameters (37B activated per token).
This project provides a DeepSeek-compatible inference runtime written entirely in Go, targeting production deployments that benefit from Go's concurrency model, low memory overhead, and static compilation.
GoSeek-V3 preserves all capabilities of the original model:
- Multi-head Latent Attention (MLA) for efficient inference
- DeepSeekMoE architecture for cost-effective computation
- Auxiliary-loss-free load balancing for stable training
- Multi-Token Prediction (MTP) objective for stronger performance and speculative decoding
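MLA's inference saving comes from caching one shared, compressed latent vector per token instead of full per-head keys and values. A back-of-envelope sketch in Go of the per-token, per-layer cache sizes; the dimensions and function names below are illustrative assumptions, not the exact V3 configuration:

```go
package main

import "fmt"

// mhaCache returns per-token, per-layer KV-cache bytes for standard
// multi-head attention: keys + values across all heads.
func mhaCache(nHeads, headDim, bytesPerElem int) int {
	return 2 * nHeads * headDim * bytesPerElem
}

// mlaCache returns the MLA equivalent: one compressed latent vector
// shared by all heads, plus a small decoupled RoPE key.
func mlaCache(latentDim, ropeDim, bytesPerElem int) int {
	return (latentDim + ropeDim) * bytesPerElem
}

func main() {
	const bytesPerElem = 2 // BF16
	mha := mhaCache(128, 128, bytesPerElem) // illustrative head count/dim
	mla := mlaCache(512, 64, bytesPerElem)  // illustrative latent/RoPE dims
	fmt.Println(mha, mla, mha/mla)          // 65536 1152 56
}
```

With these illustrative dimensions the latent cache is roughly 50x smaller, which is what makes long-context decoding memory-feasible.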
The underlying model was pre-trained on 14.8 trillion diverse, high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages. Reasoning capabilities were further enhanced via knowledge distillation from the DeepSeek-R1 series.
This project focuses on the inference runtime. Model weights are sourced from the original DeepSeek-V3 release and are subject to the DeepSeek Model License.
| Property | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 671B |
| Activated Parameters (per token) | 37B |
| Context Length | 128K tokens |
| Attention | Multi-head Latent Attention (MLA) |
Load Balancing: An auxiliary-loss-free strategy minimizes performance degradation while encouraging balanced expert utilization.
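The routing side of this strategy can be sketched as follows: each expert carries a bias term that is added to its affinity score only when choosing the top-k experts, and the bias is nudged up or down depending on whether the expert was under- or overloaded. The step size, function names, and data shapes here are illustrative assumptions, not this repository's API:

```go
package main

import (
	"fmt"
	"sort"
)

// topKWithBias selects k experts by affinity score plus a per-expert
// bias. The bias influences routing only; gating weights would still
// be computed from the raw scores.
func topKWithBias(scores, bias []float64, k int) []int {
	idx := make([]int, len(scores))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool {
		return scores[idx[a]]+bias[idx[a]] > scores[idx[b]]+bias[idx[b]]
	})
	return idx[:k]
}

// updateBias lowers the bias of overloaded experts and raises the
// bias of underloaded ones by a fixed step gamma.
func updateBias(bias []float64, load []int, meanLoad, gamma float64) {
	for i, l := range load {
		switch {
		case float64(l) > meanLoad:
			bias[i] -= gamma
		case float64(l) < meanLoad:
			bias[i] += gamma
		}
	}
}

func main() {
	scores := []float64{0.9, 0.1, 0.5, 0.4}
	bias := []float64{-0.6, 0.0, 0.0, 0.0} // expert 0 recently overloaded
	fmt.Println(topKWithBias(scores, bias, 2)) // [2 3]: expert 0 demoted by its bias
	updateBias(bias, []int{0, 0, 1, 1}, 0.5, 0.01)
}
```

Because no auxiliary loss term is added to the training objective, balancing pressure never competes with the language-modeling gradient.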
Multi-Token Prediction: MTP improves model performance and enables speculative decoding for faster inference.
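The speculative use of MTP follows the standard accept/reject loop: the MTP head drafts several tokens ahead, the main model verifies them in one pass, and the longest matching prefix is kept along with the main model's first correction. A greedy-variant sketch (function names are illustrative, not a released API):

```go
package main

import "fmt"

// acceptDraft compares MTP draft tokens against the main model's
// verified tokens and returns the longest accepted prefix plus the
// first corrected token (greedy speculative-decoding accept rule).
func acceptDraft(draft, verified []int) []int {
	accepted := []int{}
	for i := 0; i < len(draft) && i < len(verified); i++ {
		if draft[i] != verified[i] {
			// First mismatch: keep the main model's token and stop.
			accepted = append(accepted, verified[i])
			return accepted
		}
		accepted = append(accepted, draft[i])
	}
	return accepted
}

func main() {
	draft := []int{42, 7, 99, 3}    // proposed by the MTP head
	verified := []int{42, 7, 13, 5} // main model's greedy choices
	fmt.Println(acceptDraft(draft, verified)) // [42 7 13]
}
```

Every accepted draft token saves one sequential forward pass, so throughput gains scale with how often the MTP head agrees with the main model.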
FP8 Training: The model was trained using an FP8 mixed precision framework, validating FP8 effectiveness at extreme scale. FP8 weights are provided natively; a conversion script for BF16 is available (see How to Run Locally).
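For intuition about the FP8 format involved, here is a reference decoder for a single FP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7). It is a sketch for illustration only; a real converter would vectorize this and also apply the per-block scaling factors stored alongside the weights:

```go
package main

import (
	"fmt"
	"math"
)

// fp8E4M3ToFloat32 decodes one FP8 E4M3 byte to float32.
// E4M3 has no infinities; only exponent=15, mantissa=7 encodes NaN.
func fp8E4M3ToFloat32(b uint8) float32 {
	sign := float32(1)
	if b&0x80 != 0 {
		sign = -1
	}
	exp := int((b >> 3) & 0x0F)
	mant := int(b & 0x07)
	if exp == 0x0F && mant == 0x07 { // reserved NaN pattern
		return float32(math.NaN())
	}
	if exp == 0 { // subnormal: mant/8 * 2^-6
		return sign * float32(mant) / 8 * float32(math.Pow(2, -6))
	}
	return sign * (1 + float32(mant)/8) * float32(math.Pow(2, float64(exp-7)))
}

func main() {
	fmt.Println(fp8E4M3ToFloat32(0x38)) // exp=7, mant=0 -> 1
	fmt.Println(fp8E4M3ToFloat32(0xC0)) // sign bit set, exp=8 -> -2
}
```

Upconverting to BF16 is lossless in value but doubles the weight footprint, which is why the FP8 checkpoints are the native distribution format.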
Knowledge Distillation: Reasoning patterns from DeepSeek-R1's long Chain-of-Thought are distilled into DeepSeek-V3, improving its reasoning while preserving output style and length control.
| Model | Total Params | Activated Params | Context Length | Download |
|---|---|---|---|---|
| DeepSeek-V3-Base | 671B | 37B | 128K | 🤗 Hugging Face |
| DeepSeek-V3 | 671B | 37B | 128K | 🤗 Hugging Face |
Note: The total size on Hugging Face is 685B, which includes 671B of main model weights and 14B of Multi-Token Prediction (MTP) module weights.
This code repository is licensed under the MIT License.
The use of DeepSeek-V3 Base/Chat model weights is subject to the DeepSeek Model License. DeepSeek-V3 (Base and Chat) supports commercial use.
This project (GoSeek-V3) is an independent Go reimplementation and is not affiliated with or endorsed by DeepSeek. The DeepSeek Model License applies to the model weights and derivatives.
For questions about this Go runtime, please open an issue in this repository or write to leycm@proton.me.
For questions about the underlying DeepSeek-V3 model, contact the original authors at their repository or at service@deepseek.com.