Go implementation of DKSplit - fast word segmentation for text without spaces.
Built with a BiLSTM-CRF model and ONNX Runtime.

| CPU | Mode | QPS |
|---|---|---|
| Intel Core i9-14900K | Single | ~1,700/s |
| Intel Core i9-14900K | Batch | ~7,000/s |
| Intel Core i9-9900K | Single | ~1,000/s |
| Intel Core i9-9900K | Batch | ~3,000/s |
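Batch mode is roughly 3x to 4x faster than single mode on the CPUs above (about 4.1x on the i9-14900K and 3x on the i9-9900K, going by the rounded figures in the table).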

Compared to the Python version:
- Single: 2.7x faster
- Batch: 5.6x faster
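
The QPS figures above are throughput numbers: inputs processed divided by elapsed wall-clock time. The following is a minimal sketch of how such a figure could be measured, not the project's actual benchmark harness. The `measureQPS` helper is hypothetical, and the sleeping callback is a stand-in for a real call to the splitter.

```go
package main

import (
	"fmt"
	"time"
)

// measureQPS is a hypothetical helper: it calls fn once per input and
// returns the number of inputs processed per second of wall-clock time.
func measureQPS(inputs []string, fn func(string)) float64 {
	if len(inputs) == 0 {
		return 0
	}
	start := time.Now()
	for _, in := range inputs {
		fn(in)
	}
	elapsed := time.Since(start).Seconds()
	if elapsed == 0 {
		elapsed = 1e-9 // guard against a zero reading on coarse clocks
	}
	return float64(len(inputs)) / elapsed
}

func main() {
	inputs := []string{"chatgptlogin", "openaikey", "microsoftoffice"}
	// Stand-in workload; in practice fn would call the splitter.
	qps := measureQPS(inputs, func(s string) { time.Sleep(time.Millisecond) })
	fmt.Printf("%.0f items/s\n", qps)
}
```

For stable numbers, a real measurement would typically warm up the model first and use a much larger input set than this three-item sample.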

Install:

go get github.com/ABTdomain/dksplit-go

Usage:

package main

import (
	"fmt"
	"log"

	dksplit "github.com/ABTdomain/dksplit-go"
)

func main() {
	// Load the model files from the "models" directory.
	splitter, err := dksplit.New("models")
	if err != nil {
		log.Fatal(err)
	}
	defer splitter.Close()

	// Single input
	result, err := splitter.Split("chatgptlogin")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)
	// Output: [chatgpt login]

	// Batch of inputs
	results, err := splitter.SplitBatch([]string{"openaikey", "microsoftoffice"}, 256)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(results)
	// Output: [[openai key] [microsoft office]]
}

Examples:

| Input | Output |
|---|---|
| chatgptlogin | chatgpt login |
| kubernetescluster | kubernetes cluster |
| microsoftoffice | microsoft office |
| mercibeaucoup | merci beaucoup |
| gutenmorgen | guten morgen |

Tested on Majestic Million domains:

| Input | Output |
|---|---|
| amitriptylineinfo | amitriptyline info |
| autoriteprotectiondonnees | autorite protection donnees |
| mountaingoatsoftware | mountain goat software |
| psychologytoday | psychology today |
| affordablecollegesonline | affordable colleges online |
| stephenwolfram | stephen wolfram |
| ralphlauren | ralphlauren |
| m12ivermectin | m12i vermectin |

Run the benchmark yourself:

wget https://downloads.majestic.com/majestic_million.csv -O top-1m.csv
go test -v -run TestRealWorldBenchmark

For detailed accuracy benchmarks on 1,000 real newly registered domains (DKSplit vs WordSegment vs WordNinja vs GPT-5.2), see the Python version benchmark.
The Go and Python versions use the same model and produce identical results.

Results on Intel Core i9-9900K:
- Dataset: 10,000 unique domains (length > 10, no hyphens)
- QPS: 3,175/s

Requirements:
- Go 1.21+
- Linux x64

Links:
- Website: ABTdomain.com
- Use Case: domainkits.com
- Python version: github.com/ABTdomain/dksplit
- Hugging Face: huggingface.co/ABTdomain/dksplit
- Documentation: dksplit.readthedocs.io
- PyPI: pypi.org/project/dksplit

If you find this useful:
- ⭐ Star this repo
- 🐛 Report issues on GitHub Issues

This project is licensed under the Apache License 2.0.
Please attribute as: DKsplit by ABTdomain