Pure-Rust text embedding inference for local-first applications.
HypEmbed is a Rust library for generating BERT-compatible text embeddings without Python, ONNX Runtime, libtorch, or hosted inference services. Load local model weights, tokenize input, run the encoder, and get normalized vectors from a small API surface.
- Pure Rust from tokenizer to encoder forward pass
- Local-first inference with no external ML runtime dependency
- BERT-family support for common embedding models such as MiniLM
- Correctness-focused math with stable softmax, layer norm, and normalization
- Performance-aware implementation with SIMD primitives, memory-mapped weights, and batch tokenization
- Supports BERT-style encoder models, including BERT, MiniLM, and DistilBERT-style layouts
- Loads `config.json`, `vocab.txt`, and `model.safetensors` from a local model directory
- Offers mean pooling and CLS pooling
- Accepts F32, F16, and BF16 weights, converting to `f32` for inference
- Runs on CPU only
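The two pooling strategies can be sketched in a few lines. This is an illustrative example of what mean and CLS pooling compute, assuming hidden states laid out as one `Vec<f32>` per token; the function names are hypothetical, not HypEmbed's API.

```rust
/// CLS pooling: take the hidden state of the first ([CLS]) token.
fn cls_pool(hidden: &[Vec<f32>]) -> Vec<f32> {
    hidden[0].clone()
}

/// Mean pooling: average each dimension across all tokens.
fn mean_pool(hidden: &[Vec<f32>]) -> Vec<f32> {
    let dim = hidden[0].len();
    let mut out = vec![0.0f32; dim];
    for token in hidden {
        for (o, v) in out.iter_mut().zip(token) {
            *o += v;
        }
    }
    let n = hidden.len() as f32;
    out.iter_mut().for_each(|o| *o /= n);
    out
}

fn main() {
    let hidden = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    assert_eq!(cls_pool(&hidden), vec![1.0, 2.0]);
    assert_eq!(mean_pool(&hidden), vec![2.0, 3.0]);
}
```

Mean pooling averages context from every token and is the common choice for sentence-transformers models; CLS pooling is cheaper but relies on the model having been trained to summarize into the `[CLS]` position.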
HypEmbed does not currently handle training, quantization, GPU execution, or direct Hugging Face Hub downloads.
```shell
cargo add hypembed
```

```rust
use hypembed::{Embedder, EmbeddingOptions, PoolingStrategy};

let model = Embedder::load("./model").unwrap();
let options = EmbeddingOptions::default()
    .with_normalize(true)
    .with_pooling(PoolingStrategy::Mean);

let embeddings = model
    .embed(&["hello world", "rust embeddings"], &options)
    .unwrap();

println!("Embedding dim: {}", embeddings[0].len());
println!("First 5 values: {:?}", &embeddings[0][..5]);
```

To try a complete example locally:

```shell
cargo run --example basic_embed -- ./path/to/model
```

HypEmbed expects a local directory with:
| File | Description |
|---|---|
| `config.json` | Hugging Face-style model configuration |
| `vocab.txt` | BERT WordPiece vocabulary |
| `model.safetensors` | SafeTensors weights |
Example of a compatible model: `sentence-transformers/all-MiniLM-L6-v2`
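Since HypEmbed does not download models itself, it can help to verify the directory layout before loading. A minimal sketch of such a pre-flight check, assuming only the three files listed above; `check_model_dir` is a hypothetical helper, not part of HypEmbed's API.

```rust
use std::path::Path;

/// Return an error naming the first expected model file that is missing.
fn check_model_dir(dir: &Path) -> Result<(), String> {
    for f in ["config.json", "vocab.txt", "model.safetensors"] {
        if !dir.join(f).is_file() {
            return Err(format!("missing: {}", dir.join(f).display()));
        }
    }
    Ok(())
}

fn main() {
    // An empty temp directory is missing all three files.
    let dir = std::env::temp_dir().join("hypembed_check_demo");
    std::fs::create_dir_all(&dir).unwrap();
    assert!(check_model_dir(&dir).is_err());

    // After creating the expected files, the check passes.
    for f in ["config.json", "vocab.txt", "model.safetensors"] {
        std::fs::write(dir.join(f), b"").unwrap();
    }
    assert!(check_model_dir(&dir).is_ok());
}
```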
- Project site: https://neuralforgeone.github.io/hypembed/
- API docs: https://neuralforgeone.github.io/hypembed/api/hypembed/
- Architecture notes: ARCHITECTURE.md
- Product spec: PRODUCT_SPEC.md
- Roadmap: ROADMAP.md
HypEmbed follows a simple pipeline:
```text
input text
  -> pre-tokenize and normalize
  -> WordPiece tokenize
  -> add special tokens, truncate, and pad
  -> embedding layer
  -> encoder stack
  -> mean or CLS pooling
  -> optional L2 normalization
  -> embedding vector
```
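The "add special tokens, truncate, and pad" stage can be illustrated with the standard BERT token ids. This is a sketch under the assumption of the usual `[CLS]`/`[SEP]`/`[PAD]` ids (101, 102, 0); `finalize` is a hypothetical name, not HypEmbed's API.

```rust
const CLS: u32 = 101;
const SEP: u32 = 102;
const PAD: u32 = 0;

/// Wrap token ids in [CLS]...[SEP], truncating the body to leave room for
/// both special tokens, then pad to the fixed sequence length.
fn finalize(mut ids: Vec<u32>, max_len: usize) -> Vec<u32> {
    ids.truncate(max_len - 2);
    let mut out = Vec::with_capacity(max_len);
    out.push(CLS);
    out.extend(ids);
    out.push(SEP);
    out.resize(max_len, PAD);
    out
}

fn main() {
    // Short input: padded out to max_len.
    assert_eq!(finalize(vec![7, 8], 6), vec![101, 7, 8, 102, 0, 0]);
    // Long input: body truncated so [CLS] and [SEP] still fit.
    assert_eq!(finalize(vec![1, 2, 3, 4, 5], 5), vec![101, 1, 2, 3, 102]);
}
```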
The project favors explicit behavior and stable numerics:
- softmax subtracts the row maximum before exponentiation
- layer norm uses epsilon guards
- pooling and vector normalization avoid divide-by-zero edge cases
- typed errors keep load and inference failures inspectable
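The first and third points can be sketched directly. This is an illustrative example of max-subtracted softmax and guarded L2 normalization, not HypEmbed's actual internals (which use SIMD primitives).

```rust
/// Softmax over one attention row, subtracting the row maximum before
/// exponentiation so large logits cannot overflow to infinity.
fn softmax(row: &mut [f32]) {
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let mut sum = 0.0f32;
    for x in row.iter_mut() {
        *x = (*x - max).exp();
        sum += *x;
    }
    for x in row.iter_mut() {
        *x /= sum;
    }
}

/// L2 normalization that leaves an all-zero vector unchanged instead of
/// dividing by zero and producing NaNs.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    // Without max subtraction, exp(1000.0) would overflow to infinity.
    let mut row = [1000.0f32, 1000.0];
    softmax(&mut row);
    assert!((row[0] - 0.5).abs() < 1e-6 && (row[1] - 0.5).abs() < 1e-6);

    let mut v = [3.0f32, 4.0];
    l2_normalize(&mut v);
    assert!((v[0] - 0.6).abs() < 1e-6 && (v[1] - 0.8).abs() < 1e-6);

    let mut zero = [0.0f32; 2];
    l2_normalize(&mut zero); // no NaN: the zero vector is left as-is
    assert_eq!(zero, [0.0, 0.0]);
}
```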
HypEmbed is early-stage but already includes:
- cross-platform CI
- benchmark compilation checks
- generated API documentation
- architecture and roadmap notes in-repo
Licensed under either of:
- Apache License, Version 2.0, see LICENSE-APACHE
- MIT license, see LICENSE-MIT