Ultra-fast text embeddings with native SIMD acceleration
Features • Languages • Performance • Installation • Documentation
FastEmbed is a cross-platform, multi-language text embedding library providing:
- Blazing fast hash-based embeddings (0.01-1 ms per embedding)
- ONNX Runtime support (1.23.2) for semantic embeddings (all 4 languages)
- 4 native bindings: Node.js, Python, C#, Java
- SIMD optimized assembly code (SSE4/AVX2)
- Easy integration with ML frameworks
- Zero dependencies (self-contained native libraries)
Perfect for real-time semantic search, large-scale text processing, edge deployment, and ML prototyping.
- Hash-based embeddings: Deterministic, fast generation without neural networks
- ONNX-based embeddings: Semantic understanding with ONNX Runtime 1.23.2 (all 4 languages)
- Vector operations: Cosine similarity, dot product, normalization, addition
- Batch processing: Generate multiple embeddings efficiently
- Text similarity: High-level API for semantic comparison
- SIMD acceleration: Hand-optimized x86-64 assembly (SSE4, AVX2)
- Multi-threading: Parallel processing support
- Memory efficient: Minimal memory footprint
- Cross-platform: Windows, Linux, macOS
- ABI compliant: Follows System V ABI for maximum compatibility
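As a rough illustration of how the vector operations and text-similarity features listed above are usually combined for search, here is a pure-Python sketch of a brute-force nearest-neighbour lookup. It does not call FastEmbed: the vectors are hand-written toy values, and `normalize` and `top_k` are hypothetical helper names, not part of the library's API.

```python
import math

def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def top_k(query, index, k=2):
    """Rank pre-normalized vectors by dot product with the normalized query."""
    q = normalize(query)
    scored = [(label, sum(x * y for x, y in zip(q, vec))) for label, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional vectors standing in for real embeddings (assumption).
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

# Normalizing once up front means each query only needs dot products.
index = {label: normalize(vec) for label, vec in corpus.items()}
results = top_k([1.0, 0.05, 0.0], index, k=2)
print([label for label, _ in results])  # ['doc_a', 'doc_b']
```

With unit-length vectors the dot product equals the cosine similarity, which is why the corpus is normalized once instead of on every comparison.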
| Language | Binding | Status | Performance (ONNX, short text) | Install |
|---|---|---|---|---|
| Java | JNI | Ready | 22.5 ms (45 emb/s) | See bindings/java/ |
| Node.js | N-API | Ready | 27.1 ms (37 emb/s) | npm install && npm run build |
| Python | pybind11 | Ready | 28.6 ms (35 emb/s) | pip install . |
| C# | P/Invoke | Ready | 28.5 ms (35 emb/s) | dotnet build |
See bindings/ for detailed integration guides.
Full Benchmarks: See BENCHMARK_RESULTS.md for comprehensive performance data and methodology.
Ultra-fast deterministic embeddings - sub-millisecond performance:
- Performance: ~0.01-0.1 ms per embedding (~27,000 embeddings/sec average)
- SIMD optimized: Consistent performance across text lengths
- Deterministic: Same text always produces same embedding
- Vector operations: Microsecond-scale latency (see Vector Operations section below)
Note: Hash-based embeddings are deterministic and fast, but lack semantic understanding. For semantic search, use ONNX-based embeddings below.
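To make the trade-off in the note above concrete, here is a minimal sketch of one common hash-based scheme (signed feature hashing). It is an assumption for illustration only: FastEmbed's actual hashing algorithm is not described in this document, and `hash_embedding` is a hypothetical name.

```python
import hashlib
import math

def hash_embedding(text, dimension=8):
    """Deterministic toy embedding: hash each token into a signed bucket."""
    vec = [0.0] * dimension
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "little") % dimension  # bucket
        sign = 1.0 if digest[4] % 2 == 0 else -1.0                # +1 or -1
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # L2-normalize unless every token cancelled out (or the text was empty).
    return [x / norm for x in vec] if norm > 0 else vec

a = hash_embedding("machine learning")
b = hash_embedding("machine learning")
print(a == b)  # identical input always yields the identical vector
```

Because the bucket depends only on the token's bytes, synonyms land in unrelated buckets, which is why a scheme like this carries no semantic information.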
All 4 language bindings support ONNX Runtime - semantic understanding with quality embeddings:
| Language | Short (108 chars) | Medium (460 chars) | Long (1574 chars) | Throughput (emb/s) |
|---|---|---|---|---|
| Java | 22.459 ms | 47.361 ms | 110.655 ms | 45 (short) |
| Node.js | 27.144 ms | 53.582 ms | 123.068 ms | 37 (short) |
| Python | 28.569 ms | 51.913 ms | 123.028 ms | 35 (short) |
| C# | 28.502 ms | 54.355 ms | 129.634 ms | 35 (short) |
Key Features:
- Semantic quality: 0.72 similarity for semantically similar texts, 0.59 for different
- Batch processing: 14-40 embeddings/sec (single), 14-17 emb/s (batch 100)
- Memory efficient: 0-0.3 MB overhead per embedding
- Consistent performance across all language bindings (8-45 emb/s depending on text length)
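The throughput column in the ONNX table above follows directly from the per-embedding latency. The short check below reproduces the short-text figures, assuming single-threaded, one-at-a-time processing; the function name is illustrative.

```python
# Short-text ONNX latencies in milliseconds, copied from the table above.
latency_ms = {"Java": 22.459, "Node.js": 27.144, "Python": 28.569, "C#": 28.502}

def throughput_per_second(ms_per_embedding):
    """Throughput implied by a per-item latency when items run sequentially."""
    return 1000.0 / ms_per_embedding

throughput = {lang: round(throughput_per_second(ms)) for lang, ms in latency_ms.items()}
for lang, rate in throughput.items():
    print(f"{lang}: {rate} emb/s")
```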
All bindings achieve microsecond-scale latency with SIMD optimizations (roughly 0.2-3 µs per operation, derived from the throughput figures below):
- Dot Product: ≤ 0.001 ms (1M-5.6M ops/sec)
- Cosine Similarity: ~0.001 ms (750K-2M ops/sec)
- Vector Norm: ≤ 0.001 ms (1.4M-5.7M ops/sec)
- Normalization: 0.001-0.003 ms (350K-885K ops/sec)
Tested on x86_64 (Windows/Linux) with GCC -O3 -march=native, SIMD instructions (AVX2/SSE4)
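For readers who want to compare these figures against their own machine, the following is a minimal timing harness. It measures a pure-Python dot product as a stand-in, so the numbers it prints will be far slower than the native SIMD results above; `ops_per_second` is a hypothetical helper, not part of FastEmbed or its benchmark suite.

```python
import time

def dot(a, b):
    """Plain-Python dot product used only as a measurable workload."""
    return sum(x * y for x, y in zip(a, b))

def ops_per_second(func, *args, repeats=2000):
    """Call func(*args) repeatedly and return the measured call rate."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    elapsed = time.perf_counter() - start
    return repeats / elapsed

vec = [0.5] * 768  # same dimension as the usage examples in this README
rate = ops_per_second(dot, vec, vec)
print(f"{rate:,.0f} ops/sec, {1e6 / rate:.2f} µs per call")
```

To time a binding instead, pass its vector function and two embeddings in place of `dot` and `vec`.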
Windows:
- Visual Studio 2022 Build Tools (with "Desktop development with C++")
- NASM >= 2.14
- Node.js 18+ (for Node.js binding)
- Python 3.7+ (for Python binding)
- .NET SDK 8.0+ (for C# binding)
- JDK 17+ and Maven (for Java binding)
Linux/macOS:
- NASM (assembler) >= 2.14
- C/C++ compiler (GCC 7+ or Clang)
- Make
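Before building on Linux/macOS it can help to confirm that the tools listed above are on `PATH`. This is a small sketch using only the Python standard library; it checks presence, not versions, and the `cc` alias is an assumption that may not exist on every system.

```python
import shutil

# Tool names taken from the Linux/macOS list above; "cc" is an assumed
# generic compiler alias.
REQUIRED_TOOLS = ["nasm", "make"]
COMPILERS = ["gcc", "clang", "cc"]

def check_tools():
    """Map each tool name to its resolved path, or None when not on PATH."""
    return {name: shutil.which(name) for name in REQUIRED_TOOLS + COMPILERS}

def missing_tools(found):
    """List required tools that are absent; any one compiler is enough."""
    missing = [name for name in REQUIRED_TOOLS if found[name] is None]
    if all(found[name] is None for name in COMPILERS):
        missing.append("C compiler (gcc or clang)")
    return missing

found = check_tools()
print("missing:", missing_tools(found) or "nothing")
```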
Windows:
# Build shared library
python scripts\build_native.py
# Or use batch script
scripts\build_windows.bat
Linux/macOS:
# Clone repository
git clone https://github.com/shuanat/fastembed-native.git
cd fastembed-native
# Build shared C/Assembly library
make shared
# Or manually
cd bindings/shared
make all
make shared
cd ../..
macOS (alternative):
# Use Makefile (recommended)
make shared
# Or use cross-platform build script
python scripts/build_native.py
Windows:
# Build shared library first
scripts\build_windows.bat
# Then build all bindings using Makefile
make all
# Or build individually:
cd bindings\nodejs && npm install && npm run build
cd ..\python && python setup.py build_ext --inplace
cd ..\csharp\src && dotnet build
cd ..\..\java\java && mvn compile
Linux/macOS:
# Build all bindings
make all
# Or build individually (see language sections below)
Windows:
cd bindings\nodejs
npm install
npm run build
node test-native.js
Linux/macOS:
cd bindings/nodejs
npm install
npm run build
node test-native.js
const { FastEmbedNativeClient } = require('./lib/fastembed-native');
const client = new FastEmbedNativeClient(768);
const embedding = client.generateEmbedding("machine learning");
console.log(embedding); // Float32Array[768]
Windows:
REM Build shared native library first (required on Windows)
REM This produces embedding_lib.obj and embedding_generator.obj in bindings\shared\build\
scripts\build_windows.bat
REM Alternatively: python scripts\build_native.py
cd bindings\python
pip install pybind11 numpy
python setup.py build_ext --inplace
python test_python_native.py
Note (Windows): the Python extension links against precompiled assembly objects from
bindings\shared\build\embedding_lib.obj and bindings\shared\build\embedding_generator.obj.
If they are missing, build the shared library first using scripts\build_windows.bat
or python scripts\build_native.py.
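The check described in the note can be automated before running `setup.py`. This sketch only tests whether the two object files named above exist; the `repo_root` argument and the `missing_objects` helper are assumptions for illustration, not part of the project's build scripts.

```python
from pathlib import Path

# File names come from the note above; repo_root is an assumption about
# where the repository was cloned.
EXPECTED_OBJECTS = ["embedding_lib.obj", "embedding_generator.obj"]

def missing_objects(repo_root="."):
    """Return the expected object files that are not in bindings/shared/build."""
    build_dir = Path(repo_root) / "bindings" / "shared" / "build"
    return [name for name in EXPECTED_OBJECTS if not (build_dir / name).is_file()]

missing = missing_objects(".")
if missing:
    print("Build the shared library first; missing:", ", ".join(missing))
else:
    print("Precompiled objects found.")
```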
Linux/macOS:
cd bindings/python
pip install pybind11 numpy
python setup.py build_ext --inplace
python test_python_native.py
from fastembed_native import FastEmbedNative
client = FastEmbedNative(768)
embedding = client.generate_embedding("machine learning")
print(embedding.shape) # (768,)
Windows:
cd bindings\csharp\src
dotnet build FastEmbed.csproj
cd ..\tests
dotnet test
Linux/macOS:
cd bindings/csharp/src
dotnet build FastEmbed.csproj
cd ../tests
dotnet test
using FastEmbed;
var client = new FastEmbedClient(dimension: 768);
float[] embedding = client.GenerateEmbedding("machine learning");
Windows:
cd bindings\java\java
mvn compile
cd ..
java -Djava.library.path=target\lib -cp "target\classes;target\lib\*" com.fastembed.FastEmbedBenchmark
Linux/macOS:
cd bindings/java/java
mvn compile
cd ..
java -Djava.library.path=target/lib -cp "target/classes:target/lib/*" com.fastembed.FastEmbedBenchmark
import com.fastembed.FastEmbed;
FastEmbed client = new FastEmbed(768);
float[] embedding = client.generateEmbedding("machine learning");
# Python example
from fastembed_native import FastEmbedNative
client = FastEmbedNative(768)
emb1 = client.generate_embedding("artificial intelligence")
emb2 = client.generate_embedding("machine learning")
similarity = client.cosine_similarity(emb1, emb2)
print(f"Similarity: {similarity:.4f}") # 0.9500+
// Node.js example
const { FastEmbedNativeClient } = require('./lib/fastembed-native');
const client = new FastEmbedNativeClient(768);
const texts = ["AI", "ML", "NLP", "Computer Vision"];
const embeddings = texts.map(text =>
client.generateEmbedding(text)
);
console.log(`Generated ${embeddings.length} embeddings`);
Generate embedding from text.
- Parameters:
  - text (string) - Input text
  - dimension (int) - Embedding dimension (e.g., 768)
- Returns: Float array/vector
Calculate cosine similarity between two vectors.
- Returns: float - Similarity score (-1 to 1)
Calculate dot product of two vectors.
Calculate L2 norm of a vector.
Normalize vector to unit length (L2 normalization).
Element-wise vector addition.
See each binding's README for language-specific API details.
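The vector operations described above have simple mathematical definitions. The pure-Python reference below is a sketch that can be used to sanity-check values returned by a binding; the function names are illustrative and do not necessarily match any binding's API, and the zero-vector handling is a convention chosen here.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def vector_norm(a):
    """L2 norm: square root of the vector's dot product with itself."""
    return math.sqrt(dot_product(a, a))

def normalize(a):
    """Scale to unit length; a zero vector is returned unchanged."""
    n = vector_norm(a)
    return [x / n for x in a] if n > 0 else list(a)

def cosine_similarity(a, b):
    """Dot product divided by the product of the norms, in [-1, 1]."""
    na, nb = vector_norm(a), vector_norm(b)
    return dot_product(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0

def add_vectors(a, b):
    """Element-wise sum."""
    return [x + y for x, y in zip(a, b)]

a, b = [3.0, 4.0], [4.0, 3.0]
print(dot_product(a, b))        # 24.0
print(vector_norm(a))           # 5.0
print(normalize(a))             # [0.6, 0.8]
print(cosine_similarity(a, b))  # 0.96
print(add_vectors(a, b))        # [7.0, 7.0]
```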
fastembed/
├── bindings/
│   ├── shared/           # C/Assembly core library
│   │   ├── src/          # Assembly + C implementation
│   │   ├── include/      # Public headers
│   │   └── Makefile      # Build configuration
│   ├── nodejs/           # Node.js N-API binding
│   ├── python/           # Python pybind11 binding
│   ├── csharp/           # C# P/Invoke binding
│   └── java/             # Java JNI binding
├── scripts/              # Build automation scripts
├── docs/                 # Documentation
├── tests/                # Integration tests
├── Makefile              # Root build system
└── README.md             # This file
# Root directory
make all # Build shared library + all bindings
make shared # Build shared library only
make nodejs # Build Node.js binding
make python # Build Python binding
make csharp # Build C# binding
make java # Build Java binding
make test # Run all tests
make clean # Clean build artifacts
Windows: Full native build support with Visual Studio.
# Build shared library first
scripts\build_windows.bat
# Then build all bindings using Makefile
make all
# Run all tests
make test
Linux/macOS: Use Makefile.
make all # Build all
make test # Run tests
macOS (alternative):
# Use Makefile (recommended)
make all
# Or use cross-platform build script
python scripts/build_native.py
See bindings/shared/README.md for detailed build instructions.
Windows:
# Build and run all tests
# Use CMake directly (see docs/BUILD_CMAKE.md)
cd bindings\shared
mkdir build_cmake
cd build_cmake
cmake ..
cmake --build . --config Release
# Already in bindings\shared\build_cmake, so run the tests from here
ctest -C Release
# Or run individual tests
Release\test_hash_functions.exe
Release\test_embedding_generation.exe
Release\test_sqrt_quality.exe
Release\benchmark_improved.exe
Linux/WSL:
# Build and run all tests
# Use CMake directly (see docs/BUILD_CMAKE.md)
cd bindings/shared
mkdir -p build_cmake
cd build_cmake
cmake ..
cmake --build .
# Already in bindings/shared/build_cmake, so run the tests from here
ctest
# Or run individual tests
./test_hash_functions
./test_embedding_generation
./test_sqrt_quality
./benchmark_improved
See: tests/README_TESTING.md for detailed testing guide.
Windows:
# Run all tests using Makefile
make test
# Or test individually
cd bindings\nodejs && node test-native.js
cd ..\python && python test_python_native.py
cd ..\csharp\tests && dotnet test
cd ..\..\java\java && mvn test
Linux/macOS:
# Test all bindings
make test
# Or test individually
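# (each command runs in a subshell, so every path is relative to the repository root)
(cd bindings/nodejs && node test-native.js)
(cd bindings/python && python test_python_native.py)
(cd bindings/csharp/tests && dotnet test)
(cd bindings/java/java && mvn test)
| Document | Description |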
|---|---|
| bindings/shared/README.md | Shared C/Assembly library |
| bindings/nodejs/README.md | Node.js binding guide |
| bindings/python/README.md | Python binding guide |
| bindings/csharp/README.md | C# binding guide |
| bindings/java/README.md | Java binding guide |
| docs/ARCHITECTURE.md | System architecture and design |
| docs/API.md | Complete API reference |
| CONTRIBUTING.md | Contribution guidelines |
| CHANGELOG.md | Version history and release notes |
graph TB
subgraph App["Application Layer"]
NodeJS["Node.js"]
Python["Python"]
CSharp["C#"]
Java["Java"]
end
subgraph Bind["Language Binding Layer"]
NAPI["N-API"]
PyBind["pybind11"]
PInvoke["P/Invoke"]
JNI["JNI"]
end
subgraph CLib["FastEmbed C Library"]
HashAPI["Hash-based<br/>Embeddings"]
VecAPI["Vector<br/>Operations"]
ONNXAPI["ONNX<br/>Runtime"]
end
subgraph Asm["Assembly Layer"]
SIMD["SIMD Optimized<br/>SSE4/AVX2<br/>x86-64"]
end
NodeJS --> NAPI
Python --> PyBind
CSharp --> PInvoke
Java --> JNI
NAPI --> CLib
PyBind --> CLib
PInvoke --> CLib
JNI --> CLib
CLib --> Asm
style App fill:#e1f5ff
style Bind fill:#fff4e1
style CLib fill:#e8f5e9
style Asm fill:#fce4ec
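In the diagram above, each binding is a thin wrapper that loads a native library and forwards calls to it. The POSIX-only sketch below shows that pattern with Python's `ctypes` and the C standard library's `strlen` as a stand-in. This is an analogy only: FastEmbed's Python binding uses pybind11 rather than `ctypes`, and none of its exported symbols appear here.

```python
import ctypes
import ctypes.util

# Load the C standard library as a stand-in for a native core library.
# find_library may return None; on Linux CDLL(None) still exposes libc symbols.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def native_length(text):
    """Binding-layer wrapper: encode the Python string and call into C."""
    return libc.strlen(text.encode("utf-8"))

print(native_length("machine learning"))  # 16
```

The wrapper function plays the role of the language binding layer, and the declared `argtypes` and `restype` correspond to the type marshalling that N-API, pybind11, P/Invoke, and JNI each perform in their own way.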
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
- Bug fixes and stability improvements
- New language bindings (Go, Rust, Ruby)
- Documentation improvements
- Performance optimizations
- Test coverage expansion
- Use case examples
Dual-licensed under AGPL-3.0 and a Commercial License:
- Open Source: see LICENSE
- Commercial licensing (closed source/SaaS): see LICENSING.md and LICENSE-COMMERCIAL.md
Built with:
- NASM - Netwide Assembler
- Node-API - Node.js native bindings
- pybind11 - Python C++ bindings
- P/Invoke - .NET native interop
- JNI - Java Native Interface
- Documentation
- Issue Tracker (GitHub Issues)
- Commercial License Requests: open a GitHub Issue using the "License Request" template
Made with ❤️ for developers who need fast, reliable embeddings
Star us on GitHub if you find this useful!