1 change: 0 additions & 1 deletion .python-version

This file was deleted.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "splintr"
version = "0.3.0"
version = "0.4.0"
edition = "2021"
description = "Fast Rust BPE tokenizer with Python bindings"
license = "MIT"
43 changes: 28 additions & 15 deletions README.md
@@ -2,7 +2,7 @@

[![Crates.io](https://img.shields.io/crates/v/splintr.svg)](https://crates.io/crates/splintr) [![PyPI](https://img.shields.io/pypi/v/splintr-rs.svg)](https://pypi.org/project/splintr-rs/) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

-**A high-performance BPE tokenizer built in Rust with Python bindings, focused on speed, safety, and resource optimization.**
+**A high-performance BPE tokenizer written in Rust with Python bindings, focused on speed, safety, and resource optimization.**

## The Problem

@@ -35,9 +35,12 @@ pip install splintr-rs
```python
from splintr import Tokenizer

-# Load a pretrained vocabulary
+# Load a pretrained vocabulary (OpenAI)
tokenizer = Tokenizer.from_pretrained("cl100k_base")

+# Or load the Llama 3 tokenizer (Meta) - supports all versions up to Llama 3.3
+# tokenizer = Tokenizer.from_pretrained("llama3")

# Encode text to token IDs
tokens = tokenizer.encode("Hello, world!")
print(tokens) # [9906, 11, 1917, 0]
@@ -56,7 +59,7 @@ print(batch_tokens) # [[9906, 11, 1917, 0], [4438, 527, 499, 30], ...]

```toml
[dependencies]
splintr = "0.3.0"
splintr = "0.4.0"
```

```rust
@@ -88,7 +91,7 @@ let batch_tokens = tokenizer.encode_batch(&texts);

**Built for production:**

-- **Compatible vocabularies** - Supports cl100k_base and o200k_base (OpenAI models), with a familiar API
+- **Compatible vocabularies** - Supports cl100k_base, o200k_base (OpenAI), and the Llama 3 family (Meta), with a familiar API
- **Streaming decoder** - Real-time LLM output display with proper UTF-8 handling; see the sketch after this list
- **54 agent tokens** - Built-in support for chat, CoT reasoning, ReAct agents, tool calling, RAG citations
- **Battle-tested algorithms** - PCRE2 with JIT, Aho-Corasick for special tokens, linked-list BPE
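A minimal sketch of what the streaming-decoder bullet describes, for orientation. The constructor and per-token method names here (`stream_decoder`, `step`) are assumptions for illustration; the only call this diff actually shows is `decoder.flush()`:

```python
from splintr import Tokenizer

tokenizer = Tokenizer.from_pretrained("cl100k_base")

# Hypothetical names: only decoder.flush() appears in this diff.
decoder = tokenizer.stream_decoder()

# Feed token IDs one at a time, as an LLM would emit them. A streaming
# decoder buffers partial multi-byte sequences, so a UTF-8 character
# split across two tokens is printed only once it is complete.
for token_id in tokenizer.encode("Hello, 世界!"):
    print(decoder.step(token_id), end="", flush=True)

print(decoder.flush())  # drain any bytes still buffered at end of stream
```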
@@ -242,7 +245,7 @@ print(decoder.flush())

```python
# Load pretrained model (includes vocabulary and special tokens)
-tokenizer = Tokenizer.from_pretrained("cl100k_base") # or "o200k_base"
+tokenizer = Tokenizer.from_pretrained("cl100k_base") # or "o200k_base", "llama3"

# Load from custom vocabulary file
tokenizer = Tokenizer(
@@ -295,32 +298,42 @@ See the [API documentation](https://docs.rs/splintr) for complete details.

## Supported Vocabularies

-| Vocabulary | Used By | Vocabulary Size | Special Tokens | Import Constant |
-| ------------- | -------------------- | --------------- | -------------- | --------------------- |
-| `cl100k_base` | GPT-4, GPT-3.5-turbo | ~100,000 | 5 + 54 agent | `CL100K_BASE_PATTERN` |
-| `o200k_base` | GPT-4o | ~200,000 | 2 + 54 agent | `O200K_BASE_PATTERN` |
+| Vocabulary | Used By | Vocabulary Size | Special Tokens | Import Constant |
+| ------------- | ----------------------------- | --------------- | -------------- | --------------------- |
+| `cl100k_base` | GPT-4, GPT-3.5-turbo | ~100,000 | 5 + 54 agent | `CL100K_BASE_PATTERN` |
+| `o200k_base` | GPT-4o | ~200,000 | 2 + 54 agent | `O200K_BASE_PATTERN` |
+| `llama3` | Llama 3, 3.1, 3.2, 3.3 (Meta) | ~128,000 | 11 + 54 agent | `LLAMA3_PATTERN` |

**OpenAI standard tokens:**

- **cl100k_base**: `<|endoftext|>`, `<|fim_prefix|>`, `<|fim_middle|>`, `<|fim_suffix|>`, `<|endofprompt|>`
- **o200k_base**: `<|endoftext|>`, `<|endofprompt|>`

+**Meta Llama 3 standard tokens:**
+
+- **llama3**: `<|begin_of_text|>`, `<|end_of_text|>`, `<|start_header_id|>`, `<|end_header_id|>`, `<|eot_id|>`, `<|eom_id|>` (3.1+), `<|python_tag|>` (3.1+), `<|step_id|>` (3.2-Vision), `<|image|>` (3.2-Vision)
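A quick way to see the three vocabularies side by side is to encode the same string with each pretrained tokenizer. This sketch uses only the `from_pretrained` and `encode` calls shown in the Quick Start:

```python
from splintr import Tokenizer

# Same input, three vocabularies: token counts and IDs differ because
# each vocabulary was trained with different BPE merges.
text = "Hello, world!"
for name in ("cl100k_base", "o200k_base", "llama3"):
    tok = Tokenizer.from_pretrained(name)
    ids = tok.encode(text)
    print(f"{name}: {len(ids)} tokens -> {ids}")
```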

### Agent Tokens (54 per model)

-Splintr extends both vocabularies with tokens for building agent systems. See [docs/special_tokens.md](docs/special_tokens.md) for complete documentation.
+Splintr extends all vocabularies with tokens for building agent systems. See [docs/special_tokens.md](docs/special_tokens.md) for complete documentation.

```python
-from splintr import Tokenizer, CL100K_AGENT_TOKENS
+from splintr import Tokenizer, CL100K_AGENT_TOKENS, LLAMA3_AGENT_TOKENS

+# OpenAI models
tokenizer = Tokenizer.from_pretrained("cl100k_base")

# Encode with special tokens
text = "<|think|>Let me reason...<|/think|>The answer is 42."
tokens = tokenizer.encode_with_special(text)

# Access token IDs programmatically
print(CL100K_AGENT_TOKENS.THINK) # 100282
print(CL100K_AGENT_TOKENS.FUNCTION) # 100292

+# Llama 3 models (vocabulary includes all special tokens up to Llama 3.3)
+tokenizer = Tokenizer.from_pretrained("llama3")
+tokens = tokenizer.encode_with_special(text)
+print(LLAMA3_AGENT_TOKENS.THINK) # 128305
+print(LLAMA3_AGENT_TOKENS.FUNCTION) # 128315
+print(LLAMA3_AGENT_TOKENS.BEGIN_OF_TEXT) # 128000 (official Meta token)
+print(LLAMA3_AGENT_TOKENS.IMAGE) # 128256 (official Meta 3.2-Vision token)
```
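A common use of the think tokens is stripping hidden reasoning before showing output to a user. The following is a sketch, not the library's documented API: it assumes a `decode` method exists, and since this diff does not show a constant for the closing tag, it recovers that ID by encoding the literal `<|/think|>`:

```python
from splintr import Tokenizer, CL100K_AGENT_TOKENS

tokenizer = Tokenizer.from_pretrained("cl100k_base")

text = "<|think|>Let me reason...<|/think|>The answer is 42."
tokens = tokenizer.encode_with_special(text)

# Derive the closing-tag ID by encoding the literal tag, rather than
# guessing at a constant name this diff does not show.
think_open = CL100K_AGENT_TOKENS.THINK  # 100282
think_close = tokenizer.encode_with_special("<|/think|>")[0]

# Drop everything between the think tags, inclusive.
if think_open in tokens and think_close in tokens:
    start = tokens.index(think_open)
    end = tokens.index(think_close)
    tokens = tokens[:start] + tokens[end + 1:]

print(tokenizer.decode(tokens))  # "The answer is 42."
```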

| Category | Tokens | Purpose |