AI sovereignty for the JVM.
The JVM powers global finance, big data, and mission-critical infrastructure. Quixotic provides the core building blocks for running LLM inference natively on the JVM: model loading, tokenization, and tensor operations, with native-performance CPU/GPU backends where needed. No external services, no Python interop, no ONNX bridges.
- Write Once, Accelerate Everywhere - A single Tensor API across Panama, C, CUDA, HIP, Metal, OpenCL, and Mojo. Switch backends with one line.
- GraalVM Native Image - First-class support for small footprint and fast startup.
- JVM-Native Architecture - Built from first principles for the JVM. No Python dependencies, no external runtimes.
- On-Device LLM Inference - Run large language models locally with quantization and efficient memory management.
- Vector Embeddings - Fast vector operations for RAG pipelines and semantic search.
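To make the vector-embeddings point concrete, here is a plain-Java sketch of cosine similarity, the core scoring step behind semantic search and RAG retrieval. It is a standalone illustration and does not use Quixotic's actual API:

```java
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Standalone illustration of the math only; not the Quixotic API.
public class CosineSimilarity {
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f, 1f};
        float[] doc   = {1f, 0f, 1f};
        System.out.println(cosine(query, doc)); // identical vectors -> 1.0
    }
}
```

In a real pipeline the loop body is exactly what a tensor backend accelerates: the same reduction, vectorized on CPU or offloaded to GPU.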
| Module | Description |
|---|---|
| jota | Tensor engine with CPU/GPU backends |
| toknroll | TikToken-compatible, BPE, and common LLM tokenizers |
| gguf | Pure Java read/write for llama.cpp's GGUF model format |
| safetensors | Pure Java read/write for HuggingFace's Safetensors format |
The tensor engine supports multiple backends, packaged as separate artifacts:
| Backend | Artifact | Runtime Dependencies |
|---|---|---|
| Java (Panama) | jota-backend-panama | Any JVM (not Native Image compatible) |
| C | jota-backend-c | gcc or clang |
| CUDA | jota-backend-cuda | NVIDIA driver + nvcc |
| HIP | jota-backend-hip | ROCm + hipcc |
| Metal | jota-backend-metal | Xcode CLI tools (xcrun) |
| OpenCL | jota-backend-opencl | OpenCL ICD runtime |
| Mojo | jota-backend-mojo | mojo CLI + ROCm runtime (experimental) |
Just include the backend JAR on the classpath and it becomes available automatically. No -Djava.library.path required.
For GraalVM Native Image, add jota-graal.
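Selecting a backend then reduces to a dependency declaration. A hypothetical Gradle sketch, where the group ID and version are placeholders (check the project's published coordinates before copying):

```gradle
dependencies {
    // Placeholder coordinates for illustration only.
    implementation("ai.quixotic:jota:<version>")               // tensor engine core
    implementation("ai.quixotic:jota-backend-cuda:<version>")  // swap for -panama, -metal, etc.
}
```

Swapping the backend artifact is the "one line" change from the feature list: the engine discovers whichever backend JARs are on the classpath at runtime.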