From Latin "artifex": craftsman, artist, maker.
**⚠️ Major Refactoring in Progress**

Artifex is still in a heavy rebuild cycle. Stability is not guaranteed, and breaking changes are expected between commits.
| Area | Status | Current expectation |
|------|--------|---------------------|
| API surface | Unstable | Public interfaces can change without deprecation while runtime boundaries are still being simplified. |
| Performance | In progress | Optimization work is ongoing across training, inference, and benchmark paths. Do not assume current throughput or memory behavior is final. |
| Feature breadth | Expanding | Core model families ship today, but additional capabilities, deeper integrations, and broader examples are still being added. |
| Docs and workflows | Maintained but evolving | Checked-in installation, quickstart, examples, and contributor workflows are kept aligned with the live runtime, while broader documentation continues to be revised. |

Artifex is suitable for active research and repository development. It is not in a state where long-term API stability or production guarantees should be assumed.
Artifex is a modular library for generative modeling research. It provides implementations of state-of-the-art generative models with a focus on modularity, type safety, and scientific reproducibility. Built on JAX and Flax NNX, it emphasizes clean abstractions and an extensible design for research experimentation.
- Research First: Designed for experimentation with clean, modular architecture
- Modern Stack: Built on JAX/Flax NNX with full JIT compilation and automatic differentiation
- Typed Surfaces: Protocol-based design with Pyright-checked source interfaces
- Multi-Modal: Unified interface across images, text, audio, proteins, and more
- Extensible: Easy to add new models, losses, and domain-specific constraints
- Actively Verified: Blocking CI enforces repository contracts, packaging checks, and focused test suites
Artifex prioritizes:
- Modularity: Easy to swap components and experiment
- Clarity: Clean, readable implementations over clever optimizations
- Extensibility: Simple to add new models and functionality
- Reproducibility: Deterministic with clear configuration management
- Type Checking: Pyright basic-mode reports track the supported source surface while repo-wide blocking enforcement is still being rebuilt
- JAX Native: Leverages JAX's functional programming paradigm
- Flax NNX: Modern object-oriented API for neural networks
- Configuration Management: Frozen dataclass configs with validation
- Testing: Blocking CI enforces repository contracts and a 70% repo-wide coverage floor, while new changes target 80% coverage
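The frozen-dataclass configuration pattern mentioned above can be sketched in plain Python. This is a minimal illustration of the idea, not artifex's actual API: the `LatentConfig` name and its fields are hypothetical.

```python
from dataclasses import dataclass


# Illustrative sketch of the frozen-dataclass config pattern.
# The class name and fields are hypothetical, not artifex's real config API.
@dataclass(frozen=True)
class LatentConfig:
    latent_dim: int = 32
    beta: float = 1.0

    def __post_init__(self) -> None:
        # Validate eagerly so a bad config fails at construction time,
        # not deep inside a training run.
        if self.latent_dim <= 0:
            raise ValueError(f"latent_dim must be positive, got {self.latent_dim}")
        if self.beta < 0:
            raise ValueError(f"beta must be non-negative, got {self.beta}")


cfg = LatentConfig(latent_dim=16)
# Because the dataclass is frozen, attempting `cfg.latent_dim = 8`
# raises dataclasses.FrozenInstanceError.
```

Freezing plus eager validation gives deterministic, hashable configs that are safe to share across experiment runs.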
See Design Philosophy for detailed discussion.
- VAE Family: VAE, β-VAE, VQ-VAE, Conditional VAE
- GAN Family: DCGAN, WGAN, StyleGAN, CycleGAN, PatchGAN
- Diffusion Models: DDPM, DDIM, Score-based models, DiT, Latent Diffusion
- Normalizing Flows: RealNVP, Glow, MAF, IAF, Neural Spline Flows
- Energy-Based Models: Langevin dynamics, MCMC sampling with BlackJAX
- Autoregressive Models: PixelCNN, WaveNet, Transformer-based
- Geometric Models: Point clouds, meshes, protein structures, SE(3) molecular flows
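To make the energy-based entry above concrete, here is a minimal, library-agnostic sketch of unadjusted Langevin dynamics on a toy 1-D quadratic energy. It only illustrates the sampling idea; artifex itself delegates MCMC to BlackJAX, and none of these function names come from its API.

```python
import math
import random


def grad_energy(x: float) -> float:
    # dE/dx for the toy energy E(x) = x^2 / 2, whose Gibbs
    # distribution exp(-E) is the standard normal.
    return x


def langevin_sample(steps: int = 1000, step_size: float = 0.1, seed: int = 0) -> float:
    # Unadjusted Langevin dynamics:
    #   x_{t+1} = x_t - eta * grad E(x_t) + sqrt(2 * eta) * N(0, 1)
    rng = random.Random(seed)
    x = 5.0  # start far from the mode at 0
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step_size * grad_energy(x) + math.sqrt(2.0 * step_size) * noise
    return x


sample = langevin_sample()
```

After enough steps the chain forgets its initialization and samples concentrate around the energy's mode, which is the behavior the energy-based models exploit at scale.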
- Image: Multi-scale architectures, various loss functions, quality metrics
- Text: Tokenization, language modeling, text generation
- Audio: Spectral processing, waveform generation, WaveNet
- Protein: Structure generation with physical constraints
- Tabular: Mixed data types, privacy-preserving generation
- Timeseries: Sequential patterns, temporal dynamics
- Multi-Modal: Cross-modal generation and alignment
- Unified Configuration: Frozen dataclass configs with nested validation
- Protocol-Based Design: Clear interfaces for models, trainers, and data
- Modular Losses: Composable loss functions (reconstruction, adversarial, perceptual)
- Flexible Sampling: Multiple sampling strategies (ancestral, MCMC, ODE/SDE)
- Extension System: Domain-specific constraints and functionality
- Evaluation Framework: Standardized metrics and benchmarks with CalibraX-aligned composition
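The protocol-based design and composable-loss ideas above can be sketched in a few lines of plain Python. The `Loss` protocol and the helper names here are illustrative stand-ins, not artifex's actual interfaces.

```python
from typing import Protocol


class Loss(Protocol):
    # Structural interface: anything callable with this signature is a Loss.
    def __call__(self, pred: float, target: float) -> float: ...


def l2_loss(pred: float, target: float) -> float:
    return (pred - target) ** 2


def l1_loss(pred: float, target: float) -> float:
    return abs(pred - target)


def compose(*parts: tuple[float, Loss]) -> Loss:
    # Weighted sum of losses: the pattern behind "composable loss functions"
    # such as reconstruction + adversarial + perceptual terms.
    def total(pred: float, target: float) -> float:
        return sum(w * loss(pred, target) for w, loss in parts)

    return total


recon_plus_reg = compose((1.0, l2_loss), (0.1, l1_loss))
value = recon_plus_reg(2.0, 0.0)  # 1.0 * 4.0 + 0.1 * 2.0 = 4.2
```

Because `Loss` is a `typing.Protocol`, new terms plug in by matching the call signature, with no inheritance required.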
```bash
# Package users
pip install artifex

# Optional Linux NVIDIA GPU support
pip install "artifex[cuda12]"
```

If you are contributing from a source checkout instead:
```bash
git clone https://github.com/avitai/artifex.git
cd artifex

# Run setup script (creates .venv, syncs extras, chooses a backend policy)
./setup.sh

# Activate the environment (must use 'source')
source ./activate.sh
```

The setup script automatically:
- Detects an appropriate backend policy
- Creates a virtual environment with uv
- Syncs the right extras for CPU, CUDA 12, or Metal development
- Writes a generated `.artifex.env` file and leaves `.env` for user-owned overrides
- Re-sourcing `activate.sh` refreshes the managed backend state before applying user overrides
For an explicit choice, use `./setup.sh --backend cpu`, `./setup.sh --backend cuda12`, or `./setup.sh --backend metal`.
If you need to rebuild from scratch, use `./setup.sh --recreate`. If you also want to clear repo-local test and coverage artifacts without touching user-owned `.env` files, use `./setup.sh --force-clean`.
For detailed package-user and source-checkout options, see the Installation Guide.
The primary onboarding path is the live VAE quickstart under `docs/getting-started/quickstart.py` and `docs/getting-started/quickstart.ipynb`. It trains a VAE on MNIST with `TFDSEagerSource`, `VAETrainer`, and `train_epoch_staged`.
```python
from datarax.sources import TFDSEagerSource
from datarax.sources.tfds_source import TFDSEagerConfig

from artifex.generative_models.core.configuration import DecoderConfig, EncoderConfig, VAEConfig
from artifex.generative_models.models.vae import VAE
from artifex.generative_models.training import train_epoch_staged
from artifex.generative_models.training.trainers import VAETrainer, VAETrainingConfig
```

From a source checkout, run the maintained quickstart pair directly:
```bash
uv run python docs/getting-started/quickstart.py
uv run jupyter lab docs/getting-started/quickstart.ipynb
```

For the full walkthrough, see the Quickstart Guide.
- Installation Guide - Environment setup, backend policy, and package installation
- Quickstart Guide - VAE-first onboarding on MNIST
- Core Concepts - Architecture, configuration, and runtime model
- Examples Catalog - Executable and documented example inventory
- Benchmarks - Evaluation suites and benchmark guidance
- Model Guides - User-facing guides across model families
- Core API - Core runtime and protocol surfaces
- Models API - Model-family API reference
- Training API - Training and optimization surfaces
- Contributing Guide - Setup, workflow, and contribution expectations
- Testing Guide - Supported pytest workflow and backend guidance
- Example Documentation Design - Reader-facing example standards
- Planned Modules - Areas that remain intentionally unshipped or planned
Artifex keeps the public package surface relatively small at the top level and concentrates most runtime code under `artifex.generative_models`.
```text
artifex/
├── src/artifex/
│   ├── benchmarks/          # Benchmark foundations, adapters, datasets, and suites
│   ├── cli/                 # Supported `artifex` command-line entrypoint
│   ├── configs/             # Checked-in config defaults and loader utilities
│   ├── data/                # Shared data helpers and retained dataset surfaces
│   ├── generative_models/
│   │   ├── core/            # Configuration, protocols, losses, layers, sampling, evaluation
│   │   ├── extensions/      # Audio, chemical, NLP, protein, and vision extensions
│   │   ├── factory/         # Canonical model creation surface
│   │   ├── inference/       # Inference and optimization helpers
│   │   ├── modalities/      # Image, text, audio, protein, tabular, timeseries, multimodal
│   │   ├── models/          # VAE, GAN, diffusion, flow, energy, autoregressive, geometric
│   │   ├── scaling/         # Distributed and scaling helpers
│   │   ├── training/        # Loops, callbacks, optimizers, schedulers, RL, trainers
│   │   ├── utils/           # Logging, JAX helpers, visualization, analysis utilities
│   │   └── zoo/             # Checked-in model zoo configs
│   ├── utils/               # Shared package utilities
│   └── visualization/       # Public visualization helpers
├── docs/                    # User, API, and contributor documentation
├── examples/                # Executable scripts and notebook pairs
└── tests/                   # Package, integration, unit, and repo-contract coverage
```
See Architecture Overview for more detail.
```bash
# Standard test suite
uv run pytest

# Focused contract checks
uv run pytest tests/artifex/repo_contracts -q

# Docs validation
uv run python scripts/validate_docs.py --check-only --config-path mkdocs.yml --docs-path docs --src-path src
```

```bash
# Run the repository hooks
uv run pre-commit run --all-files

# Targeted quality tools
uv run ruff check src tests
uv run ruff format src tests
uv run pyright
```

See Testing Guide and Contributing Guide for the maintained contributor workflow.
Artifex is in active alpha development.
- Checked-in installation, onboarding, example, and contributor guides are maintained against the live runtime.
- Blocking CI enforces repository contracts and build verification.
- Quality and security workflows remain reviewed but informational while broader release hardening continues.
- Package surfaces can still evolve between commits when a simpler or more truthful runtime design requires it.
Use the Installation Guide, Quickstart Guide, Testing Guide, and Planned Modules as the current source of truth for supported workflows.
Artifex accepts contributions through the standard repository workflow.
- Clone the repository and run `./setup.sh`.
- Activate the environment with `source ./activate.sh`.
- Create a feature branch for the change.
- Add or update tests and documentation with the code change.
- Run `uv run pytest` and `uv run pre-commit run --all-files`.
- Open a Pull Request.
See the Contributing Guide for the full contributor checklist and coding expectations.
If you use Artifex in research, please cite:
```bibtex
@software{artifex_2025,
  title = {Artifex: Generative Modeling Research Library},
  author = {Shafiei, Mahdi and contributors},
  year = {2025},
  url = {https://github.com/avitai/artifex},
  version = {0.1.0}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
Artifex builds on several strong open-source projects:
- JAX - Numerical computing and transformations
- Flax - Neural network modules with NNX support
- Optax - Optimization utilities
- Orbax - Checkpointing
- BlackJAX - MCMC and energy-based sampling
- CalibraX - Evaluation and benchmark composition
- DataRax - Dataset and source adapters used in onboarding workflows