diffuseR is a functional R implementation of diffusion models, inspired by Hugging Face's Python diffusers library. The package provides a simple, idiomatic R interface to state-of-the-art generative AI models for image generation and manipulation, built on base R and the torch package with no Python dependencies. It currently supports CPU and CUDA devices on Windows and Linux.
First, install torch. Using the pre-built binaries from https://torch.mlverse.org/docs/articles/installation#pre-built is strongly recommended: "The pre-built binaries bundle the necessary CUDA and cudnn versions, so you don't need a global compatible system version of CUDA":
options(timeout = 600) # increasing timeout is recommended since we will be downloading a 2GB file.
# For Windows and Linux: "cpu" and "cu128" are the only kinds currently supported
# For macOS, the supported kinds are "cpu-intel" or "cpu-m1"
kind <- "cu128"
version <- available.packages()["torch","Version"]
options(repos = c(
torch = sprintf("https://torch-cdn.mlverse.org/packages/%s/%s/", kind, version),
CRAN = "https://cloud.r-project.org" # or any other from which you want to install the other R dependencies.
))
install.packages("torch")

You can install the development version of diffuseR from GitHub:
# install.packages("devtools")
devtools::install_github("cornball-ai/diffuseR")
# Or
# install.packages("remotes")
remotes::install_github("cornball-ai/diffuseR")

- Text-to-Image Generation: Create images from textual descriptions
- Image-to-Image Generation: Modify existing images based on text prompts
- Two Models: Stable Diffusion 2.1 and SDXL (fully native R torch implementation)
- Scheduler Options: DDIM (more coming soon)
- Device Support: CPU and CUDA GPUs (including Blackwell RTX 50xx)
- R-native Interface: Functional programming approach that feels natural in R
Warning: The first time you run the code below, it will download ~5.3GB of Stable Diffusion 2.1 CPU-only model files from Hugging Face and load them into memory. Ensure you have enough RAM, disk space, and a stable internet connection; ~8GB of free RAM is recommended for running Stable Diffusion 2.1 on CPU.
options(timeout = 600) # increasing timeout is recommended since we will be downloading a 3.5GB file.
library(diffuseR)
torch::local_no_grad()
# Generate an image from text
cat_img <- txt2img(
prompt = "a photorealistic cat wearing sunglasses",
model = "sd21", # Specify the model to use, e.g., "sd21" for Stable Diffusion 2.1
download_models = TRUE, # Automatically download the model if not already present
steps = 30,
seed = 42,
filename = "cat.png"
)
# Clear out pipeline to free up GPU memory
pipeline <- NULL
torch::cuda_empty_cache()

The UNet is the most computationally intensive part of the model, so it is recommended to run it on a GPU if possible. The decoder and text encoder can be run on the CPU if you have limited GPU memory. SDXL's UNet requires a minimum of 6GB of GPU memory (VRAM), while Stable Diffusion 2.1's requires a minimum of 2GB.
# Increasing timeout is recommended since we will be downloading 5.1GB and 2.8GB model files, among others.
options(timeout = 1200)
library(diffuseR)
torch::local_no_grad() # Prevents torch from tracking gradients, which is not needed for inference
# Assign the various deep learning models to devices
model_name <- "sdxl"
devices <- list(unet = "cuda", decoder = "cpu",
                text_encoder = "cpu", encoder = "cpu")
m2d <- models2devices(model_name = model_name, devices = devices,
unet_dtype_str = "float16", download_models = TRUE)
pipeline <- load_pipeline(model_name = model_name, m2d = m2d, i2i = TRUE,
unet_dtype_str = "float16")
# Generate an image from text
cat_img <- txt2img(
prompt = "a photorealistic cat wearing sunglasses",
model_name = model_name,
devices = devices,
pipeline = pipeline,
num_inference_steps = 30,
guidance_scale = 7.5,
seed = 42,
filename = "cat2.png"
)
gambling_cat <- img2img(
input_image = "cat2.png",
prompt = "a photorealistic cat throwing dice",
img_dim = 1024,
model_name = model_name,
devices = devices,
pipeline = pipeline,
num_inference_steps = 30,
strength = 0.75,
guidance_scale = 7.5,
seed = 42,
filename = "gambling_cat.png"
)
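The `strength` argument controls how far the input image is pushed toward the prompt. As a rough intuition (following the usual convention in Hugging Face's diffusers; diffuseR's internals may differ, and `effective_steps` below is a hypothetical helper, not part of the package), `strength` scales how many of the scheduled denoising steps actually run:

```r
# Hypothetical helper illustrating the common img2img convention:
# strength = 0 keeps the input image essentially unchanged,
# strength = 1 re-noises it completely and runs all scheduled steps.
effective_steps <- function(num_inference_steps, strength) {
  floor(num_inference_steps * strength)
}

effective_steps(30, 0.75)  # 22 denoising steps out of 30 scheduled
```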
# Clear out pipeline to free up GPU memory
pipeline <- NULL
torch::cuda_empty_cache()

Currently supported models:
- Stable Diffusion 2.1
- Stable Diffusion XL (SDXL)
- More coming soon!
Future plans for diffuseR include:
- Inpainting support
- Additional schedulers (PNDM, DPMSolverMultistep, Euler ancestral)
- Text-to-video generation
diffuseR supports two execution modes:
Native R torch (recommended for SDXL): Pure R implementations of the VAE decoder, text encoders, and UNet. Required for Blackwell GPUs (RTX 50xx series) and recommended for best compatibility. Enable with the use_native_* flags, or use txt2img_sdxl(), which defaults to native mode.
TorchScript (legacy): Pre-exported models from PyTorch. Still available for SD21 and older GPUs. Scripts to build TorchScript files are at diffuseR-TS.
No Python dependencies required for either mode.
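For example, the native path can be invoked directly. This is a sketch only: it assumes txt2img_sdxl() accepts the same core arguments as the txt2img() calls above, which is not confirmed by the package documentation here.

```r
library(diffuseR)
torch::local_no_grad()

# Native R torch SDXL generation. Argument names beyond `prompt` are
# carried over from the txt2img() examples above and are assumptions
# about txt2img_sdxl()'s interface. Running this downloads the SDXL
# model files, as in the earlier examples.
img <- txt2img_sdxl(
  prompt = "a photorealistic cat reading a newspaper",
  num_inference_steps = 30,
  seed = 42,
  filename = "cat_native.png"
)
```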
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
- Hugging Face for the original diffusers library
- Stability AI for Stable Diffusion
- The R and torch communities for their excellent tooling






