Accepted at ICCV 2025

Official PyTorch implementation of SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer.
Project Page | Paper | PST50 Dataset | Model Checkpoint
Quick start:

```bash
# 1. Clone the repository
git clone https://github.com/Ry3nG/SA-LUT.git
cd SA-LUT

# 2. Set up environment (creates conda env, installs dependencies & CUDA extensions)
make setup

# 3. Activate environment
conda activate salut_env

# 4. Download model checkpoint (~208 MB from HuggingFace)
make download-ckpts

# 5. Run inference (interactive CLI)
make inference
```

Requirements:
- Linux system (tested on Ubuntu/CentOS)
- CUDA-capable GPU (recommended) or CPU
- Conda package manager
- CUDA toolkit 11.x or 12.x (for GPU support)
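The GPU-related requirements can be verified up front. The snippet below is a standalone, illustrative check (not part of the repository) that simply reports whether conda and the CUDA toolchain are visible on PATH; a missing GPU is fine, since the CLI also supports CPU mode.

```python
# Illustrative prerequisite check (not part of the repository): report whether
# conda and the CUDA toolchain are available before running `make setup`.
# A missing GPU/toolkit is fine for CPU-only inference (--cpu).
import shutil
import subprocess

for tool in ("conda", "nvcc", "nvidia-smi"):
    path = shutil.which(tool)
    print(f"{tool}: {path or 'not found'}")

if shutil.which("nvcc"):
    # Reports the installed CUDA toolkit version (should be 11.x or 12.x)
    print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)
```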
1. Create Environment

```bash
make setup
```

This will:
- Create a conda environment named `salut_env`
- Install all Python dependencies from `environment.yml`
- Build and install custom CUDA extensions (`quadrilinear_cpp`, `trilinear_cpp`)

2. Activate Environment

```bash
conda activate salut_env
```

3. Download Model Checkpoint

```bash
make download-ckpts
```

Alternatively, manually download from https://huggingface.co/zrgong/SA-LUT and place it in `ckpts/salut_ckpt/`.
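If you prefer to script the download, the checkpoint repository can also be mirrored with `huggingface_hub`, the same library used for the PST50 dataset download below. This is an illustrative sketch, so verify afterwards that the checkpoint file ends up at the path expected by `inference_cli.py`.

```python
# Illustrative alternative to `make download-ckpts`: mirror the checkpoint
# repository from the Hugging Face Hub into ckpts/salut_ckpt/.
# Double-check the resulting file path against ckpts/salut_ckpt/ afterwards.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="zrgong/SA-LUT",
    local_dir="ckpts/salut_ckpt",
)
```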
Run the interactive CLI:

```bash
make inference
```

The CLI offers two modes:
**Mode 1: Single Image Pair**
- Stylize one content image with one style image
- Prompts for:
  - Content image path
  - Style image path
  - Output directory (default: `outputs/`)
- Output is automatically named `content_<name>_style_<name>.<ext>` (see the naming sketch below)
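If you need to locate results programmatically, the output filename can be reconstructed from the two input paths. The snippet below is an illustrative reconstruction of that naming rule, not code from the repository; it assumes the content image's extension is reused, as in the example session later in this README.

```python
# Illustrative: reproduce the CLI's output naming convention
# content_<name>_style_<name>.<ext> from the two input paths.
# Assumption: the output reuses the content image's extension,
# as in the example session (content_1_style_1.png).
from pathlib import Path

def output_name(content_path: str, style_path: str) -> str:
    content, style = Path(content_path), Path(style_path)
    return f"content_{content.stem}_style_{style.stem}{content.suffix}"

print(output_name("data/PST50/content_709/1.png", "data/PST50/paired_style/1.png"))
# -> content_1_style_1.png
```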
**Mode 2: Batch Inference**
- Process multiple image pairs
- Prompts for:
  - Content images directory
  - Style images directory
  - Output directory (default: `outputs/`)
- Requirements (see the validation sketch below):
  - Same number of images in both directories
  - Matching filenames between content and style directories
- Shows a progress bar during processing
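Before starting a long batch run, it can be worth confirming that the two directories really do satisfy these requirements. The check below is a standalone, illustrative pre-flight script; it is independent of the CLI's own validation, and the directory names are placeholders.

```python
# Illustrative pre-flight check: verify that the content and style directories
# contain the same set of image filenames before running batch inference.
import sys
from pathlib import Path

content_dir, style_dir = Path("my_content"), Path("my_styles")  # placeholder paths
exts = {".png", ".jpg", ".jpeg"}

content = {p.name for p in content_dir.iterdir() if p.suffix.lower() in exts}
style = {p.name for p in style_dir.iterdir() if p.suffix.lower() in exts}

if content != style:
    print("Mismatched pairs:")
    print("  only in content:", sorted(content - style))
    print("  only in style:  ", sorted(style - content))
    sys.exit(1)
print(f"OK: {len(content)} matching image pairs")
```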
Single image pair:

```bash
python inference_cli.py \
    --ckpt ckpts/salut_ckpt/epoch=100-step=4127466.ckpt.state.pt
# Then follow interactive prompts
```

Force CPU mode:

```bash
python inference_cli.py --cpu
```

Custom checkpoint:

```bash
python inference_cli.py --ckpt /path/to/custom.ckpt.state.pt
```

Example session:

```
$ make inference
============================================================
SA-LUT Inference CLI
============================================================
Select inference mode:
1. Single image pair
2. Batch inference
Enter your choice (1 or 2): 1
Enter content image path: data/PST50/content_709/1.png
Enter style image path: data/PST50/paired_style/1.png
Enter output directory (default: outputs):
Starting stylization...
Output will be saved as: content_1_style_1.png
------------------------------------------------------------
Stylizing content: data/PST50/content_709/1.png
Using style: data/PST50/paired_style/1.png
Content image size: 1920×1080
...
Stylization complete!
Output saved to: outputs/content_1_style_1.png
```

For batch processing, ensure your files are organized with matching names:

```
my_content/
├── photo1.jpg
├── photo2.jpg
└── photo3.jpg

my_styles/
├── photo1.jpg   # Matches content/photo1.jpg
├── photo2.jpg   # Matches content/photo2.jpg
└── photo3.jpg   # Matches content/photo3.jpg
```
Then run:
```
$ make inference
Select inference mode:
1. Single image pair
2. Batch inference
Enter your choice (1 or 2): 2
Enter content images directory: my_content
Enter style images directory: my_styles
Validating directories...
Validation passed: 3 matching image pairs found
Enter output directory (default: outputs): my_results
Processing 3 image pairs...
Stylizing: 100%|████████████████| 3/3 [00:15<00:00, 5.2s/pair]
Batch stylization complete!
All 3 images saved to: my_results
```

The PST50 dataset is the first benchmark for photorealistic style transfer evaluation.
Download it from the Hugging Face Hub:

```bash
python << EOF
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id='zrgong/PST50',
    repo_type='dataset',
    local_dir='data/PST50'
)
EOF
```

Dataset structure:

```
data/PST50/
├── content_709/     # 50 content images (Rec.709 color space)
├── content_log/     # 50 content images (log color space)
├── paired_gt/       # 50 ground truth stylizations
├── paired_style/    # 50 style references (paired evaluation)
├── unpaired_style/  # 51 style references (unpaired evaluation)
└── video/           # Video sequences for temporal consistency testing
```
All images are numbered 1.png to 50.png for easy pairing.
Evaluation protocols:
- Paired: Compare outputs against `paired_gt/` using LPIPS, PSNR, SSIM, and H-Corr (see the sketch below)
- Unpaired: Use `unpaired_style/` for qualitative assessment
- Video: Test temporal consistency with the video sequences
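A minimal paired-evaluation loop might look like the sketch below. It uses the third-party `lpips` and `scikit-image` packages, assumes stylized outputs were saved as `outputs/1.png` through `outputs/50.png` at the same resolution as the ground truth, and omits H-Corr; these metric implementations are not necessarily the ones used for the paper's reported numbers.

```python
# Illustrative paired evaluation on PST50: LPIPS / PSNR / SSIM against paired_gt/.
# Assumptions: outputs saved as outputs/<i>.png for i in 1..50, same size as GT.
from pathlib import Path

import lpips
import numpy as np
import torch
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

loss_fn = lpips.LPIPS(net="alex")

def to_lpips_tensor(img: np.ndarray) -> torch.Tensor:
    # HWC uint8 -> 1xCxHxW float in [-1, 1], the range expected by lpips
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float() / 127.5 - 1.0

scores = {"LPIPS": [], "PSNR": [], "SSIM": []}
for i in range(1, 51):
    out = np.array(Image.open(Path("outputs") / f"{i}.png").convert("RGB"))
    gt = np.array(Image.open(Path("data/PST50/paired_gt") / f"{i}.png").convert("RGB"))
    with torch.no_grad():
        scores["LPIPS"].append(loss_fn(to_lpips_tensor(out), to_lpips_tensor(gt)).item())
    scores["PSNR"].append(peak_signal_noise_ratio(gt, out, data_range=255))
    scores["SSIM"].append(structural_similarity(gt, out, channel_axis=-1, data_range=255))

for name, vals in scores.items():
    print(f"{name}: {np.mean(vals):.4f}")
```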
Repository layout:

```
SA-LUT/
├── core/                       # Model implementation
│   ├── module/
│   │   ├── model.py            # SA-LUT architecture
│   │   ├── clut4d.py           # 4D LUT operations
│   │   └── interpolation.py    # Interpolation layers
│   └── dataset/
├── ckpts/
│   ├── vgg_normalised.pth      # VGG encoder weights
│   └── salut_ckpt/
│       └── epoch=100-step=4127466.ckpt.state.pt  # Main checkpoint
├── data/
│   └── PST50/                  # Evaluation dataset (download separately)
├── quadrilinear_cpp/           # Custom CUDA extension for 4D interpolation
├── trilinear_cpp_torch1.11/    # Custom CUDA extension for 3D interpolation
├── inference_cli.py            # Main inference script (interactive)
├── download_checkpoints.py     # Checkpoint downloader
├── Makefile                    # Convenience commands
├── environment.yml             # Conda dependencies
└── README.md                   # This file
```
If you use SA-LUT in your research, please cite:
```bibtex
@InProceedings{Gong_2025_ICCV,
    author    = {Gong, Zerui and Wu, Zhonghua and Tao, Qingyi and Li, Qinyue and Loy, Chen Change},
    title     = {SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {18294-18303}
}
```