Y-Research-SBU/LoTTS


Scale Where It Matters:
Training-Free Localized Scaling for Diffusion Models

Qin Ren1, Yufei Wang2,6, Lanqing Guo3, Wen Zhang4, Zhiwen Fan5, Chenyu You1

1Stony Brook University   2Nanyang Technological University   3UT Austin
4Johns Hopkins University   5Texas A&M University   6SparcAI Research

arXiv Project Page GitHub Hugging Face



Where should extra inference go? Typical test-time scaling (TTS) perturbs or resamples the whole image, even when only a small region is wrong. LoTTS uses quality-aware attention to find those weak regions and runs test-time scaling only there, leaving high-quality pixels fixed. The procedure is training-free and searches a much smaller space.

Overview

Test-time scaling for diffusion models usually perturbs the entire image, yet quality is often uneven across the canvas.

Defects are typically localized: additional compute is better spent on weak regions than on restarting the whole sample.

LoTTS is training-free: attention-derived masks localize defects, and masked resampling with consistency controls refines them.

  • Localization. Contrast cross-/self-attention under quality prompts; form a coherent defect mask.
  • Resampling. Noise injection and denoising inside the mask; brief global harmonization.
  • Efficiency. Plug-and-play across backbones; ~2–4× fewer samples than Best-of-N at matched budgets.
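The localization step above can be sketched as a minimal numpy toy. It assumes per-pixel attention maps under a high-quality and a low-quality prompt have already been extracted; the function names, the threshold `tau`, and the 3×3 dilation (standing in for the paper's coherence step) are all illustrative, not LoTTS's actual API:

```python
import numpy as np

def _norm(a):
    """Min-max normalize to [0, 1] so the two attention maps are comparable."""
    a = a - a.min()
    return a / (a.max() + 1e-8)

def _dilate(mask, iters=1):
    """3x3 binary dilation: connects fragments into a spatially coherent mask."""
    h, w = mask.shape
    for _ in range(iters):
        p = np.pad(mask, 1)
        shifts = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
        mask = np.max(np.stack(shifts), axis=0)
    return mask

def defect_mask(attn_hq, attn_lq, tau=0.5, iters=1):
    """Toy quality-aware mask (illustrative, not the authors' code):
    contrast attention gathered under a high-quality prompt (attn_hq)
    against a low-quality prompt (attn_lq); pixels where the low-quality
    prompt dominates are flagged as defective."""
    contrast = _norm(attn_lq) - _norm(attn_hq)   # > 0 where "low quality" wins
    mask = (contrast > tau).astype(np.float32)
    return _dilate(mask, iters)
```

In a real backbone the maps would come from cross-/self-attention layers of the denoiser; here they are plain 2-D arrays so the contrast-and-threshold idea stands on its own.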

Method


Overview of LoTTS. Given a text prompt, LoTTS first generates candidate images from different noise seeds. It then localizes defective regions using high-/low-quality prompt contrast and constructs a quality-aware mask. Noise is injected only inside the masked regions, followed by localized denoising with spatial and temporal consistency. A verifier finally selects the best refined sample.
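Putting the caption's stages together, a hypothetical skeleton of the loop might look like the following. Every callable (`generate`, `localize`, `denoise_masked`, `verify`) is a placeholder for the corresponding component in the figure, not the authors' actual interface:

```python
import numpy as np

def lotts_refine(prompt, generate, localize, denoise_masked, verify,
                 n_seeds=4, seed=0):
    """Hypothetical LoTTS-style loop (placeholder callables, not the real API):
    generate(prompt, s)              -> candidate image for noise seed s
    localize(img)                    -> binary mask of defective pixels
    denoise_masked(img, mask, noisy) -> image denoised inside the mask
    verify(img)                      -> scalar quality score
    """
    rng = np.random.default_rng(seed)
    # Stage 1: candidate images from different noise seeds.
    candidates = [generate(prompt, s) for s in range(n_seeds)]
    refined = []
    for img in candidates:
        # Stage 2: quality-aware mask over defective regions.
        mask = localize(img)
        # Stage 3: inject noise only inside the mask, then denoise locally.
        noisy = img + mask * rng.normal(size=img.shape)
        refined.append(denoise_masked(img, mask, noisy))
    # Stage 4: the verifier selects the best refined sample.
    return max(refined, key=verify)
```

The shape of the loop is what matters: because resampling touches only the masked pixels, each refinement step is cheaper than regenerating a full candidate, which is where the claimed sample savings over Best-of-N come from.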

Acknowledgements

This project builds upon the following excellent open-source works:

Citation

If you find this work useful, please consider citing:

@article{ren2025lotts,
  title   = {Scale Where It Matters: Training-Free Localized Scaling for Diffusion Models},
  author  = {Ren, Qin and Wang, Yufei and Guo, Lanqing and Zhang, Wen and Fan, Zhiwen and You, Chenyu},
  journal = {arXiv preprint arXiv:2511.19917},
  year    = {2025}
}

License

This project is released under the MIT License.

About

Official Repository for LoTTS
