Qin Ren1, Yufei Wang2,6, Lanqing Guo3, Wen Zhang4, Zhiwen Fan5, Chenyu You1
1Stony Brook University 2Nanyang Technological University 3UT Austin
4Johns Hopkins University 5Texas A&M University 6SparcAI Research
Where should extra inference compute go? Typical test-time scaling (TTS) perturbs or resamples the whole image, even when only a small region is wrong. LoTTS uses quality-aware attention to find those weak regions and runs test-time scaling only there, leaving high-quality pixels untouched. The approach is training-free and searches a much smaller space.
Test-time scaling for diffusion models usually perturbs the entire image, yet quality is often uneven across the canvas.
Defects are typically localized: additional compute is better spent on weak regions than on restarting the whole sample.
LoTTS is training-free: it derives defect masks from attention for localization, then performs masked resampling with consistency controls.
- Localization. Contrast cross-/self-attention under high- vs. low-quality prompts to form a coherent defect mask.
- Resampling. Inject noise and denoise only inside the mask, followed by brief global harmonization.
- Efficiency. Plug-and-play across backbones; matches Best-of-N quality with ~2–4× fewer samples.
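The localization step can be illustrated with a toy NumPy sketch. This is not the project's implementation: the attention maps are synthetic, and `defect_mask` is a hypothetical helper showing the idea of contrasting high- vs. low-quality prompt attention and thresholding the result into a mask.

```python
import numpy as np

def defect_mask(attn_hq, attn_lq, threshold=0.5):
    """Toy quality-aware mask: pixels attending more to the low-quality
    prompt than to the high-quality one are flagged as defective.
    attn_hq / attn_lq are (H, W) attention maps."""
    # Contrast the two maps, then normalize the contrast to [0, 1].
    contrast = attn_lq - attn_hq
    contrast = (contrast - contrast.min()) / (np.ptp(contrast) + 1e-8)
    return contrast > threshold

# Synthetic 8x8 example with one defective corner region.
hq = np.full((8, 8), 0.8)
lq = np.full((8, 8), 0.2)
hq[:3, :3] = 0.1   # high-quality prompt attends weakly to the corner
lq[:3, :3] = 0.9   # low-quality prompt fires on the corner
mask = defect_mask(hq, lq)
print(mask.sum())  # → 9 (only the 3x3 corner is flagged)
```

In the real pipeline the maps would come from the diffusion backbone's cross-/self-attention (e.g. via ConceptAttention), and the raw mask would additionally be cleaned up into a spatially coherent region.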
Overview of LoTTS. Given a text prompt, LoTTS first generates candidate images from different noise seeds. It then localizes defective regions using high-/low-quality prompt contrast and constructs a quality-aware mask. Noise is injected only inside the masked regions, followed by localized denoising with spatial and temporal consistency. A verifier finally selects the best refined sample.
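The overall loop can be sketched end to end with stand-ins for each component. Everything here is illustrative: `masked_resample` fakes the noise-inject-and-denoise step with plain Gaussian noise, and `verifier` is a toy score standing in for reward models like ImageReward or HPSv2.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_resample(image, mask, noise_scale=0.3):
    """Inject noise only inside the defect mask; pixels outside the
    mask are left exactly as they were (the key property of LoTTS)."""
    noisy = image.copy()
    noisy[mask] += noise_scale * rng.standard_normal(mask.sum())
    return np.clip(noisy, 0.0, 1.0)

def verifier(image, target=0.5):
    """Toy verifier: prefers images whose pixels sit near a target
    level. A real pipeline would use a learned reward model."""
    return -np.abs(image - target).mean()

image = np.full((8, 8), 0.5)
image[:2, :2] = 0.0            # a localized defect
mask = image < 0.25            # pretend the attention mask found it

# Localized test-time scaling: resample only the masked region
# several times and keep the candidate the verifier prefers.
candidates = [masked_resample(image, mask) for _ in range(8)]
best = max(candidates, key=verifier)
assert np.allclose(best[~mask], image[~mask])  # untouched outside the mask
```

Because the search happens only inside the mask, each candidate differs from the original in a small region, which is what shrinks the search space relative to whole-image Best-of-N.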
This project builds upon the following excellent open-source works:
- Diffusers — Hugging Face diffusion model library
- ImageReward — Human preference reward model
- HPSv2 — Human Preference Score v2
- ConceptAttention — Attention map extraction
- attention-map-diffusers — Attention map utilities
If you find this work useful, please consider citing:
@article{ren2025lotts,
  title   = {Scale Where It Matters: Training-Free Localized Scaling for Diffusion Models},
  author  = {Ren, Qin and Wang, Yufei and Guo, Lanqing and Zhang, Wen and Fan, Zhiwen and You, Chenyu},
  journal = {arXiv preprint arXiv:2511.19917},
  year    = {2025}
}

This project is released under the MIT License.