LUSD: Localized Update Score Distillation for Text-Guided Image Editing

¹VISTEC ²Siam Commercial Bank ³Faculty of Medicine Siriraj Hospital ⁴Pixiv
*Equal contributions

ICCV 2025

Abstract

While diffusion models show promising results in image editing given a target prompt, achieving both prompt fidelity and background preservation remains difficult. Recent works have introduced score distillation techniques that leverage the rich generative prior of text-to-image diffusion models to solve this task without additional fine-tuning. However, these methods often struggle with tasks such as object insertion. Our investigation of these failures reveals significant variations in gradient magnitude and spatial distribution, making hyperparameter tuning highly input-specific or unsuccessful. To address this, we propose two simple yet effective modifications: attention-based spatial regularization and gradient filtering-normalization, both aimed at reducing these variations during gradient updates. Experimental results show our method outperforms state-of-the-art score distillation techniques in prompt fidelity, improving successful edits while preserving the background. Users also preferred our method over state-of-the-art techniques across three metrics, and by 58-64% overall.

Main Idea

Key Ideas

An implicit editing mask derived from attention features to reduce spatial variation of gradient updates.
A normalization and thresholding mechanism to filter out "counterproductive" gradients with low standard deviation.
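
The two ideas above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max rescaling of the attention map, the threshold `min_std`, and the order of operations (normalize first, then mask) are all assumptions made for illustration. Gradients and attention maps are represented as flat lists of floats to keep the example dependency-free.

```python
import statistics


def attention_to_mask(attn):
    """Min-max rescale a flat attention map to [0, 1] to use as a soft mask.

    Assumption: the page only says the mask is "derived from attention
    features"; min-max rescaling is one simple way to do that.
    """
    lo, hi = min(attn), max(attn)
    if hi == lo:
        # A constant map carries no spatial information; apply no masking.
        return [1.0] * len(attn)
    return [(a - lo) / (hi - lo) for a in attn]


def filter_and_normalize(grad, min_std=1e-3):
    """Reject low-variance gradients, otherwise rescale to unit std.

    `min_std` is a hypothetical threshold, not a value from the paper.
    Returns None when the gradient is filtered out.
    """
    std = statistics.pstdev(grad)
    if std < min_std:
        return None
    return [g / std for g in grad]


def localized_update(grad, attn, min_std=1e-3):
    """Combine both steps: filter/normalize, then weight by the soft mask."""
    normalized = filter_and_normalize(grad, min_std)
    if normalized is None:
        return None  # skip this update step entirely
    mask = attention_to_mask(attn)
    return [g * m for g, m in zip(normalized, mask)]
```

In this sketch a rejected gradient simply means the optimizer skips that step, and positions with low attention receive little or no update, which is one way to keep the background unchanged.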

Goal and Challenge

Given an input source image and a target prompt describing how the image should be modified, our goal is to modify the image to match the prompt. Our method builds upon score distillation sampling with a simple L2 regularization. We address a key challenge: how to modulate variations in gradient magnitudes and their spatial distributions.
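
As a rough illustration of the baseline described above, the following sketch performs one gradient step that combines a score distillation gradient with an L2 regularizer. The page does not give the formula, so several things here are assumptions: that the regularizer penalizes distance to the source image, the names `lr` and `lam`, and the representation of images as flat lists. The argument `sds_grad` is a placeholder for the gradient a diffusion model would supply.

```python
def regularized_step(x, x_src, sds_grad, lr=0.1, lam=0.5):
    """One descent step on: SDS objective + (lam / 2) * ||x - x_src||^2.

    x        : current image (flat list of floats)
    x_src    : source image the edit should stay close to (assumed target
               of the L2 term)
    sds_grad : score distillation gradient w.r.t. x (placeholder input)
    lr, lam  : hypothetical step size and regularization weight
    """
    return [
        xi - lr * (gi + lam * (xi - si))  # gradient of the L2 term is lam * (x - x_src)
        for xi, si, gi in zip(x, x_src, sds_grad)
    ]
```

With a zero distillation gradient, the step only pulls the image back toward the source. This shows why the balance between the two terms is sensitive to the gradient's magnitude: a very large or very small `sds_grad` makes a fixed `lam` either ineffective or dominant.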

[Figure: Standard Deviation of Gradient Magnitudes]

BibTeX

@inproceedings{chinchuthakun2025lusd,
  author    = {Chinchuthakun, Worameth and Saengja, Tossaporn and Tritrong, Nontawat and Rewatbowornwong, Pitchaporn and Khungurn, Pramook and Suwajanakorn, Supasorn},
  title     = {LUSD: Localized Update Score Distillation for Text-Guided Image Editing},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2025}
}

LUSD is a novel score distillation technique for object insertion and image editing tasks. It is based on Stable Diffusion 1.5, requires no additional training, and uses a single configuration.
