UDT : Unsupervised Discovery of Transformations between Fine-Grained Classes in Diffusion Models


Youngjae Choi (Soongsil University), Hyunsuh Koh (Soongsil University), Hojae Jeong (Soongsil University), ByungKwan Chae (Soongsil University), Sungyong Park (Soongsil University), Heewon Kim (Soongsil University)
The 36th British Machine Vision Conference

Abstract

Diffusion models achieve impressive image synthesis, yet unsupervised methods for latent space exploration remain limited in fine-grained class translation: existing approaches often produce low-diversity outputs within parent classes or inconsistent child-class mappings across images. We propose UDT (Unsupervised Discovery of Transformations), a framework that incorporates hierarchical structure into unsupervised direction discovery. UDT leverages parent-class prompts to decompose predicted noise into class-general and class-specific components, ensuring translations remain within the parent domain while enabling disentangled child-class transformations. A hierarchy-aware contrastive loss further enforces consistency, with each direction corresponding to a distinct child class. Experiments on dogs, cats, birds, and flowers show that UDT outperforms state-of-the-art methods both qualitatively and quantitatively. Moreover, UDT supports controllable interpolation, allowing for the smooth generation of intermediate classes (e.g., mixed breeds). These results demonstrate UDT as a general and effective solution for fine-grained image translation. Our project website is available at: https://ssu-reality-lab.github.io/UDT.
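To make the idea of splitting predicted noise into class-general and class-specific components more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the class-general component is simply the noise predicted under a parent-class prompt and the class-specific component is the residual; the function names (`decompose_noise`, `interpolate_specific`), the array shapes, and the random stand-in tensors are all hypothetical. It also illustrates how blending two class-specific components could yield an intermediate class.

```python
import numpy as np


def decompose_noise(eps_conditioned, eps_parent):
    """Split a predicted noise tensor into two parts.

    Assumption (not from the paper): the class-general part is the noise
    predicted under the parent-class prompt, and the class-specific part
    is whatever remains after subtracting it.
    """
    eps_general = eps_parent
    eps_specific = eps_conditioned - eps_parent
    return eps_general, eps_specific


def interpolate_specific(eps_general, eps_specific_a, eps_specific_b, alpha):
    """Blend two class-specific components on top of a shared general part.

    alpha = 0 reproduces child class A, alpha = 1 reproduces child class B,
    and values in between correspond to an intermediate class.
    """
    return eps_general + (1.0 - alpha) * eps_specific_a + alpha * eps_specific_b


# Random stand-ins for noise predictions; a real pipeline would obtain
# these from a diffusion model's denoising network.
rng = np.random.default_rng(0)
shape = (4, 8, 8)  # hypothetical latent shape (channels, height, width)
eps_parent = rng.standard_normal(shape)
eps_child_a = eps_parent + 0.1 * rng.standard_normal(shape)
eps_child_b = eps_parent + 0.1 * rng.standard_normal(shape)

general, spec_a = decompose_noise(eps_child_a, eps_parent)
_, spec_b = decompose_noise(eps_child_b, eps_parent)

# A halfway blend, loosely analogous to a "mixed breed".
mixed = interpolate_specific(general, spec_a, spec_b, alpha=0.5)
```

Because the general component is shared, any blend stays anchored to the parent-class prediction while only the residual changes, which is the intuition behind keeping translations inside the parent domain.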

Citation

@inproceedings{Choi_2025_BMVC,
author    = {Youngjae Choi and Hyunsuh Koh and Hojae Jeong and ByungKwan Chae and Sungyong Park and Heewon Kim},
title     = {UDT : Unsupervised Discovery of Transformations between Fine-Grained Classes in Diffusion Models},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_784/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
