UFD-KD: Unified Frequency Decoupled Knowledge Distillation


Lu Sihan (Beijing Institute of Technology), Yang Zheng (Institute of Automation, Chinese Academy of Sciences), Jie Liu (Beijing University of Posts and Telecommunications), Zhenghao Xi (Shanghai University of Engineering Science)
The 36th British Machine Vision Conference

Abstract

In this paper, we question whether feature distillation shares a property of logit distillation: its learning focuses mainly on one or a few principal components and easily misses minor but informative features. Intuitively, the distribution of feature maps should be unlike that of logits, whose distribution is naturally imbalanced. Surprisingly, however, we find that the feature distribution becomes extremely imbalanced after applying the discrete cosine transform (DCT), exhibiting attributes similar to those seen in logit distillation. Inspired by this, we propose Unified Frequency Decoupled Knowledge Distillation (UFD-KD), which is designed to address the negative effect of this imbalance in the feature distribution. Specifically, UFD-KD applies the DCT to the spatial and channel dimensions separately, aiming to decouple the features according to their frequency-domain properties. For each dimension, we design a parameterized weighting strategy that emphasizes the minor and sparse features that were previously overshadowed by the principal features. Considering the orthogonality of the two dimensions, we assign them different weights to balance the overall feature alignment. In validation experiments on CIFAR-100, UFD-KD demonstrates excellent performance across 11 teacher-student model pairs, achieving a 3.1% accuracy gain for ResNet50→ResNet18. Furthermore, in large-scale validation experiments on ImageNet-1K, UFD-KD maintains a competitive performance improvement (71.98% for Swin-T→ResNet18). Additionally, we validate the task transfer capability on ADE20K, achieving 36.89% mIoU for DeepLabv3-ResNet101→DeepLabv3-ResNet18.
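
The two ideas in the abstract (energy concentrating in a few DCT coefficients, and re-weighting coefficients so the minor ones are not ignored) can be illustrated with a minimal sketch. This is not the authors' implementation: it works on a single 1-D row of feature values instead of the spatial and channel dimensions of a full feature map, and the function names `dct_ii`, `energy_share`, and `weighted_frequency_loss`, the example values, and the fixed `emphasis` weights are all hypothetical choices made for illustration (the paper describes a parameterized weighting strategy whose form is not given here).

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a 1-D sequence (pure Python, O(n^2))."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def energy_share(coeffs):
    """Fraction of the total squared energy carried by each coefficient."""
    total = sum(c * c for c in coeffs)
    return [c * c / total for c in coeffs]


def weighted_frequency_loss(student, teacher, weights):
    """Mean of per-frequency squared errors, scaled by hypothetical weights."""
    s, t = dct_ii(student), dct_ii(teacher)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, s, t)) / len(s)


# Made-up feature rows, chosen only to show the effect.
teacher_row = [4.0, 4.1, 3.9, 4.2, 4.0, 3.8, 4.1, 4.0]
student_row = [3.0, 4.5, 3.2, 4.4, 3.9, 3.1, 4.6, 3.7]

# Share of energy per frequency: the first (DC) coefficient dominates.
shares = energy_share(dct_ii(teacher_row))

# Assumed weights: down-weight the dominant DC term, keep the rest at 1.
emphasis = [0.1] + [1.0] * (len(teacher_row) - 1)
loss = weighted_frequency_loss(student_row, teacher_row, emphasis)
```

Because this DCT is orthonormal, setting every weight to 1 reduces the loss to the ordinary mean squared error between the two rows; any non-uniform weighting is therefore what shifts attention away from the dominant coefficient and toward the minor ones.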

Citation

@inproceedings{Sihan_2025_BMVC,
author    = {Lu Sihan and Yang Zheng and Jie Liu and Zhenghao Xi},
title     = {UFD-KD: Unified Frequency Decoupled Knowledge Distillation},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_500/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
