Modular Embedding Recomposition for Incremental Learning


Aniello Panariello (University of Modena and Reggio Emilia), Emanuele Frascaroli (University of Modena and Reggio Emilia), Pietro Buzzega (University of Modena and Reggio Emilia), Lorenzo Bonicelli (University of Modena and Reggio Emilia), Angelo Porrello (University of Modena and Reggio Emilia), Simone Calderara (University of Modena and Reggio Emilia)
The 36th British Machine Vision Conference

Abstract

The advent of pre-trained Vision-Language Models (VLMs) has significantly transformed Continual Learning (CL), mainly due to their zero-shot classification abilities. Such proficiency makes VLMs well-suited for real-world applications, enabling robust performance on novel unseen classes without requiring adaptation. However, fine-tuning remains essential when downstream tasks deviate significantly from the pre-training domain. Prior CL approaches primarily focus on preserving the zero-shot capabilities of VLMs during incremental fine-tuning on a downstream task. We take a step further by devising an approach that transforms preservation into enhancement of the zero-shot capabilities of VLMs. Our approach, named MoDular Embedding Recomposition (MoDER), introduces a modular framework that trains multiple textual experts, each specialized in a single seen class, and stores them in a foundational hub. At inference time, for each unseen class, we query the hub and compose the retrieved experts to synthesize a refined prototype that improves classification. We show the effectiveness of our method across two popular zero-shot incremental protocols, Class-IL and MTIL, comprising a total of 14 datasets. The codebase is available at https://github.com/aimagelab/mammoth.
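The retrieve-and-compose step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ExpertHub` class, the cosine-similarity retrieval, the softmax weighting, and the `k`, `temperature`, and `alpha` parameters are all hypothetical choices made for illustration, and plain Python lists stand in for the VLM text embeddings. It assumes each seen class contributes one stored expert vector and that an unseen class starts from its zero-shot text embedding.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


class ExpertHub:
    """Hypothetical store mapping each seen class to (key embedding, expert vector)."""

    def __init__(self):
        self._entries = {}

    def add(self, class_name, key, expert):
        self._entries[class_name] = (list(key), list(expert))

    def query(self, text_embedding, k=2):
        """Return the k entries whose keys are most similar to the query."""
        scored = [
            (cosine(text_embedding, key), name, expert)
            for name, (key, expert) in self._entries.items()
        ]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:k]


def compose_prototype(zero_shot, hub, k=2, temperature=0.1, alpha=0.5):
    """Blend the zero-shot embedding with a similarity-weighted mix of experts.

    Assumption: softmax weights over cosine scores and a convex blend
    controlled by alpha; the paper's actual composition rule may differ.
    """
    retrieved = hub.query(zero_shot, k)
    if not retrieved:
        return normalize(zero_shot)
    top = max(score for score, _, _ in retrieved)
    weights = [math.exp((score - top) / temperature) for score, _, _ in retrieved]
    total = sum(weights)
    mixed = [0.0] * len(zero_shot)
    for w, (_, _, expert) in zip(weights, retrieved):
        for i, x in enumerate(expert):
            mixed[i] += (w / total) * x
    base, mixed = normalize(zero_shot), normalize(mixed)
    return normalize([(1 - alpha) * b + alpha * m for b, m in zip(base, mixed)])


# Toy usage with 3-dimensional stand-ins for text embeddings.
hub = ExpertHub()
hub.add("tabby cat", key=[1.0, 0.0, 0.0], expert=[0.9, 0.1, 0.0])
hub.add("sports car", key=[0.0, 1.0, 0.0], expert=[0.0, 1.0, 0.2])
hub.add("lion", key=[0.8, 0.0, 0.6], expert=[0.7, 0.0, 0.7])

unseen = [1.0, 0.0, 0.2]  # zero-shot embedding of an unseen class
prototype = compose_prototype(unseen, hub, k=2)
```

In this toy setup the two feline entries are retrieved for the unseen query and the unrelated one is ignored, so the refined prototype stays close to the zero-shot embedding while borrowing from the nearest seen-class experts.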

Citation

@inproceedings{Panariello_2025_BMVC,
author    = {Aniello Panariello and Emanuele Frascaroli and Pietro Buzzega and Lorenzo Bonicelli and Angelo Porrello and Simone Calderara},
title     = {Modular Embedding Recomposition for Incremental Learning},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_825/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
