Language-Guided Reinforcement Learning for Hard Attention in Few-Shot Learning


Bahareh Nikpour (McGill University), Narges Armanfard (McGill University)
The 35th British Machine Vision Conference

Abstract

Attention mechanisms have demonstrated significant potential in enhancing learning models by identifying key portions of input data, particularly in scenarios with limited training samples. Inspired by human perception, we propose that focusing on essential data segments, rather than the entire dataset, can improve the accuracy and reliability of the learning models. However, identifying these critical data segments, or "hard attention finding," is challenging, especially in few-shot learning, due to the scarcity of training data and the complexity of model parameters. To address this, we introduce LaHA, a novel framework that leverages language-guided deep reinforcement learning to identify and utilize informative data regions, thereby improving both interpretability and performance. Extensive experiments on benchmark datasets validate the effectiveness of LaHA.

Citation

@inproceedings{Nikpour_2025_BMVC,
author    = {Bahareh Nikpour and Narges Armanfard},
title     = {Language-Guided Reinforcement Learning for Hard Attention in Few-Shot Learning},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_1209/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).

Imprint | Data Protection