Bridging Visual-Textual Modalities: Weakly Supervised Histopathology Segmentation


Sicong Gao (University of New South Wales), Matthew AB Baker (University of New South Wales), Maurice Pagnucco (University of New South Wales), Zhiwei Gao (Hebei Key Laboratory of Electromagnetic Environmental Effects and Information Processing), Yang Song (University of New South Wales)
The 36th British Machine Vision Conference

Abstract

In weakly supervised histopathology tissue segmentation, Class Activation Maps (CAMs) are commonly used to generate pseudo-masks. However, CAMs typically highlight only the most discriminative regions, leading to inaccurate tissue boundaries. Existing visual-based refinement strategies often exacerbate information loss, while text-based methods suffer from high inter-class similarity and a semantic gap between pixel-level features and text labels. To overcome these limitations, we propose a two-stage segmentation framework that jointly models a bidirectional shared latent space between visual and textual modalities to enhance pseudo-mask quality. In the segmentation (second) stage, we incorporate complex tissue textual descriptions as external discriminative knowledge to compensate for insufficient supervision. We further develop a multi-stage modality fusion strategy based on learnable query tokens and Fourier transforms. Experiments conducted on the LUAD-HistoSeg and BCSS-WSSS datasets demonstrate that our method surpasses state-of-the-art weakly supervised tissue segmentation approaches.
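
For readers unfamiliar with the starting point the abstract describes, the following is a minimal sketch of how a generic CAM is turned into a pseudo-mask. It is not the proposed two-stage framework. The function names, the 0.3 threshold, and the -1 background label are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Generic CAM: weight the (C, H, W) feature maps by the (C,) classifier
    weights of one class, sum over channels, and rescale to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)  # keep only positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def pseudo_mask(features, weights, image_labels, threshold=0.3, background=-1):
    """Build an (H, W) pseudo-mask from per-class CAMs.

    weights:      (K, C) classifier weights, one row per tissue class
    image_labels: (K,) binary image-level labels (the weak supervision)
    threshold and background are hypothetical choices for this sketch.
    """
    cams = np.stack([class_activation_map(features, w) for w in weights])
    # Suppress classes that the image-level label says are absent.
    cams = cams * np.asarray(image_labels, dtype=float)[:, None, None]
    mask = cams.argmax(axis=0)
    # Pixels with weak activation for every class are left unlabeled.
    mask[cams.max(axis=0) < threshold] = background
    return mask
```

Because each map is normalised by its own peak, only strongly activated pixels pass the threshold. This is the "most discriminative regions" limitation that motivates refining the pseudo-masks.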

Citation

@inproceedings{Gao_2025_BMVC,
author    = {Sicong Gao and Matthew AB Baker and Maurice Pagnucco and Zhiwei Gao and Yang Song},
title     = {Bridging Visual-Textual Modalities: Weakly Supervised Histopathology Segmentation},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_218/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
