PME3D: An Adaptive and Efficient Multi-modal Feature Extraction Plug-in for 3D Object Detection


Tianyi Yu (University of Glasgow), Lei Yu (Beihang University)
The 36th British Machine Vision Conference

Abstract

3D object detection is fundamental to autonomous driving systems, where efficiently fusing multi-modal sensor data remains a critical challenge. Although current approaches predominantly focus on improving detection accuracy through Bird's Eye View representations, they often overlook computational efficiency, a crucial factor for real-world deployment on resource-constrained automotive platforms. To bridge this gap, we propose a lightweight plug-in module that enhances feature fusion efficiency through two key mechanisms: 1) dimensionality inversion of feature extraction outputs, and 2) dynamic selection of camera features for optimal fusion with point cloud data. Our experiments on nuScenes demonstrate that this approach maintains competitive detection performance while significantly reducing computational overhead, offering a practical solution for real-time autonomous driving applications.

Citation

@inproceedings{TianyiYu_2025_BMVC,
author    = {Tianyi Yu and Lei Yu},
title     = {PME3D: An Adaptive and Efficient Multi-modal Feature Extraction Plug-in for 3D Object Detection},
booktitle = {36th British Machine Vision Conference 2025, {BMVC} 2025, Sheffield, UK, November 24-27, 2025},
publisher = {BMVA},
year      = {2025},
url       = {https://bmva-archive.org.uk/bmvc/2025/assets/papers/Paper_310/paper.pdf}
}


Copyright © 2025 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
