The 36th British Machine Vision Conference 2025: Keynotes

Philip Torr

University of Oxford

AI for the People

Abstract: As artificial intelligence — and soon robotics — becomes woven into every aspect of our lives, it will rapidly reshape our world in profound ways. This transformation could bring immense benefits, or it could lead us toward a future marked by deepening inequality and the growing concentration of power. As technologists, we have a duty to advocate for a positive path — one in which AI enhances human potential, promotes fairness, and strengthens rather than undermines society. In this talk, Professor Philip Torr examines the mounting social and political challenges posed by AI, and explores how we might steer these technologies toward a fairer, more open, and more human future.

Bio: Professor Philip Torr did his PhD (DPhil) at the Robotics Research Group of the University of Oxford under Professor David Murray of the Active Vision Group. He worked for another three years at Oxford as a research fellow, and still maintains close contact as a visiting fellow there. He left Oxford to work for six years as a research scientist for Microsoft Research, first in Redmond, USA, in the Vision Technology Group, then in Cambridge, founding the vision side of the Machine Learning and Perception Group. He then became a Professor in Computer Vision and Machine Learning at Oxford Brookes University. In 2013, Philip returned to Oxford as a full professor, where he has established the Torr Vision Group. He has won several awards, including the Marr Prize (the highest honour in vision) in 1998. He is a Royal Society Wolfson Research Merit Award Holder. Recently, together with members of his group, he has won several other awards, including an honourable mention at the NIPS 2007 conference for the paper 'P. Kumar, V. Kolmogorov, and P.H.S. Torr, An Analysis of Convex Relaxations for MAP Estimation', in NIPS 21, Neural Information Processing Conference, and (oral) Best Paper at Conference for 'O. Woodford, P.H.S. Torr, I. Reid, and A.W. Fitzgibbon, Global Stereo Reconstruction under Second Order Smoothness Priors', in Proceedings IEEE Conference of Computer Vision and Pattern Recognition, 2008. More recently he has been awarded best science paper at BMVC 2010 and ECCV 2010. He was involved in the algorithm design for Boujou, released by 2D3. Boujou has won a clutch of industry awards, including the Computer Graphics World Innovation Award, the IABM Peter Wayne Award, the CATS Award for Innovation, and a technical Emmy. He then worked closely with this Oxford-based company as well as other companies such as Sony on the Wonderbook project.
He has been involved in numerous spin-outs as founder or advisor, including FiveAI, Onfido, Oxsight, Eigent, DreamTech, Visionary Machines, and CamelAI, and has worked closely with big tech companies such as Google, Meta, Apple, Microsoft, and Sony. He was elected Fellow of the Royal Academy of Engineering (FREng) in 2019 and Fellow of the Royal Society (FRS) in 2021 for contributions to computer vision. In 2021 he was made a Turing AI World-Leading Researcher Fellow.

Marc Pollefeys

ETH Zurich

Spatial AI

Abstract: In this talk we’ll discuss how to build rich 3D representations of the environment to help people and robots perform tasks. We’ll first discuss how to build visual 3D maps of environments and use them for visual (re)localization, spatial data access and navigation. We’ll cover recent methods based on geometry, on learning, and on combinations of both. One of the questions we will consider is what is best learned and where we should use explicit geometric concepts. We’ll also discuss how to build rich 3D semantic representations that enable queries and interactions with the scene. Our approach allows open-vocabulary queries by leveraging foundation models. While these models are very powerful at recognizing arbitrary objects, some aspects needed to enable robotic interactions are still missing. We’ll also briefly cover some of our work on action recognition, which is key to building AI assistants and could also help robots learn from examples.

Bio: Marc Pollefeys is a Professor of Computer Science at ETH Zurich and the Director of the Microsoft Spatial AI Lab in Zurich. He is a Fellow of IEEE, ACM, AAIA and ELLIS, as well as a member of the Academia Europaea. His work has received several prizes and awards, including the Marr Prize and several best paper awards. He obtained his PhD from KU Leuven in 1999 and was a professor at UNC Chapel Hill before joining ETH Zurich. He is best known for his work in 3D computer vision, having been the first to develop a software pipeline to automatically turn photographs into 3D models, but also works on robotics, graphics and machine learning problems. Other noteworthy projects he has worked on are real-time 3D scanning with mobile devices (2013), a real-time pipeline for 3D reconstruction of cities from vehicle-mounted cameras (2007), camera-based self-driving cars and the first fully autonomous vision-based drone (2012). More recently his academic research has focused on combining 3D reconstruction with semantic scene understanding. He served as the program chair for CVPR 2009 and general chair for ECCV 2014 and ICCV 2019 and was the founding president of the European Computer Vision Foundation.

Angela Dai

Technical University of Munich

Can Transformers Speak Geometry?

Abstract: What if generating a 3D mesh were as natural as predicting the next word in a sentence? Autoregressive modeling has rapidly become a unifying learning paradigm across data modalities, from language to images, and now offers a compelling approach for 3D geometry. This talk explores how transformer-based autoregressive models enable mesh generation by representing meshes as sequences. Framing mesh generation as a next-token prediction problem enables new ways to handle the compact, irregular structure of human-designed 3D assets, which are directly compatible with downstream graphics and vision applications. We explore sequence formulation and data representation, and address practical challenges in scaling to high-resolution meshes and interactive synthesis. This will enable more accessible and democratized 3D content creation, paving the way for interactive design, rapid prototyping, and simulation-ready assets, and unlocking new possibilities for both creative and computational exploration of 3D geometry.

Bio: Angela Dai is an Associate Professor at the Technical University of Munich where she leads the 3D AI Lab. Angela's research focuses on understanding how real-world 3D scenes around us can be modeled and semantically understood. Previously, she received her PhD in computer science from Stanford in 2018, advised by Pat Hanrahan, and her BSE in computer science from Princeton in 2013. Her research has been recognized through an ECVA Young Researcher Award, ERC Starting Grant, Eurographics Young Researcher Award, German Pattern Recognition Award, Google Research Scholar Award, and an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention.