
Humans rely on stereo vision and motion parallax to estimate depth in their near surroundings. However, these cues become weaker as depth increases. As a result, humans depend heavily on monocular cues when estimating depth in the far range.

Depth estimation is one of the central challenges of monocular 3D reconstruction. Computer vision algorithms for 3D reconstruction from monocular images have advanced substantially over the past few years. However, depth estimation in the far range still suffers from poor accuracy, which can be partly attributed to the insufficient cues used by current approaches. Moreover, the benchmarking procedure for these algorithms has remained largely unchanged, relying on simple metrics and sparse LiDAR data. This limits insight into the performance of each method, especially where the ground truth is incorrect.
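To make "simple metrics and sparse LiDAR data" concrete, the following is a minimal sketch of three error measures commonly reported for depth estimation (absolute relative error, RMSE and the δ < 1.25 accuracy), computed only over pixels that have a ground-truth value. The function name `depth_metrics`, the convention that a target of 0 marks a missing LiDAR return, and the toy values are assumptions made for illustration; this is not the benchmarking procedure presented in the tutorial.

```python
import math


def depth_metrics(pred, target):
    """Return AbsRel, RMSE and delta < 1.25 over pixels with valid ground truth.

    `pred` and `target` are flat sequences of depths in metres. A target of 0
    is assumed to mark a pixel with no LiDAR return and is skipped.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)

    # Mean absolute error relative to the ground-truth depth.
    abs_rel = sum(abs(p - t) / t for p, t in pairs) / n
    # Root mean squared error, dominated by errors at large depths.
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)

    def ratio(p, t):
        # Non-positive predictions are treated as failing the threshold.
        return max(p / t, t / p) if p > 0 else math.inf

    # Fraction of valid pixels whose prediction is within a factor of 1.25.
    delta1 = sum(ratio(p, t) < 1.25 for p, t in pairs) / n

    return {
        "abs_rel": abs_rel,
        "rmse": rmse,
        "delta1": delta1,
        # Share of pixels that were actually evaluated.
        "coverage": n / len(target),
    }


# Toy example: four pixels, one of which has no ground truth.
pred = [10.0, 20.0, 90.0, 5.0]
target = [10.0, 0.0, 60.0, 4.0]
metrics = depth_metrics(pred, target)
print(metrics)
```

In this toy input the second pixel is never evaluated, so any error there is invisible to all three numbers. Each number is also a single average over the remaining pixels, so it does not show where in the depth range the errors occur.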

This tutorial will serve as an introduction to the field of monocular 3D reconstruction, discussing both fundamental approaches and recent state-of-the-art methods. The focus will be on various approaches to depth estimation, from the use of graphs to implicitly reason about depth, to more explicit representations. Additionally, a core component of the tutorial will be a novel Monocular Depth Estimation (MDE) benchmarking procedure. This will cover important topics such as training different baselines in a fair and comparable manner, the selection of metrics, and a new evaluation dataset containing a variety of complex urban and natural scenes.


:open_book: Slides


:construction_worker: Organizers

Jaime Spencer
Research Fellow
University of Surrey
Avishkar Saha
PhD Student
University of Surrey
Chris Russell
Senior Applied Scientist
Amazon
Simon Hadfield
Senior Lecturer
University of Surrey
Richard Bowden
Professor
University of Surrey