
:wave: Welcome to the 4th Monocular Depth Estimation Challenge Workshop, organized at CVPR 2025 :wave:


Monocular depth estimation (MDE) is an important low-level vision task with applications in fields such as augmented reality, robotics, and autonomous vehicles. In 2024, the field was dominated by generative approaches, with Depth Anything representing the transformer-based solution and Marigold being a denoising diffusion model built on Stable Diffusion, the popular text-to-image latent diffusion model (LDM). Even before that, interest had been growing in self-supervised systems capable of predicting the 3D scene structure without requiring ground-truth LiDAR training data. The automotive industry accelerated the development of these systems thanks to its vast quantities of data and the ubiquity of stereo camera rigs. However, evaluation has remained focused on in-domain performance, relying on simple metrics and sparse LiDAR data.

This workshop seeks to answer the following questions:

  1. How well do networks generalize beyond their training distribution relative to humans?
  2. What metrics provide the most insight into the model’s performance?
  3. How do the predictions made by the models differ from how humans perceive depth?

The workshop will consist of two parts: invited keynote talks discussing current developments in MDE and a challenge organized around a benchmarking procedure using the SYNS dataset.

:newspaper: News


:microphone: Keynote Speakers

Peter Wonka, Full Professor, KAUST
Yiyi Liao, Assistant Professor, Zhejiang University
Konrad Schindler, Full Professor, ETH Zurich

Peter Wonka is a full professor of computer science at King Abdullah University of Science and Technology (KAUST). He received his doctorate in computer science from the Technical University of Vienna, as well as a Master of Science in Urban Planning from the same institution. After his PhD, Dr. Wonka worked as a postdoctoral researcher at the Georgia Institute of Technology and as faculty at Arizona State University. His research publications tackle various computer vision, computer graphics, and machine learning topics. His current research focuses on deep learning, generative models, and 3D shape analysis and reconstruction.

Yiyi Liao is an assistant professor at Zhejiang University. Prior to that, she received her Ph.D. degree from Zhejiang University and subsequently worked as a Postdoc at MPI for Intelligent Systems. Her research interest lies in 3D computer vision and immersive media, including reconstruction, generation, and compression. She received the Best Robot Vision Paper award at ICRA 2024. She serves as a program chair for 3DV 2025 and an area chair for CVPR and NeurIPS.

Konrad Schindler received the Diplomingenieur (M.Tech.) degree in photogrammetry from the Vienna University of Technology, Vienna, Austria, in 1999 and the Ph.D. degree from the Graz University of Technology, Graz, Austria, in 2003. He was a Photogrammetric Engineer in private industry and held researcher positions at the Computer Graphics and Vision Department, Graz University of Technology, the Digital Perception Laboratory, Monash University, Melbourne, VIC, Australia, and the Computer Vision Laboratory, ETH Zürich, Zürich, Switzerland. He was an Assistant Professor of Image Understanding with TU Darmstadt, Darmstadt, Germany, in 2009. Since 2010, he has been a Tenured Professor of Photogrammetry and Remote Sensing with ETH Zürich. His research interests include computer vision, photogrammetry, and remote sensing, with a focus on image understanding, information extraction, and reconstruction. Dr. Schindler has been serving as an Associate Editor of the Journal of Photogrammetry and Remote Sensing of the International Society for Photogrammetry and Remote Sensing (ISPRS) since 2011, and previously served as an Associate Editor of the Image and Vision Computing Journal from 2011 to 2016. He was the TC President of the ISPRS from 2012 to 2016.


:checkered_flag: Challenge

The challenge focuses on evaluating novel MDE techniques on the SYNS-Patches dataset. This dataset provides a challenging variety of urban and natural scenes, including forests, agricultural settings, residential streets, industrial estates, lecture theatres, offices, and more. Furthermore, the high-quality, dense ground-truth LiDAR allows for the computation of more informative evaluation metrics, such as those focused on depth discontinuities.

[GitHub Starter Pack] — [CodaLab Challenge]


⚡ What’s new in MDEC 2025?

🚀 How to participate?

  1. Check out the new starter pack on GitHub. The mdec_2025 folder contains scripts that generate valid submissions for Marigold (affine-invariant) and Depth Anything v2 (disparity).
  2. Identify the prediction type of your method and generate a valid submission: val split for the “Development” phase and test split for the “Final” phase.
  3. Register at the CodaLab Challenge site, check the submission constraints and extra conditions, and submit to the leaderboard.

The phases are open according to the following schedule:

📊 Evaluation

Submissions will be evaluated on a variety of metrics:

The leading metric is the F-Score (computed on the point cloud), denoted as F (↑) in the leaderboard. Challenge winners will be determined by their ranking on this metric on the withheld validation (“Development” phase) and test (“Final” phase) sets of the SYNS-Patches dataset.

To measure the performance locally with other datasets or troubleshoot scoring issues within the challenge, refer to the evaluation code.
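As a rough illustration of what a point-cloud F-Score measures, the sketch below computes precision (the fraction of predicted points lying within a distance threshold of some ground-truth point), recall (the same in the opposite direction), and their harmonic mean. The function name `point_cloud_f_score`, the brute-force nearest-neighbour search, and the `threshold` parameter are assumptions made for illustration only; the challenge's evaluation code remains the reference for the exact definition and threshold.

```python
import numpy as np

def point_cloud_f_score(pred_pts, gt_pts, threshold):
    """Hypothetical helper: F-score between two (N, 3) and (M, 3) point sets."""
    pred_pts = np.asarray(pred_pts, dtype=np.float64)
    gt_pts = np.asarray(gt_pts, dtype=np.float64)

    # Pairwise Euclidean distances, shape (N, M). Brute force, so only
    # suitable for small clouds; a KD-tree would be used at full scale.
    dists = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=-1)

    # Precision: predicted points that have a ground-truth point nearby.
    precision = np.mean(dists.min(axis=1) < threshold)
    # Recall: ground-truth points that have a predicted point nearby.
    recall = np.mean(dists.min(axis=0) < threshold)

    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

With identical clouds the score is 1.0; moving predicted points far from the ground truth lowers both precision and recall.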

📈 Baselines

This year, we switched to least-squares (LSE-based) alignment between predictions and ground-truth maps so that the benchmark can accept various types of predictions. In addition to the previously accepted disparity predictions, we now welcome affine-invariant, scale-invariant, and metric predictions.
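To make the idea of least-squares alignment concrete, here is a minimal sketch that fits a scale and a shift mapping a prediction onto the ground truth over valid pixels. The function name `align_least_squares`, the default validity mask, and the choice to align directly in the space of the inputs are assumptions; the official evaluation code may align in a different space (for example, inverse depth for disparity predictions) or restrict the fit differently.

```python
import numpy as np

def align_least_squares(pred, target, mask=None):
    """Hypothetical helper: fit s, t minimising ||s * pred + t - target||^2."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if mask is None:
        # Assumed convention: invalid ground truth is non-finite or <= 0.
        mask = np.isfinite(target) & (target > 0)

    x = pred[mask]
    y = target[mask]

    # Design matrix [x, 1] so that A @ [s, t] approximates y.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, y, rcond=None)

    # Apply the fitted transform to the full prediction map.
    return scale * pred + shift, scale, shift
```

For example, a prediction equal to `0.5 * gt - 1.0` is mapped back onto `gt` with a fitted scale and shift of approximately 2. A scale-invariant method could use the same fit with the shift fixed to zero.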

Accordingly, we updated the benchmark with more recent baselines, such as Marigold (affine-invariant), Depth Anything v2 (disparity), and the winners of the 3rd edition of the MDEC challenge, whose performances are reported below.

| Method | F (↑) | F (↑) (Edges) | MAE (↓) | RMSE (↓) | AbsRel (↓) | Acc (↑) (Edges) | Comp (↓) (Edges) | δ<1.25 (↑) | δ<1.25^2 (↑) | δ<1.25^3 (↑) |
|---|---|---|---|---|---|---|---|---|---|---|
| PICO-MR | 21.07 | 8.77 | 3.22 | 5.60 | 20.33 | 3.69 | 15.41 | 0.7559 | 0.9125 | 0.9590 |
| EVP++ | 19.66 | 9.02 | 3.20 | 5.49 | 19.03 | 2.66 | 9.28 | 0.7553 | 0.9182 | 0.9661 |
| Marigold | 18.64 | 9.26 | 3.87 | 6.49 | 24.37 | 2.90 | 20.09 | 0.6903 | 0.8860 | 0.9453 |
| Depth Anything v2 | 14.34 | 7.94 | 4.16 | 7.94 | 25.48 | 2.64 | 30.05 | 0.6907 | 0.8849 | 0.9469 |
| Garg’s Baseline | 11.38 | 6.03 | 4.62 | 7.58 | 31.15 | 4.01 | 41.24 | 0.5842 | 0.8354 | 0.9251 |
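For readers less familiar with the image-based columns, the following sketch shows the textbook definitions of MAE, RMSE, AbsRel, and the δ threshold accuracies on an already aligned prediction. The function name `depth_metrics`, the dictionary keys, and the validity mask are assumptions; the leaderboard may apply different scaling (for instance, percentages) or depth caps, so the numbers are not guaranteed to match the evaluation code.

```python
import numpy as np

def depth_metrics(pred, target, mask=None):
    """Hypothetical helper: standard depth metrics over valid pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if mask is None:
        # Assumed convention: invalid ground truth is non-finite or <= 0.
        mask = np.isfinite(target) & (target > 0)

    p = pred[mask]    # assumed strictly positive after alignment
    g = target[mask]
    err = p - g

    # Symmetric ratio used by the delta threshold accuracies.
    ratio = np.maximum(p / g, g / p)

    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "AbsRel": float(np.mean(np.abs(err) / g)),
        "delta1": float(np.mean(ratio < 1.25)),
        "delta2": float(np.mean(ratio < 1.25 ** 2)),
        "delta3": float(np.mean(ratio < 1.25 ** 3)),
    }
```

The edge-based columns (F, Acc, and Comp on edges) depend on depth discontinuity extraction and are not covered by this sketch.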

📚 Workshop proceedings

As part of the CVPR Workshop Proceedings, we will publish a paper summarizing the results of the challenge. The following conditions must be met to have the method included in the paper:

Once the challenge has finished, we will reach out to the participants meeting the criteria above to request information about their affiliation, a short description of their method, and the method’s source code. Participants not providing this information will not be added to the publication; their submission will stay anonymous in the leaderboard.

Selected top performers will also be invited to present their methods at the workshop. The presentation can be held either in person or virtually. Presenting is mandatory for those invited; refusal will result in an invalidated submission and removal from the paper.


🤵 Organizers

Anton Obukhov, Principal Research Scientist, Huawei Research Center Zürich
Ripudaman Singh Arora, Principal ML Researcher, Blue River Technology
Jaime Spencer, Data Engineer, Oxa
Fabio Tosi, Junior Assistant Professor, University of Bologna
Matteo Poggi, Tenure-Track Assistant Professor, University of Bologna
Chris Russell, Associate Professor, Oxford Internet Institute
Simon Hadfield, Associate Professor, University of Surrey
Richard Bowden, Professor, University of Surrey