Mehdi S. M. Sajjadi
Research Scientist, Tech Lead & Manager, Google DeepMind
Email: contact@msajjadi.com | Bluesky: @msajjadi.com | GitHub: msmsajjadi
I am a Tech Lead at Google DeepMind, where my work focuses on 3D scene understanding and deep generative models. With over a decade of experience in the field, I am passionate about building systems that can perceive and generate the world in 3D. Previously, I completed my PhD at the Max Planck Institute for Intelligent Systems with Bernhard Schölkopf, supported by a fellowship from ETH Zürich CLS. I earned my MSc with distinction in Computer Science & Math from the University of Hamburg.
Publications (Google Scholar)
- 🔥 Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
- 🔥 From Image to Video: An Empirical Study of Diffusion Representations (ICCV 2025 Highlight, top 2.3%)
- LayerLock: Non-collapsing Representation Learning with Progressive Freezing (ICCV 2025)
- TAPNext: End-to-End Tracking Any Point as Next Token Prediction (ICCV 2025)
- Scaling 4D Representations
- TRecViT: A Recurrent Video Transformer (TMLR 2026)
- Direct Motion Models for Assessing Generated Videos (ICML 2025)
- Moving Off-the-Grid: Scene-Grounded Video Representations (NeurIPS 2024 Spotlight, top 2%)
- DyST: Towards Dynamic Neural Scene Representations on Real-World Videos (ICLR 2024 Spotlight, top 5%)
- 🔥 RUST: Latent Neural Scene Representations from Unposed Imagery (CVPR 2023 Highlight, top 2.5%)
- 🔥 PaLM-E: An Embodied Multimodal Language Model (ICML 2023)
- DORSal: Diffusion for Object-centric Representations of Scenes et al. (ICLR 2024)
- Sensitivity of Slot-Based Object-Centric Models to their Number of Slots
- RePAST: Relative Pose Attention Scene Representation Transformer (CVPR 2023 Workshops 3DMV & T4V Spotlight)
- Test-time adaptation with slot-centric models (ICML 2023, NeurIPS 2022 Workshops MetaLearn & DistShift)
- Spatial Symmetry in Slot Attention (ICML 2023, NeurIPS 2022 Workshop NeurReps)
- Object Scene Representation Transformer (NeurIPS 2022)
- 🔥 Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations (CVPR 2022)
- RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs (CVPR 2022 Oral)
- Kubric (CVPR 2022)
- NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes (TMLR 2022)
- NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections (CVPR 2021 Oral)
- From Variational to Deterministic Autoencoders (ICLR 2020)
- 🔥 Assessing Generative Models via Precision and Recall (NeurIPS 2018 & ICML 2018 workshop TADGM Oral & Best Poster Award at Bosch AI Con 2018)
- Photorealistic Video Super-Resolution with Enhanced Temporal Consistency (ECCV 2018 workshop PIRM)
- Spatio-temporal Transformer Network for Video Restoration (ECCV 2018)
- Tempered Adversarial Networks (ICML 2018 Oral & ICLR 2018 workshop)
- Frame-Recurrent Video Super-Resolution (CVPR 2018)
- 🔥 EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis (ICCV 2017 Oral, top 2%)
  - Project website
  - GitHub
  - Covered in the press
- Depth Estimation Through a Generative Model of Light Field Synthesis (GCPR 2016)
- Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines (L@S 2016 Oral & ICML 2015 workshops CrowdML & ML4Ed)