Stefano Esposito (@s-esposito)

🚀 New paper: ConeGS Error-Guided Densification Using Pixel Cones. We improve 3D Gaussian Splatting by placing Gaussians where they matter most: ConeGS adds primitives along pixel-view cones guided by image error, boosting quality with fewer Gaussians. baranowskibrt.github.io/conegs/

12.11.2025 10:50 👍 14 🔁 3 💬 0 📌 1

ConeGS: Error-Guided Densification Using Pixel Cones for Improved Reconstruction with Fewer Primitives

Bartłomiej Baranowski, @s-esposito.bsky.social, @pgschossmann.bsky.social, @apchen.bsky.social, @andreasgeiger.bsky.social

arxiv.org/abs/2511.06810

11.11.2025 15:42 👍 5 🔁 1 💬 1 📌 0

Picture Perspective and Our Eyes YouTube video by Aaron Hertzmann

Here's a recording of my talk on how perspective works! If you're interested in learning about how picture perspective works in human vision, this is the video to watch. #visionscience
www.youtube.com/watch?v=eamc...

29.09.2025 17:17 👍 19 🔁 6 💬 2 📌 1

𝟯𝗗-𝗟𝗔𝗧𝗧𝗘: 𝗟𝗮𝘁𝗲𝗻𝘁 𝗦𝗽𝗮𝗰𝗲 𝟯𝗗 𝗘𝗱𝗶𝘁𝗶𝗻𝗴 𝗳𝗿𝗼𝗺 𝗧𝗲𝘅𝘁𝘂𝗮𝗹 𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀
Maria Parelli, Michael Oechsle, Michael Niemeyer ... Andreas Geiger
arxiv.org/abs/2509.00269
Trending on www.scholar-inbox.com

04.09.2025 06:00 👍 3 🔁 3 💬 0 📌 0

ELLIS PhD Program: Call for Applications 2025 The ELLIS mission is to create a diverse European network that promotes research excellence and advances breakthroughs in AI, as well as a pan-European PhD program to educate the next generation of AI...

ellis.eu/news/ellis-p...

03.09.2025 05:29 👍 17 🔁 9 💬 0 📌 0

🚀 Introducing our new paper, MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models.

📄 Paper: www.scholar-inbox.com/papers/He202...
arxiv.org/pdf/2508.13148
💻 Code: github.com/autonomousvi...
🌐 Project Page: cli212.github.io/MDPO/

20.08.2025 09:30 👍 12 🔁 8 💬 1 📌 2

Today, we moved into our new building on the CyberValley campus. Everyone is super excited. PhD students went right back to work. But wait, is there something missing? ;)

21.07.2025 19:07 👍 35 🔁 1 💬 1 📌 0

Today we had our AVG Deep Cave Expedition Day! Exploring the challenges of the (unlit, narrow, crawling-only) Hofener Höhle near Grabenstetten ..

18.07.2025 15:48 👍 19 🔁 1 💬 0 📌 0

SpatialTrackerV2: 3D Point Tracking Made Easy

Yuxi Xiao, @jianyuanwang.bsky.social, Nan Xue, @nikkar.bsky.social, Yuri Makarov, Bingyi Kang, Xing Zhu, Hujun Bao, Yujun Shen, Xiaowei Zhou

tl;dr: DAv2+VGGT->depths & poses->iterative cross-attention-based optimizer

arxiv.org/abs/2507.12462

17.07.2025 09:42 👍 2 🔁 2 💬 0 📌 0

CaRL: Learning Scalable Planning Policies with Simple Rewards YouTube video by Daniel Dauner

In case you find it as relaxing as we do: Here is a 2h+ video of our autonomous RL driving agent CaRL in action! @danieldauner.bsky.social @bernhard-jaeger.bsky.social @kashyap7x.bsky.social
youtube.com/watch?v=_god...

15.07.2025 06:17 👍 21 🔁 5 💬 0 📌 1

At #ICML, you can just use scholar inbox to help you find your way through the poster sessions. It just sorts the papers according to your preferences and it really works.

www.scholar-inbox.com/conference/i... ICML 2025 Planner

15.07.2025 04:41 👍 50 🔁 5 💬 3 📌 3

I am in Vancouver at ICML, and tomorrow I will present our newest paper "Partially Observable Reinforcement Learning with Memory Traces". We argue that eligibility traces are more effective than sliding windows as a memory mechanism for RL in POMDPs. 🧵

16.07.2025 01:35 👍 59 🔁 12 💬 3 📌 3

GitHub - autonomousvision/CaRL: [ArXiv 2025] CaRL: Learning Scalable Planning Policies with Simple Rewards [ArXiv 2025] CaRL: Learning Scalable Planning Policies with Simple Rewards - autonomousvision/CaRL

We have released the code for our work, CaRL: Learning Scalable Planning Policies with Simple Rewards.

The repository contains the first public code base for training RL agents with the CARLA leaderboard 2.0 and nuPlan.

github.com/autonomousvi...

15.07.2025 16:05 👍 20 🔁 7 💬 0 📌 2

Scaling 4D Representations

Scaling 4D Representations

Self-supervised learning from video does scale! In our latest work, we scaled masked auto-encoding models to 22B params, boosting performance on pose estimation, tracking & more.

Paper: arxiv.org/abs/2412.15212
Code & models: github.com/google-deepmind/representations4d

10.07.2025 11:52 👍 20 🔁 8 💬 0 📌 0

𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆-𝗮𝘄𝗮𝗿𝗲 𝟰𝗗 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗳𝗼𝗿 𝗥𝗼𝗯𝗼𝘁 𝗠𝗮𝗻𝗶𝗽𝘂𝗹𝗮𝘁𝗶𝗼𝗻
Zeyi Liu, Shuang Li, Eric Cousineau ... Shuran Song
arxiv.org/abs/2507.01099
Trending on www.scholar-inbox.com

04.07.2025 06:00 👍 4 🔁 4 💬 0 📌 0

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details

Ruicheng Wang, Sicheng Xu, Yue Dong, Yu Deng, Jianfeng Xiang, Zelong Lv, Guangzhong Sun, Xin Tong, Jiaolong Yang

arxiv.org/abs/2507.02546

04.07.2025 09:44 👍 5 🔁 4 💬 1 📌 0

I am very proud of my group! These are the nationalities of my current and past team members. Diversity is key.
🇩🇪 🇬🇷 🇮🇹 🇮🇳 🇷🇺 🇺🇦 🇨🇳 🇷🇸 🇯🇵 🇧🇪 🇺🇸 🇰🇷 🇹🇷

03.07.2025 10:30 👍 49 🔁 1 💬 1 📌 0

That’s a wrap on #CVPR2025 in Nashville! From online convos to in-person vibes, one thing’s clear: this community is STRONG 💪 Thanks for following along!

Until next time. @deblinaml.bsky.social, @jbhaurum.bsky.social, @csprofkgd.bsky.social signing off.

15.06.2025 22:36 👍 12 🔁 3 💬 0 📌 0

LLM product placement and search optimization is here and it's as dystopian as you expected.

14.06.2025 23:19 👍 37 🔁 10 💬 3 📌 1

Hey #CVPR2025! Curious about this work? I'll be presenting it this morning! Poster 31, from 10:30 to 12:30 🤠

@cvprconference.bsky.social

15.06.2025 14:24 👍 8 🔁 1 💬 0 📌 0

Check out the ScanNet++ workshop @CVPR on June 12 in 211 from 8:50am!

Exciting keynotes on state-of-the-art NVS & 3D understanding from Andrea Vedaldi, Cordelia Schmid, Gordon Wetzstein, Katja Schwarz, Qianqian Wang, and leading methods on the benchmark!

kaldir.vc.in.tum.de/scannetpp/cv...

08.06.2025 13:20 👍 13 🔁 6 💬 0 📌 0

Join us for the 4D Vision Workshop #CVPR on June 11 starting at 9:20am!

We'll have an incredible lineup of speakers discussing the frontier of 3D computer vision techniques for dynamic world modeling across spatial AI, robotics, astrophysics, and more.

4dvisionworkshop.github.io

09.06.2025 08:02 👍 9 🔁 2 💬 0 📌 2

This Wednesday (1-6PM, Room 106A) at CVPR @cvprconference.bsky.social we have a great lineup of keynote speakers, posters, and spotlights on neural fields and beyond: neural-bcc.github.io

Have a question you want answered by a panel of experts in the field? Send it to us via: tinyurl.com/bdddf36f

09.06.2025 20:34 👍 11 🔁 2 💬 0 📌 1

Excited to present our #CVPR2025 paper DepthSplat next week!
DepthSplat is a feed-forward model that achieves high-quality Gaussian reconstruction and view synthesis in just 0.6 seconds.
Looking forward to great conversations at the conference!

05.06.2025 12:09 👍 27 🔁 7 💬 3 📌 0

🚗 Pseudo-simulation combines the efficiency of open-loop and robustness of closed-loop evaluation. It uses real data + 3D Gaussian Splatting synthetic views to assess error recovery, achieving strong correlation with closed-loop simulations while requiring 6x less compute. arxiv.org/abs/2506.04218

05.06.2025 04:21 👍 22 🔁 10 💬 0 📌 1

SpAItial AI: Building Spatial Foundation Models YouTube video by SpAItial AI

🚀🚀🚀Announcing our $13M funding round to build the next generation of AI: 𝐒𝐩𝐚𝐭𝐢𝐚𝐥 𝐅𝐨𝐮𝐧𝐝𝐚𝐭𝐢𝐨𝐧 𝐌𝐨𝐝𝐞𝐥𝐬 that can generate entire 3D environments anchored in space & time. 🚀🚀🚀

Interested? Join our world-class team:
🌍 spaitial.ai

youtu.be/FiGX82RUz8U

27.05.2025 09:26 👍 52 🔁 9 💬 4 📌 0

"ILM "artists" are now being paid to make shimpanzini bananini and bombardiro crocodilo"

16.05.2025 11:20 👍 1 🔁 0 💬 0 📌 0

Can you train a model for pose estimation directly on casual videos without supervision?

Turns out you can!

In our #CVPR2025 paper AnyCam, we directly train on YouTube videos and achieve SOTA results by using an uncertainty-based flow loss and monocular priors!

⬇️

13.05.2025 08:11 👍 25 🔁 10 💬 1 📌 1

New Paper: Continuous Thought Machines

pub.sakana.ai/ctm/

Neurons in brains use timing and synchronization in the way that they compute, but this is largely ignored in modern neural nets. We believe neural timing is key for the flexibility and adaptability of biological intelligence.

Thread ↓

12.05.2025 02:38 👍 129 🔁 29 💬 5 📌 1

📣 Excited to share our #CVPR2025 Spotlight paper and my internship project @wayve: SimLingo.
A Vision-Language-Action (VLA) model that achieves state-of-the-art driving performance with language capabilities.

Code: github.com/RenzKa/simli...
Paper: arxiv.org/abs/2503.09594

08.05.2025 15:25 👍 25 🔁 9 💬 1 📌 0

Stefano Esposito

Latest posts by Stefano Esposito @s-esposito