Top 10 Robotics Papers of 2024 - Community Survey (Open nomination)
We are gathering the Top 10 Robotics Papers for 2024 in the following areas:
Navigation
Manipulation
Whole-body (Humanoids/Locomotion)
Foundation Models for Robotics
Systems
Benchmarks & Simulation
Pl...
Open Call for Nominations: Top 10 Robotics Papers of 2024
Categories:
1️⃣ Navigation
2️⃣ Manipulation
3️⃣ Whole-Body Motion (Humanoids/Locomotion) 🚶‍♂️
4️⃣ Foundation Models for Robotics
5️⃣ Robotic Systems
6️⃣ Benchmarks & Simulation 🧪
docs.google.com/forms/d/e/1F...
18.12.2024 00:54
10/🧵 Curious for more? Check out our paper for the full breakdown: "SAT: Spatial Aptitude Training for Multimodal Language Models" by @ARRay693 @ehsanik @anikembhavi @rosemhendrix @RanjayKrishna @KuoHaoZeng @kate_saenko_ @drbashkirova et al.
11.12.2024 16:12
9/🧵 The takeaway: Dynamic spatial QAs improve static QA performance too!
Mixing static & dynamic training data results in significant accuracy gains across all tasks.
11.12.2024 16:12
8/🧵 Challenges MLMs face:
Even strong models perform near-randomly on SAT's dynamic tasks.
Egocentric movement and multiview reasoning remain tough nuts to crack.
11.12.2024 16:12
7/🧵 SAT enables five complex spatial tasks:
Egocentric Movement
Object Movement
Allocentric Perspective
Goal Aiming
Action Consequence
Each task tests unique dimensions of spatial cognition.
11.12.2024 16:12
6/🧵 How does SAT generate data?
Uses ProcTHOR for 3D scenes.
Procedurally generates static & dynamic QAs.
Scalable, cost-effective, & adaptable for new tasks.
11.12.2024 16:12
5/🧵 Here's the kicker: Fine-tuning on SAT makes the open-source LLaVA-13B model match or surpass proprietary giants like GPT-4V in spatial reasoning!
11.12.2024 16:12
4/🧵 Results? SAT not only improves performance on its own dataset but also boosts zero-shot spatial reasoning:
+23% on CVBench
+9% on BLINK (a harder benchmark)
+18% on the Visual Spatial Relations (VSR) dataset.
11.12.2024 16:12
3/🧵 Example tasks SAT tackles:
Static: Is object X to the left of object Y?
Dynamic: How did the camera move between frames? Did the object get closer or further?
Perspective: What does object placement look like from point X?
11.12.2024 16:12
2/🧵 SAT introduces 218K question-answer pairs for 22K synthetic scenes created using a photorealistic physics engine. It goes beyond static benchmarks to tackle dynamic reasoning tasks like egocentric actions, object movement, & perspective-taking.
11.12.2024 16:12
1/🧵 Why does spatial reasoning matter? Cognitive science shows spatial reasoning is foundational to intelligence, impacting geometry, physics, and physical world reasoning. Yet, MLMs struggle with it, especially in dynamic real-world scenarios. Enter SAT!
11.12.2024 16:12
Excited to introduce our latest work, SAT: Spatial Aptitude Training, a groundbreaking approach to enhancing spatial reasoning in Multimodal Language Models (MLMs). SAT isn't just about understanding static object positions; it dives deep into dynamic spatial reasoning. 🧵
11.12.2024 16:12
A scene from ManiSkill.
Prompt: Move the mobile robot to the table and place the red bowl onto the table.
10.12.2024 18:51
I think text-2-video is not that bad; at least we see some good robot motion for humanoids, probably because the models were trained on a lot of human video. What is not good is image-2-video generation.
10.12.2024 18:50
I am impressed by Sora and see potential for using it in robotics.
10.12.2024 05:40
We’ve been investigating how sim, while wrong, can be useful for real-world robotic RL! In our #NeurIPS2024 work, we theoretically showed how naive sim2real transfer can be inefficient, but if you *learn to explore* in sim, this transfers to the real world! We show this works on real robots! 🧵 (1/6)
06.12.2024 00:46
Everyone I see who moved from X to here seems to have gained back the same number of followers, except me 🤣
04.12.2024 10:25
Is there a robotics starter pack?
28.11.2024 04:45
“Hello, World!”
27.11.2024 08:11