Regularized self-play RL in grounded simulation effectively adapts driving policies to completely new cities.
Really enjoyed collaborating on this work, led by Zilin and Saeed! Check out Zilin's post below for a great summary
Thread: x.com/nirhso/statu...
Paper: arxiv.org/abs/2602.15891
20.02.2026 20:09
The most important finding from this analysis! See the post for more details
08.02.2026 20:20
PufferDrive 2.0 release - PufferDrive
High-throughput autonomous driving simulator built on PufferLib.
Several fast evals are included, too! Check out our release post:
emerge-lab.github.io/PufferDrive/...
Work done with Spencer Cheng* (co-first), Pragnay Mandavilli, Julian Hunt, Kevin Joseph, Waël Doulazmi, Valentin Charraut, Aditya Gupta, Joseph Suarez, and
@eugenevinitsky.bsky.social
30.12.2025 16:12
PufferDrive 2.0 release
YouTube video by Daphne Cornelisse
What if you could train agents on a decade of driving experience in under an hour, on a single GPU?
Excited to share PufferDrive 2.0: a fast, friendly driving simulator with RL training via PufferLib at 300K steps/sec.
youtu.be/LfQ324R-cbE?...
30.12.2025 16:12
Estimating cognitive biases with attention-aware inverse planning
People's goal-directed behaviors are influenced by their cognitive biases, and autonomous systems that interact with people should be aware of this. For example, people's attention to objects in their...
Excited to share a new preprint, accepted as a spotlight at #NeurIPS2025!
Humans are imperfect decision-makers, and autonomous systems should understand how we deviate from idealized rationality
Our paper aims to address this!
arxiv.org/abs/2510.25951
a 🧵⤵️
13.11.2025 13:20
How to catch subtle RL bugs before they catch you
Tools and habits for reliable, fast RL experimentation and development
Rapid RL experimentation is great. But how do you catch silent errors before they slip by?
In this post, I share tools and habits that help me move quickly from idea to result without sacrificing reliability.
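One such habit can be sketched as a tiny invariant check run on every training batch. This is a minimal illustration, not code from the post: the function name `sanity_check_batch` and the specific thresholds are hypothetical.

```python
import numpy as np

def sanity_check_batch(obs, rewards, advantages):
    """Cheap invariant checks to run on every training batch.

    These catch silent RL bugs (non-finite observations, a dead
    advantage signal, wildly scaled rewards) before they waste a run.
    """
    assert np.isfinite(obs).all(), "non-finite values in observations"
    assert np.isfinite(rewards).all(), "non-finite rewards"
    # A (near-)constant advantage means the policy gradient is ~zero,
    # often a symptom of broken value targets or reward wiring.
    assert advantages.std() > 1e-6, "advantages are (near-)constant"
    # Rewards far outside a sane range usually indicate a scaling bug.
    assert np.abs(rewards).max() < 1e4, "suspiciously large rewards"

# Example: a healthy batch passes silently.
rng = np.random.default_rng(0)
sanity_check_batch(
    obs=rng.normal(size=(64, 8)),
    rewards=rng.normal(size=64),
    advantages=rng.normal(size=64),
)
```

Checks like these cost microseconds per batch, so they can stay enabled even in long production runs.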
13.10.2025 11:29
The single biggest epistemic challenge in the internet era is remaining calibrated about what "normal" people think while the internet throws up an infinite wall of crazy. Thousands of people sharing an absurd opinion on the internet tells you very little!
08.09.2025 18:43
Overnight runs are the overnight oats of research โ prep, forget, and rewarding by morning
19.04.2025 00:44
Self-play for Self-driving and where Scaling Reinforcement Learning is Heading with Eugene Vinitsky
YouTube video by Interconnects AI
Building a "human-level" simulated driver that zero-shot generalizes to many benchmarks: a fun interview with @natolambert.bsky.social
www.youtube.com/watch?v=2Q66...
12.03.2025 19:19
This was joint work with Aarav Pandya, Kevin Joseph, Joseph Suárez, and @eugenevinitsky.bsky.social
28.02.2025 17:19
Results (2): Beyond in-distribution generalization, our agents show partial robustness to scenarios that rarely occur in the data.
More importantly, results show that agents can be fine-tuned in minutes to reach near-perfect performance in such cases.
28.02.2025 17:19
Results (1): Self-play scales well with data. With 10,000 training scenarios, the model nearly reaches the ceiling of our benchmark: a 99.81% goal-reaching rate, 0.44% collision rate, and 0.31% off-road rate on 10,000 held-out test scenarios.
28.02.2025 17:19
We train sim agents using self-play PPO on 10K+ scenarios from the Waymo Open Dataset in GPUDrive, under a semi-realistic framework for human perception and control.
Agents learn goal-directed behavior, avoiding collisions and staying on the road.
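As a rough illustration of the self-play setup, every agent in a scene is driven by copies of one shared policy, so the joint experience trains a single set of weights. This toy sketch stands in for (and does not reproduce) the actual GPUDrive/PPO code; `shared_policy` and `self_play_rollout` are hypothetical names operating on random stand-in observations.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_policy(obs, weights):
    """One policy, applied independently to every agent's observation."""
    logits = obs @ weights
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return probs / probs.sum(axis=-1, keepdims=True)

def self_play_rollout(n_agents=8, obs_dim=4, n_actions=3, horizon=10):
    """Collect one joint rollout where all agents share the same weights.

    In self-play, each agent's "other drivers" are controlled by the
    current policy, so one policy update improves the whole scene.
    """
    weights = rng.normal(size=(obs_dim, n_actions)) * 0.1
    trajectory = []
    for _ in range(horizon):
        obs = rng.normal(size=(n_agents, obs_dim))  # toy observations
        probs = shared_policy(obs, weights)
        actions = np.array([rng.choice(n_actions, p=p) for p in probs])
        trajectory.append((obs, actions))
    return trajectory  # joint experience fed to the PPO update

traj = self_play_rollout()
```

The key design point is that `weights` appears once: there is no separate opponent model to fit, which is what lets the approach scale with the number of training scenarios.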
28.02.2025 17:19
SOTA generative models trained on large human datasets show unintended behaviors like crashes (5-6%) and off-road events (6-12%) in benchmarks for nominal driving.
Unpredictable deviations make it hard to separate signal from noise.
28.02.2025 17:19
Sim agents are key for developing safety-critical autonomous systems, like self-driving cars.
We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Dataset. These agents are built through scaling self-play.
28.02.2025 17:19
Challenge accepted
26.02.2025 18:13
Oh, and stay tuned for another big release tomorrow!
20.02.2025 18:53
Huge thanks to my incredible collaborators for making this possible: Saman Kazemkhani, Aarav Pandya, @eugenevinitsky.bsky.social, Joseph Suarez for converting the sim to a package and optimizing the PPO loop, and Kevin Joseph for all his help with data processing, tutorials, and more!
20.02.2025 18:53
GPUDrive got accepted to ICLR 2025!
With that, we release GPUDrive v0.4.0! 🚨 You can now install the repo and run your first fast PPO experiment in under 10 minutes.
I'm honestly so excited about the new opportunities and research the sim makes possible. 1/2
20.02.2025 18:53
A large group of us (spearheaded by Denizalp Goktas) have put out a position paper on paths towards foundation models for strategic decision-making. Language models still lack these capabilities so we'll need to build them: hal.science/hal-04925309...
18.02.2025 18:33