Miguel Suau (@miguelsuau)

Excited to be attending EWRL again this year! I'll be giving a talk on Thursday (Sep 18) about my work on policy confounding.

13.09.2025 09:07 👍 1 🔁 0 💬 0 📌 0

📣 Early bird registration ends today!
Register and join us in Tübingen for EWRL 2025: site.pheedloop.com/event/EWRL/h...

03.09.2025 07:44 👍 2 🔁 3 💬 0 📌 0

It achieves this by reweighting samples according to the likelihood of state-action pairs under the agent’s state representation, effectively breaking the spurious correlations introduced by the policy.

18.06.2025 19:55 👍 0 🔁 0 💬 0 📌 0

Here, we show that the advantage function not only reduces the variance of gradient estimates but also helps mitigate the effects of policy confounding.

18.06.2025 19:55 👍 0 🔁 0 💬 1 📌 0

This paper builds on our work published last year at RLC, where we showed that agents can learn behaviors that exploit spurious correlations induced by their own policies, a phenomenon we call policy confounding.

18.06.2025 19:55 👍 0 🔁 0 💬 1 📌 0

Excited to share my new preprint, 'Breaking Habits: On the Role of the Advantage Function in Learning Causal State Representations,' which I presented last week at @rldmdublin2025.bsky.social.
Link: arxiv.org/abs/2506.11912

18.06.2025 19:55 👍 5 🔁 0 💬 1 📌 0

AI Research Scientist (Sequential Decision Making) Remote

Phaidra is hiring a Research Scientist to work on sequential decision-making problems. I'm at the RLDM conference in Dublin this week. If you're attending and would like to learn more about the role or the company, feel free to reach out!
job-boards.greenhouse.io/phaidra/jobs...

11.06.2025 10:40 👍 4 🔁 0 💬 1 📌 0

AI benchmarking culture is completely out of control. Tables with dozens of methods, datasets, and bold numbers, trying to answer a question that perhaps no one should be asking anymore.

30.05.2025 21:55 👍 18 🔁 5 💬 1 📌 1

🚨🚨 RLC deadline has been extended by a week! Abstract deadline is Feb. 21 with a paper deadline of Feb. 28 🚨🚨. Please spread the word!

08.02.2025 18:05 👍 25 🔁 13 💬 1 📌 2

We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data.
It is SOTA on every planning benchmark we tried.
In self-play, it goes 20 years between collisions.

06.02.2025 18:34 👍 298 🔁 55 💬 22 📌 8

RLC 2025 is looking for reviewers and reviewer nominations, for folks looking to innovate on the RL reviewing process. If you know someone qualified, please nominate them (but read the docs below): forms.gle/3yCeBjn4Yhi7...
And please help us spread the word!

31.01.2025 17:08 👍 5 🔁 3 💬 1 📌 4

🤔🤷
x.com/SuauMiguel/s...

25.01.2025 16:26 👍 0 🔁 0 💬 0 📌 0

A First Introduction to Cooperative Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized ...

I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.

arxiv.org/abs/2405.06161

07.01.2025 16:25 👍 78 🔁 19 💬 3 📌 3

If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!

10.12.2024 21:55 👍 63 🔁 18 💬 2 📌 4

Hello, Bluesky! The entire Phaidra research team is excited to be attending #NeurIPS2024 this year. I arrived in Canada early to enjoy a few days of skiing in Whistler before the conference kicks off. If you’re attending and would like to connect, feel free to drop me a message!

08.12.2024 23:41 👍 5 🔁 0 💬 0 📌 0
