Excited to be attending EWRL again this year! I'll be giving a talk on Thursday (Sep 8) about my work on policy confounding
📣 Early bird registration ends today!
Register and join us in Tübingen for EWRL 2025: site.pheedloop.com/event/EWRL/h...
It achieves this by reweighting samples according to the likelihood of state-action pairs under the agent's state representation, effectively breaking the spurious correlations introduced by the policy.
Here, we show that the advantage function not only reduces the variance of gradient estimates but also helps mitigate the effects of policy confounding.
This paper builds on our work published last year at RLC, where we showed that agents can develop policies that exploit spurious correlations induced by their own policies, a phenomenon we call policy confounding.
Excited to share my new preprint, 'Breaking Habits: On the Role of the Advantage Function in Learning Causal State Representations,' which I presented last week at @rldmdublin2025.bsky.social.
Link: arxiv.org/abs/2506.11912
Phaidra is hiring a Research Scientist to work on sequential decision-making problems. I'm at the RLDM conference in Dublin this week. If you're attending and would like to learn more about the role or the company, feel free to reach out!
job-boards.greenhouse.io/phaidra/jobs...
AI benchmarking culture is completely out of control. Tables with dozens of methods, datasets, and bold numbers, trying to answer a question that perhaps no one should be asking anymore.
🚨🚨 RLC deadline has been extended by a week! Abstract deadline is Feb. 21 with a paper deadline of Feb. 28 🚨🚨. Please spread the word!
We've built a simulated driving agent that we trained on 1.6 billion km of driving with no human data.
It is SOTA on every planning benchmark we tried.
In self-play, it goes 20 years between collisions.
RLC 2025 is looking for reviewers and reviewer nominations, for folks looking to innovate on the RL reviewing process. If you know someone qualified, please nominate them (but read the docs below): forms.gle/3yCeBjn4Yhi7...
And please help us spread the word!
I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.
arxiv.org/abs/2405.06161
If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
Hello, Bluesky! The entire Phaidra research team is excited to be attending #NeurIPS2024 this year. I arrived in Canada early to enjoy a few days of skiing in Whistler before the conference kicks off. If you're attending and would like to connect, feel free to drop me a message!