openreview.net/forum?id=5bg...
Joint work of Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh and @sarath-chandar.bsky.social @mila-quebec.bsky.social .
New work, just accepted @ICLR: "The Expressive Limits of Diagonal SSMs for State-Tracking"
We give a complete characterization of what diagonal SSMs can and cannot compute on state-tracking tasks, and the answer is deeply connected to group theory.
🧵👇
NeoBERT: A Next-Generation BERT (TMLR Journal-to-Conference Track)
We modernized BERT (RoPE, SwiGLU, 4k context). At just 250M params, it outperforms RoBERTa and ModernBERT on the MTEB benchmark.
arxiv.org/abs/2502.19587
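To give a feel for one of these modernizations: RoPE encodes position by rotating pairs of embedding dimensions, so attention scores depend only on relative offsets. A minimal stdlib sketch with toy dimensions (not NeoBERT's actual implementation):

```python
import math

def rope(x, pos, base=10000.0):
    """Rotary position embedding: rotate each (even, odd) pair of
    dimensions of x by an angle that scales with the position."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out += [x[i] * c - x[i + 1] * s,
                x[i] * s + x[i + 1] * c]
    return out

q = [1.0, 0.0, 0.5, 0.5]
k = [0.2, -0.3, 0.7, 0.1]
dot = lambda a, b: sum(u * v for u, v in zip(a, b))

# Rotations preserve norms, position 0 is the identity, and the
# query-key dot product depends only on the relative offset:
# shifting both positions by the same amount leaves it unchanged.
same_offset = dot(rope(q, 3), rope(k, 7)) - dot(rope(q, 10), rope(k, 14))
```

The relative-offset property is what lets RoPE-based models extrapolate context length more gracefully than learned absolute position embeddings.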
The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning
We achieve linear-complexity reasoning. Our "Delethink" decouples thought length from context, matching LongCoT performance with ≈25% of the compute.
arxiv.org/abs/2510.06557
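The intuition behind the linear scaling: instead of attending over an ever-growing chain of thought, reason in fixed-size chunks and carry only a bounded state forward, so per-token attention cost stays bounded. A toy cost model (the chunk size, carry size, and quadratic-attention accounting are illustrative, not the paper's exact numbers):

```python
def longcot_cost(n_tokens):
    """Quadratic: each new token attends to the whole growing context."""
    return sum(t for t in range(1, n_tokens + 1))

def markovian_cost(n_tokens, chunk=512, carry=64):
    """Linear: each token attends to at most one chunk plus carried state."""
    cost = 0
    context = carry
    for _ in range(n_tokens):
        cost += context
        context += 1
        if context >= chunk:
            context = carry  # reset: keep only the carried Markov state
    return cost

n = 16_384
print(longcot_cost(n) / markovian_cost(n))  # speedup grows with n
```

Because the context never exceeds one chunk, total cost grows linearly in thought length, while full-context attention grows quadratically.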
The Expressive Limits of Diagonal SSMs for State-Tracking
We prove a tight bound: Diagonal SSMs are theoretically incapable of tracking non-Abelian groups. A critical look at where efficient models fail vs. where they succeed.
openreview.net/forum?id=5bg...
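The obstruction can be seen in a few lines: diagonal transitions commute, so a diagonal SSM's state is invariant to the order of its updates, while tracking a non-Abelian group (here S3, generated by two non-commuting swaps) requires order sensitivity. A toy illustration, not the paper's proof:

```python
def diag_step(state, a):
    """Diagonal SSM update: elementwise (hence commuting) multiplication."""
    return tuple(s * x for s, x in zip(state, a))

def perm_step(state, p):
    """Group state-tracking: compose the current permutation with p."""
    return tuple(state[i] for i in p)

# Two generators of S3 (non-Abelian): swap(0,1) and swap(1,2).
a, b = (1, 0, 2), (0, 2, 1)
e = (0, 1, 2)
order_ab = perm_step(perm_step(e, a), b)
order_ba = perm_step(perm_step(e, b), a)
assert order_ab != order_ba  # order matters for S3

# Diagonal updates are order-invariant, whatever inputs you choose.
s = (1.0, 2.0, 3.0)
u, v = (0.5, -1.0, 2.0), (4.0, 0.25, -2.0)
assert diag_step(diag_step(s, u), v) == diag_step(diag_step(s, v), u)
```

Any architecture whose state update factors into commuting per-channel maps inherits the same limitation, which is why the characterization lands on Abelian vs. non-Abelian groups.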
Excited to share that we have 3 papers accepted at #ICLR2026! 🇧🇷
Our work this year focuses on efficiency and expressivity: deriving theoretical limits for SSMs, achieving linear scaling for reasoning, and modernizing encoder architectures.
A summary of our work 🧵👇
At Chandar Lab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01!
Have a look at NovoMolGen implementation on our lab's HF page! Easy to work with and generate new molecules in no time.
We just made NovoMolGen easy to play with: Transformers-native checkpoints on the Hub and small notebooks that let you load, sample, and fine-tune in minutes. A few lines of code load the model, plug in a reward, run a short RL fine-tune, and plot the curve.
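For a flavor of the "plug in a reward" step, here is a self-contained toy (not the NovoMolGen API): a REINFORCE-style update on a categorical character sampler, with a made-up reward that just counts carbons. Every name and the reward are illustrative; the actual notebooks drive a Transformers checkpoint with chemistry-aware rewards.

```python
import math
import random

random.seed(0)
VOCAB = ["C", "N", "O"]
logits = {c: 0.0 for c in VOCAB}

def probs():
    """Softmax over the current logits."""
    z = sum(math.exp(logits[c]) for c in VOCAB)
    return {c: math.exp(logits[c]) / z for c in VOCAB}

def sample_char():
    """Draw one character from the current categorical policy."""
    r, acc = random.random(), 0.0
    p = probs()
    for c in VOCAB:
        acc += p[c]
        if r <= acc:
            return c
    return VOCAB[-1]

def reward(s):
    """Made-up stand-in reward: count of 'C' characters."""
    return s.count("C")

# REINFORCE with a fixed baseline: the gradient of log-prob of a
# sampled char c w.r.t. logit k is (1 if k == c else 0) - p[k].
lr, length = 0.2, 8
baseline = length / len(VOCAB)  # expected reward at the uniform start
for _ in range(500):
    s = "".join(sample_char() for _ in range(length))
    adv = reward(s) - baseline
    p = probs()
    for c in s:
        for k in VOCAB:
            logits[k] += lr * adv * ((1.0 if k == c else 0.0) - p[k])

# After training, the policy should strongly prefer 'C'.
```

Swapping the toy sampler for a pretrained generator and the carbon count for a docking or property score gives the shape of the real fine-tuning loop.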
Collaborative multi-agent reinforcement learning is key to the future of AI. Check out R3D2, a generalist agent that plays text-based Hanabi, accepted at ICLR 2025.
Website: chandar-lab.github.io/R3D2-A-Gener...
I am excited to share that our BindGPT paper won the best poster award at #AAAI2025! Congratulations to the team! Work led by @artemzholus.bsky.social!
The best part? We are open-sourcing everything, including the intermediate model checkpoints. The main model is already on HuggingFace, be sure to check it out! (6/n)
Model: huggingface.co/chandar-lab/...
Paper: arxiv.org/abs/2502.19587
Code and checkpoints to be released soon!
NeoBERT is very strong against all baselines, fully open source and open weights (including intermediate checkpoints)... and it has higher tokens/s throughput. Give it a try and swap NeoBERT in for your favorite encoder!
Great work by great colleagues! Have a look at the paper.
Hi Rupali, could you please add me? Thanks a lot.