openreview.net/forum?id=5bg...
Joint work of Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh and @sarath-chandar.bsky.social @mila-quebec.bsky.social .
New work, just accepted @ICLR: "The Expressive Limits of Diagonal SSMs for State-Tracking"
We give a complete characterization of what diagonal SSMs can and cannot compute on state-tracking tasks, and the answer is deeply connected to group theory.
🧵👇
NeoBERT: A Next-Generation BERT (TMLR Journal-to-Conference Track)
We modernized BERT (RoPE, SwiGLU, 4k context). At just 250M params, it outperforms RoBERTa and ModernBERT on the MTEB benchmark.
arxiv.org/abs/2502.19587
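To give a feel for one of these modernizations: RoPE encodes position by rotating pairs of embedding dimensions, so attention scores depend only on relative offsets. A minimal stdlib sketch with toy dimensions (not NeoBERT's actual implementation):

```python
import math

def rope(x, pos, base=10000.0):
    """Rotary position embedding: rotate each (even, odd) pair of
    dimensions of x by an angle that scales with the position."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out += [x[i] * c - x[i + 1] * s,
                x[i] * s + x[i + 1] * c]
    return out

q = [1.0, 0.0, 0.5, 0.5]
k = [0.2, -0.3, 0.7, 0.1]
dot = lambda a, b: sum(u * v for u, v in zip(a, b))

# Rotations preserve norms, position 0 is the identity, and the
# query-key dot product depends only on the relative offset:
# shifting both positions by the same amount leaves it unchanged.
same_offset = dot(rope(q, 3), rope(k, 7)) - dot(rope(q, 10), rope(k, 14))
```

The relative-offset property is what lets RoPE-based models extrapolate context length more gracefully than learned absolute position embeddings.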
The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning
We achieve linear-complexity reasoning. Our "Delethink" decouples thought length from context, matching LongCoT performance with ≈25% of the compute.
arxiv.org/abs/2510.06557
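The intuition behind the linear scaling: instead of attending over an ever-growing chain of thought, reason in fixed-size chunks and carry only a bounded state forward, so per-token attention cost stays bounded. A toy cost model (the chunk size, carry size, and quadratic-attention accounting are illustrative, not the paper's exact numbers):

```python
def longcot_cost(n_tokens):
    """Quadratic: each new token attends to the whole growing context."""
    return sum(t for t in range(1, n_tokens + 1))

def markovian_cost(n_tokens, chunk=512, carry=64):
    """Linear: each token attends to at most one chunk plus carried state."""
    cost = 0
    context = carry
    for _ in range(n_tokens):
        cost += context
        context += 1
        if context >= chunk:
            context = carry  # reset: keep only the carried Markov state
    return cost

n = 16_384
print(longcot_cost(n) / markovian_cost(n))  # speedup grows with n
```

Because the context never exceeds one chunk, total cost grows linearly in thought length, while full-context attention grows quadratically.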
The Expressive Limits of Diagonal SSMs for State-Tracking
We prove a tight bound: Diagonal SSMs are theoretically incapable of tracking non-Abelian groups. A critical look at where efficient models fail vs. where they succeed.
openreview.net/forum?id=5bg...
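The obstruction can be seen in a few lines: diagonal transitions commute, so a diagonal SSM's state is invariant to the order of its updates, while tracking a non-Abelian group (here S3, generated by two non-commuting swaps) requires order sensitivity. A toy illustration, not the paper's proof:

```python
def diag_step(state, a):
    """Diagonal SSM update: elementwise (hence commuting) multiplication."""
    return tuple(s * x for s, x in zip(state, a))

def perm_step(state, p):
    """Group state-tracking: compose the current permutation with p."""
    return tuple(state[i] for i in p)

# Two generators of S3 (non-Abelian): swap(0,1) and swap(1,2).
a, b = (1, 0, 2), (0, 2, 1)
e = (0, 1, 2)
order_ab = perm_step(perm_step(e, a), b)
order_ba = perm_step(perm_step(e, b), a)
assert order_ab != order_ba  # order matters for S3

# Diagonal updates are order-invariant, whatever inputs you choose.
s = (1.0, 2.0, 3.0)
u, v = (0.5, -1.0, 2.0), (4.0, 0.25, -2.0)
assert diag_step(diag_step(s, u), v) == diag_step(diag_step(s, v), u)
```

Any architecture whose state update factors into commuting per-channel maps inherits the same limitation, which is why the characterization lands on Abelian vs. non-Abelian groups.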
Excited to share that we have 3 papers accepted at #ICLR2026! 🇧🇷
Our work this year focuses on efficiency and expressivity: deriving theoretical limits for SSMs, achieving linear scaling for reasoning, and modernizing encoder architectures.
A summary of our work 🧵👇
At Chandar Lab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01!
Have a look at NovoMolGen implementation on our lab's HF page! Easy to work with and generate new molecules in no time.
We just made NovoMolGen easy to play with: Transformers-native checkpoints on the Hub and small notebooks that let you load, sample, and fine-tune in minutes. A few lines of code load the model, plug in a reward, run a short RL fine-tune, and plot the curve.
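For a flavor of the "plug in a reward" step, here is a self-contained toy (not the NovoMolGen API): a REINFORCE-style update on a categorical character sampler, with a made-up reward that just counts carbons. Every name and the reward are illustrative; the actual notebooks drive a Transformers checkpoint with chemistry-aware rewards.

```python
import math
import random

random.seed(0)
VOCAB = ["C", "N", "O"]
logits = {c: 0.0 for c in VOCAB}

def probs():
    """Softmax over the current logits."""
    z = sum(math.exp(logits[c]) for c in VOCAB)
    return {c: math.exp(logits[c]) / z for c in VOCAB}

def sample_char():
    """Draw one character from the current categorical policy."""
    r, acc = random.random(), 0.0
    p = probs()
    for c in VOCAB:
        acc += p[c]
        if r <= acc:
            return c
    return VOCAB[-1]

def reward(s):
    """Made-up stand-in reward: count of 'C' characters."""
    return s.count("C")

# REINFORCE with a fixed baseline: the gradient of log-prob of a
# sampled char c w.r.t. logit k is (1 if k == c else 0) - p[k].
lr, length = 0.2, 8
baseline = length / len(VOCAB)  # expected reward at the uniform start
for _ in range(500):
    s = "".join(sample_char() for _ in range(length))
    adv = reward(s) - baseline
    p = probs()
    for c in s:
        for k in VOCAB:
            logits[k] += lr * adv * ((1.0 if k == c else 0.0) - p[k])

# After training, the policy should strongly prefer 'C'.
```

Swapping the toy sampler for a pretrained generator and the carbon count for a docking or property score gives the shape of the real fine-tuning loop.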
Collaborative multi-agent reinforcement learning is key to the future of AI. Check out R3D2, a generalist agent that plays text-based Hanabi, accepted at ICLR 2025.
Website: chandar-lab.github.io/R3D2-A-Gener...
I am excited to share that our BindGPT paper won the best poster award at #AAAI2025! Congratulations to the team! Work led by @artemzholus.bsky.social!
The best part? We are open-sourcing everything, including the intermediate model checkpoints. The main model is already on HuggingFace, be sure to check it out! (6/n)
Model: huggingface.co/chandar-lab/...
Paper: arxiv.org/abs/2502.19587
Code and checkpoints to be released soon!
NeoBERT is very strong against all baselines, fully open source and open weights (including intermediate checkpoints)... and it has higher tokens/s throughput. Give it a try and swap NeoBERT in for your favorite encoder!
Great work by great colleagues! Have a look at the paper.
Hi Rupali, could you please add me? Thanks a lot.