
Nathan Lambert

@natolambert

A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef. Writes http://interconnects.ai. At Ai2 via HuggingFace, Berkeley, and normal places.

13,688
Followers
277
Following
1,905
Posts
30.04.2023
Joined

Latest posts by Nathan Lambert @natolambert

People overuse the singularity as the term to describe the large acceleration in AI progress that'll come from better agents. There are still real points of friction that the current models don't address. The singularity is catchy but misleading. Progress isn't infinite.

09.03.2026 01:49 👍 11 🔁 0 💬 1 📌 0

GPT 5.4 is the first time I've used Codex for multiple hours straight and not ragequit back to Claude Code.

07.03.2026 23:59 👍 31 🔁 1 💬 0 📌 0

GPT 5.4 in the Codex CLI/app is much more approachable than any of their models that came before. This is really big for them; excited to keep trying it vis-à-vis Claude as my agent daily driver.

07.03.2026 22:29 👍 11 🔁 0 💬 0 📌 1

Hoping OpenClaw convinces OpenAI to build GPT OSS 2. It'd be a great fit.

07.03.2026 19:13 👍 27 🔁 1 💬 1 📌 2
Chinese Open Source: A Definitive History
Open source used to be a niche topic.

Must read on Chinese open source from Kevin Xu with the very similarly named substack (story for another time)

interconnect.substack.com/p/chinese-op...

06.03.2026 16:49 👍 14 🔁 5 💬 0 📌 2

We talk about open models as political insurance, the widening frontier gap, and the ever-weirder futures of AI. This is a very important time for open models; the weight of it is obvious, but the economic challenges are so extreme.

06.03.2026 15:16 👍 6 🔁 0 💬 1 📌 1
Dean Ball on open models and government control
Subtle precedents on the future of open models set by the unfolding Anthropic v. Department of War case.

New conversation on Interconnects with Dean Ball on why the Anthropic v. DoW moment could strengthen the long-run case for open models, even if the next few years get rough for open.
www.interconnects.ai/p/how-anthro...

06.03.2026 15:15 👍 15 🔁 3 💬 1 📌 2
Post image

Waiting for DeepSeek V4

05.03.2026 16:58 👍 42 🔁 2 💬 4 📌 0
Olmo Hybrid and future LLM architectures
The latest Olmo model and discussions at the frontier of open-source post-training tools.

I've written up a blog post that explains why this matters and why hybrid models didn't work a few years ago when Mamba was super popular. Plus, this paper is a great entry point for modern deep learning / language modeling scaling theory. Enjoy and send feedback!

www.interconnects.ai/p/olmo-hybri...

05.03.2026 16:26 πŸ‘ 15 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0

In particular, the OSS tooling for these new architectures is really limited. New architectures are much slower than standard transformers or popular models like DeepSeek MoEs. This is work we can do together to keep pushing the frontier of efficient, open models.

05.03.2026 16:26 👍 8 🔁 0 💬 1 📌 0

It's incredible timing to release a fully open model so people can study how these architecture changes impact the full stack.

Personally, I learned a lot in making the post-training work. Even with the data being identical for pretraining, post-training is very different!

05.03.2026 16:26 πŸ‘ 7 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Post image

Excited to share the latest Olmo model: Olmo Hybrid. This is a model with gated delta net (GDN) layers in a 3:1 ratio with full attention. It follows lots of other developments like Qwen 3.5 and Kimi Linear.

05.03.2026 16:26 πŸ‘ 67 πŸ” 8 πŸ’¬ 6 πŸ“Œ 4
Post image

I'm doing my part to save Qwen.
Yes, they DM me regularly.

05.03.2026 03:19 👍 57 🔁 0 💬 3 📌 0

Lots of core team members of Alibaba Qwen are resigning publicly on X.

The gaping hole that Qwen imploding would leave in the open research ecosystem will be hard to fill. The small models are irreplaceable.

I'll do my best to keep carrying that torch. Every bit matters.

03.03.2026 18:10 👍 105 🔁 11 💬 3 📌 2
Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier
Welcome to the year of the horse!

Welcome to the year of the horse! I always learn something new doing these with Florian.
www.interconnects.ai/p/latest-ope...

03.03.2026 16:33 👍 21 🔁 3 💬 0 📌 0

All of this happening with Anthropic/DoW etc. will push a lot more investment into open models, so there's transparency in the tools being used across high-stakes domains.

At the same time, these models won't be received well if they're built in an overly prescriptive way by any government.

28.02.2026 18:52 👍 25 🔁 0 💬 2 📌 1
Post image
28.02.2026 04:19 👍 112 🔁 1 💬 2 📌 3

It gives me a glimmer of hope in challenging times to see such a deeply respectable, principled stance being held in the face of unjust pressure.

Doubly so to see so many I respect and admire standing in support of it.

Stay the course and stand with Anthropic.

28.02.2026 02:05 👍 52 🔁 5 💬 1 📌 0

If people are working on open research for scaling RL in LLMs, I'd love to talk to you.

27.02.2026 18:32 👍 13 🔁 5 💬 1 📌 0
How much does distillation really matter for Chinese LLMs?
Reacting to Anthropic's post on "distillation attacks."


DeepSeek's usage was a rounding error. MiniMax's was substantial. But distillation is getting less important as RL takes over; it's easier to access "banned" APIs than to smuggle GPUs.

www.interconnects.ai/p/how-much-d...

24.02.2026 16:14 👍 52 🔁 5 💬 1 📌 6
Post image

Made a language model RL cheatsheet for the extra page on the inside back cover of the physical edition of the RLHF Book.

24.02.2026 00:48 👍 25 🔁 3 💬 0 📌 0
Post image

The RLHF Book should be sent off for printing in the next month or two.
Working on final edits and reviews :D.
Thanks all for your patience.

21.02.2026 19:26 👍 58 🔁 0 💬 0 📌 1
Task-Completion Time Horizons of Frontier AI Models
Our most up-to-date measurements of the time horizons for public frontier language models.

metr.org/time-horizons/

20.02.2026 19:21 👍 10 🔁 0 💬 0 📌 0
Post image

Now that exceeded even my expectations. Props, Claude.

20.02.2026 19:18 👍 74 🔁 12 💬 6 📌 7
Post image

Open models are in a perpetual race to stay relevant at the frontier. While they're doing better than I and many experts would expect given the cost of models, I don't see evidence that open models are accelerating and surpassing the best closed models.
www.interconnects.ai/p/open-model...

17.02.2026 17:38 👍 35 🔁 4 💬 2 📌 3

Using the Claude app more due to the personality of the latest Opus, and it's under the radar how much better Claude's search has gotten. The top end isn't as good as GPT Thinking/Pro for research, but the speed is a big upside.

16.02.2026 20:06 👍 21 🔁 0 💬 0 📌 0
Agentic Olmos: Building Olmo in the Era of Agents. Nathan Lambert, Allen Institute for AI. LTI Colloquium @ Carnegie Mellon University, 13 February 2026.

Here are my slides from my recent CMU talk, as I'm transitioning from the Olmo 3 era of just building a reasoning model to thinking about how to do impactful research for agentic systems.
docs.google.com/presentation...

14.02.2026 20:47 👍 38 🔁 5 💬 0 📌 0
Post image

First time at CMU

13.02.2026 15:35 👍 8 🔁 0 💬 2 📌 0

Fun to set up real analytics and learn that my RLHF Book PDF is downloaded 50-100 times a day from my site (doesn't include arXiv downloads/views).

Thanks for reading!

12.02.2026 14:51 👍 26 🔁 2 💬 0 📌 0

The Codex app is nice.
I'm just a few minutes in and think it'll make some of the crazy things I was doing way easier to monitor.

11.02.2026 23:37 👍 4 🔁 0 💬 2 📌 0