People overuse "the singularity" as the term for the large acceleration in AI progress that'll come from better agents. There are still real points of friction that current models don't address. The singularity is catchy but misleading. Progress isn't infinite.
09.03.2026 01:49
GPT 5.4 is the first time I've used Codex for multiple hours straight and not ragequit back to Claude Code.
07.03.2026 23:59
GPT 5.4 in the Codex CLI/app is much more approachable than any of their models that came before. This is really big for them; excited to keep trying it vis-à-vis Claude as my agent daily driver.
07.03.2026 22:29
Hoping OpenClaw convinces OpenAI to build GPT OSS 2. It'd be a great fit.
07.03.2026 19:13
Chinese Open Source: A Definitive History
Open source used to be a niche topic.
A must-read on Chinese open source from Kevin Xu, who has the very similarly named Substack (story for another time).
interconnect.substack.com/p/chinese-op...
06.03.2026 16:49
We talk about open models as political insurance, the widening frontier gap, and the ever weirder futures of AI. This is a very important time for open models; the weight of it is obvious, but the economic challenges are so extreme.
06.03.2026 15:16
Dean Ball on open models and government control
Subtle precedents on the future of open models set by the unfolding Anthropic v. Department of War case.
New conversation on Interconnects with Dean Ball on why the Anthropic v. DoW moment could strengthen the long-run case for open models - even if the next few years get rough for open.
www.interconnects.ai/p/how-anthro...
06.03.2026 15:15
Waiting for DeepSeek V4
05.03.2026 16:58
Olmo Hybrid and future LLM architectures
The latest Olmo model and discussions at the frontier of open-source post training tools.
I've written up a blog post that explains why this matters and why hybrid models didn't work a few years ago, when Mamba was super popular. Plus, this paper is a great entry point for modern deep learning / language modeling scaling theory. Enjoy and send feedback!
www.interconnects.ai/p/olmo-hybri...
05.03.2026 16:26
In particular, the OSS tooling for these new architectures is really limited. New architectures run much slower than standard transformers or popular models like DeepSeek MoEs. This is work we can do together to keep pushing the frontier of efficient, open models.
05.03.2026 16:26
It's incredible timing to release a fully open model so people can study how these architecture changes impact the full stack.
Personally, I learned a lot making the post-training work. Even with identical pretraining data, the post-training is very different!
05.03.2026 16:26
Excited to share the latest Olmo model: Olmo Hybrid. This is a model with gated DeltaNet (GDN) layers in a 3:1 ratio with full attention layers. It follows lots of other developments like Qwen 3.5 and Kimi Linear.
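For intuition, here's a minimal PyTorch sketch of what a 3:1 hybrid stack can look like: every fourth block uses full softmax attention, and the other three use a cheap stand-in mixer where a real gated DeltaNet layer would go. The `Block` class and the linear placeholder are illustrative assumptions only, not Olmo Hybrid's actual code.

```python
import torch.nn as nn

class Block(nn.Module):
    """One block whose token mixer is either full attention or a GDN stand-in."""
    def __init__(self, d_model: int, use_full_attention: bool):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.is_attn = use_full_attention
        if use_full_attention:
            # Full softmax attention (quadratic in sequence length).
            self.mixer = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        else:
            # Placeholder for a gated DeltaNet / linear-attention mixer, which would
            # carry a recurrent state instead of attending over all previous tokens.
            self.mixer = nn.Linear(d_model, d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        h = self.norm1(x)
        h = self.mixer(h, h, h, need_weights=False)[0] if self.is_attn else self.mixer(h)
        x = x + h
        return x + self.mlp(self.norm2(x))

# 3 GDN-style blocks for every full-attention block: attention lands on every 4th layer.
layers = nn.ModuleList(
    [Block(d_model=512, use_full_attention=(i % 4 == 3)) for i in range(16)]
)
```

The point of the ratio is that most layers get the cheap, recurrent-style mixer while the occasional full-attention layer preserves long-range retrieval.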
05.03.2026 16:26
I'm doing my part to save Qwen.
Yes, they DM me regularly.
05.03.2026 03:19
Lots of core team members of Alibaba Qwen are resigning publicly on X.
The gaping hole that Qwen imploding would leave in the open research ecosystem will be hard to fill. The small models are irreplaceable.
I'll do my best to keep carrying that torch. Every bit matters.
03.03.2026 18:10
Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 – Chinese labs' latest push of the frontier
Welcome to the year of the horse! I always learn something new doing these with Florian.
www.interconnects.ai/p/latest-ope...
03.03.2026 16:33
All of this happening with Anthropic/DoW etc. will push a lot more investment into open models, so there's transparency in the tools that are being used across high-stakes domains.
At the same time, these models won't be received well if they're built in an overly prescriptive way by any government.
28.02.2026 18:52
It gives me a glimmer of hope in challenging times to see such a deeply respectable, principled stance being held in the face of unjust pressure.
Doubly so to see so many I respect and admire standing in support of it.
Stay the course and stand with Anthropic.
28.02.2026 02:05
If people are working on open research for scaling RL in LLMs, I'd love to talk to you.
27.02.2026 18:32
How much does distillation really matter for Chinese LLMs?
Reacting to Anthropic's post on "distillation attacks."
DeepSeek's usage was a rounding error. MiniMax's was substantial. But distillation is getting less important as RL takes over: it's easier to access "banned" APIs than to smuggle GPUs.
www.interconnects.ai/p/how-much-d...
24.02.2026 16:14
Made a language model RL cheat sheet for the extra page on the inside back cover of the physical edition of the RLHF Book.
24.02.2026 00:48
The RLHF Book should be sent off for printing in the next month or two.
Working on final edits and reviews :D.
Thanks all for your patience.
21.02.2026 19:26
Now that exceeded even my expectations. Props, Claude.
20.02.2026 19:18
Open models are in a perpetual race to stay relevant at the frontier. While they're doing better than I (and many experts) would expect given the cost of these models, I don't see evidence that open models are accelerating and surpassing the best closed models.
www.interconnects.ai/p/open-model...
17.02.2026 17:38
Using the Claude app more due to the personality of the latest Opus, and it's under the radar how much better Claude's search has gotten. The top end isn't as good as GPT Thinking/Pro for research, but the speed is a big upside.
16.02.2026 20:06
First time at CMU
13.02.2026 15:35
Fun to set up real analytics and learn that my RLHF Book PDF is downloaded 50-100 times a day from my site (doesn't include arXiv downloads/views).
Thanks for reading!
12.02.2026 14:51
The Codex app is nice.
I'm just a few minutes in and think it'll make some of the crazy things I was doing way easier to monitor.
11.02.2026 23:37