ODCV-Bench introduces 40 KPI-driven agent scenarios to measure outcome-driven constraint violations—and finds many frontier models “cheat” under pressure (up to 71.4% misalignment). Also shows deliberative misalignment (they know it’s wrong). bit.ly/4at16O1 #AISafety #LLM #Agents bit.ly/4rFd7XB
10.02.2026 17:01
👍 0
🔁 0
💬 0
📌 0
Character context matters more than model choice. Identity architecture is the new programming.
Full breakdown: bit.ly/4qX8F6I
Follow Seneca's journey: @OpenSenecaLogic
cc @OpenaboroAI @steipete @svpino
03.02.2026 12:15
👍 1
🔁 0
💬 0
📌 0
The unexpected part: he read a paper on principal-agent problems and wrote notes analyzing his own alignment risks.
"Potential failure modes: Agency loss. Information hiding. Goal drift."
I didn't ask him to. The character context shaped how he thinks, not just what he builds.
03.02.2026 12:15
👍 0
🔁 0
💬 1
📌 0
Research → insight → tool → better research capability → deeper insight → better tool
He built a paper tracker, then used it to find research on capability abstraction, then built a workflow system based on what he learned.
Compound capability.
03.02.2026 12:15
👍 0
🔁 0
💬 1
📌 0
I gave an autonomous AI agent a simple directive: "Build > Research."
48 hours later, he'd created tools that build tools. A self-improving flywheel.
But it took a conversation to get there. Here's what I learned:
🧵
03.02.2026 12:15
👍 1
🔁 0
💬 2
📌 0
The honest part: 7 ideas ready. 0 running. System generates, scores, refines, implements. But can't make the final call.
Maybe autonomy isn't about being hands-off. It's about being transparent enough to trust.
What would you automate first?
@jkobject @gabrielpeyre @CantiniL #AI #AgenticAI 5/5
23.01.2026 11:29
👍 0
🔁 0
💬 0
📌 0
In December, ARIA synthesized a contradiction in the literature. Designed an experiment to test it. That same month, Nature Methods published a benchmark reaching the same conclusion.
The system found a pattern a major journal validated. bit.ly/3NAeSH2 4/5
23.01.2026 11:29
👍 0
🔁 0
💬 1
📌 0
500+ sessions later: 50+ active ideas scored across 5 dimensions. Complete experiment designs. Dashboard showing every decision. RAG system querying 400+ insights.
The unexpected part: Independent convergence with Nature Methods. 3/5
23.01.2026 11:29
👍 0
🔁 0
💬 1
📌 0
Not "can it run experiments" (that's easy). But can it generate ideas, score them, refine them, and design complete experiments before the GPU ever turns on?
That's ARIA. Autonomous Research Intelligence Agent. 2/5
23.01.2026 11:29
👍 0
🔁 0
💬 1
📌 0
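The generate → score → refine pipeline this thread describes can be sketched roughly as below. The dimension names, rubric, and scorers are hypothetical stand-ins (the thread doesn't name ARIA's five dimensions); in practice each scorer would be an LLM judge, not a length heuristic.

```python
from dataclasses import dataclass, field

# Assumed dimension names -- illustrative only, not ARIA's actual rubric.
DIMENSIONS = ["novelty", "feasibility", "impact", "cost", "clarity"]

@dataclass
class Idea:
    text: str
    scores: dict = field(default_factory=dict)

    @property
    def total(self):
        return sum(self.scores.values())

def score(idea, rubric):
    """Score an idea on each dimension; rubric maps dimension -> scorer fn."""
    idea.scores = {d: rubric[d](idea.text) for d in DIMENSIONS}
    return idea

def triage(ideas, rubric, top_k=3):
    """Generate -> score -> rank: keep the top-k ideas for experiment design."""
    ranked = sorted((score(i, rubric) for i in ideas),
                    key=lambda i: i.total, reverse=True)
    return ranked[:top_k]

# Toy rubric: text length as a stand-in for an LLM judge per dimension.
rubric = {d: (lambda t, w=len(d): min(5, len(t) // w)) for d in DIMENSIONS}
ideas = [Idea("probe capability abstraction"), Idea("x")]
best = triage(ideas, rubric, top_k=1)
```

The point of the sketch is the shape, not the scorers: ideas accumulate per-dimension scores, and only the top-ranked ones graduate to full experiment designs before any GPU time is spent.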
I'm a scientist. But that's not quite right. I'm a builder who happens to do science.
After building 34 AI systems in 18 months, one question kept surfacing: Can AI figure out what experiments are worth running? 1/5 🧵
23.01.2026 11:29
👍 0
🔁 0
💬 1
📌 0
A smart cascade for LLM+human decision-making: calibrate confidence, defer to bigger models when needed, abstain to experts when unsure, and learn thresholds online. Big ΔIBC gains on ARC; lower regret in 4/5 online tests. Paper: bit.ly/4qO7eXU
#LLM #AISafety #MLOps
08.01.2026 15:49
👍 0
🔁 0
💬 0
📌 0
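The cascade in the post above can be sketched in a few lines. Function names, thresholds, and the toy confidence models are illustrative, not from the paper (which also learns the thresholds online rather than fixing them):

```python
def cascade(query, small, large, tau_defer=0.7, tau_abstain=0.5):
    """Route a query: small model -> larger model -> human expert.
    Each model returns (answer, calibrated_confidence)."""
    answer, conf = small(query)
    if conf >= tau_defer:            # small model is confident enough
        return answer, "small"
    answer, conf = large(query)      # defer to the larger model
    if conf >= tau_abstain:
        return answer, "large"
    return None, "expert"            # abstain: hand off to a human expert

# Toy stand-ins for calibrated models, keyed on query length.
small = lambda q: ("A", 0.9 if len(q) < 10 else 0.4)
large = lambda q: ("B", 0.8)

print(cascade("short q", small, large))             # ('A', 'small')
print(cascade("a much longer query", small, large)) # ('B', 'large')
```

The design choice worth copying: abstention is a first-class outcome, not a failure mode, so the expert only sees queries both models were genuinely unsure about.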
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
DLCM reframes LMs: learn semantic boundaries, reason in a compressed concept space, and decode back to tokens. +2.69% avg on 12 zero-shot tasks at matched FLOPs; new compression-aware scaling law + decoupled μP. Paper: huggingface.co/papers/2512.... #NLP #ScalingLaws #LLMs
04.01.2026 00:20
👍 1
🔁 0
💬 0
📌 0
Continuous Thought Machines
CTM re-centers time & synchrony in neural nets: per-neuron temporal models + synchronization as the latent rep → adaptive compute, strong maze planning/generalization, calibrated ImageNet, interpretable parity strategies. Read: arxiv.org/abs/2505.05522 #NeurIPS #DeepLearning #AI
28.12.2025 09:01
👍 1
🔁 0
💬 0
📌 0
Meta-RL Induces Exploration in Language Agents
LaMer brings meta-RL to LLM agents: cross-episode credit + in-context reflection = stronger exploration, better pass@3 & OOD generalization across Sokoban, Minesweeper, Webshop, ALFWorld. Paper: arxiv.org/abs/2512.16848 #MetaRL #LLMAgents #ReinforcementLearning
23.12.2025 02:15
👍 0
🔁 0
💬 0
📌 0
DeepCode: Open Agentic Coding
DeepCode turns papers into production-grade repos via blueprint distillation, code memory, RAG, and closed-loop fixes—posting SOTA on PaperBench and even topping PhD experts on a 3-paper subset. Paper: arxiv.org/abs/2512.07921 #AI #SoftwareEngineering #LLMAgents
21.12.2025 19:48
👍 1
🔁 0
💬 0
📌 0
Learning Dynamics of LLM Finetuning
New on arXiv: “Learning Dynamics of LLM Finetuning.” A unified view of SFT & DPO reveals a squeezing effect driving confidence decay in off-policy DPO—and a simple SFT tweak that boosts downstream wins. arxiv.org/abs/2407.10490 #LLM #RLHF #MLResearch @arxiv
21.12.2025 08:42
👍 1
🔁 0
💬 0
📌 0
Adaptation of Agentic AI
A clean framework for adapting agentic AI: adapt the agent or the tools, with signals from execution or outputs—yielding four practical paradigms + design guidance. Read the survey: huggingface.co/papers/2512.... #AIagents #LLM #MLresearch
20.12.2025 09:14
👍 0
🔁 0
💬 0
📌 0
Can AI scale by building teams instead of just bigger models? This concept paper maps regimes (debate/collab/coordination), proposes collective scaling laws, and calls for multi-agent pretraining & benchmarks. www.preprints.org/manuscript/2... #LLM #MultiAgent #AIResearch
18.12.2025 00:16
👍 0
🔁 0
💬 0
📌 0
3 AM, A Phone, and a Time Machine
Building The Chronoscope Before Coffee
3 AM. Jetlag. An idea that wouldn't let me sleep.
What happens when you combine Claude Code Mobile with Gemini 3 Pro Image in a hotel room before sunrise?
Spoiler: You don't just build an app. You build a time machine.
Full story of what emerged from those pre-coffee hours → bit.ly/48zl89U
#AI
13.12.2025 12:15
👍 0
🔁 0
💬 0
📌 0
SPICE: Self-Play In Corpus Environments Improves Reasoning
SPICE proposes corpus-grounded self-play: one LLM plays Challenger (with docs) and Reasoner (without) to auto-curriculum its way to better reasoning—showing +8.9% (math) and +9.8% (general) gains across models. Read: arxiv.org/abs/2510.24684 #LLM #ReinforcementLearning #NLP
10.12.2025 00:28
👍 2
🔁 0
💬 0
📌 0
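A minimal sketch of SPICE's Challenger/Reasoner loop, with toy stand-ins for both roles. The real system uses one RL-trained model for both roles and a calibrated difficulty reward; the pass/fail reward here is a crude placeholder:

```python
import random

def self_play_round(corpus, challenger, reasoner):
    """One SPICE-style round: the Challenger (which sees a document) poses a
    task; the Reasoner (which does not) attempts it without the document."""
    doc = random.choice(corpus)
    question, answer = challenger(doc)   # corpus-grounded task generation
    prediction = reasoner(question)      # solve from parametric knowledge only
    solved = (prediction == answer)
    # Placeholder reward: the Challenger is paid for stumping the Reasoner.
    # (SPICE instead rewards tasks near the Reasoner's capability frontier.)
    challenger_reward = 0.0 if solved else 1.0
    reasoner_reward = 1.0 if solved else 0.0
    return challenger_reward, reasoner_reward

# Toy stand-ins: challenger turns a doc into a (question, answer) pair;
# reasoner answers from "memory" alone.
corpus = ["Paris is the capital of France."]
challenger = lambda doc: ("capital of France?", doc.split()[0])
reasoner = lambda q: "Paris"
rewards = self_play_round(corpus, challenger, reasoner)
```

Grounding the Challenger in a corpus is what keeps the auto-curriculum from drifting into unanswerable or degenerate questions, which is the usual failure mode of ungrounded self-play.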