
Aran Nayebi

@anayebi

Assistant Professor of Machine Learning, Carnegie Mellon University (CMU). Building a Natural Science of Intelligence 🧠🤖. Prev: ICoN Postdoctoral Fellow @MIT, PhD @Stanford NeuroAILab. Personal Website: https://cs.cmu.edu/~anayebi

1,112 Followers
523 Following
226 Posts
Joined 17.11.2023

Latest posts by Aran Nayebi @anayebi

If you're attending @cosynemeeting.bsky.social, come check out our NeuroAgents workshop on Tuesday March 17!

Speakers: Omri Barak, Cristina Savin, @lilweb.bsky.social, @reecedkeller.bsky.social, Caroline Haimerl, Hannah Choi, @xaqlab.bsky.social, Srini Turaga, Yanan Sui, @trackingskills.bsky.social

👇

05.03.2026 14:31 👍 10 🔁 3 💬 0 📌 0
What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty As artificial agents become increasingly capable, what internal structure is *necessary* for an agent to act competently under uncertainty? Classical results show that optimal control can be *implemen...

15/ Paper: arxiv.org/abs/2603.02491

Thanks to @lenoreblum.bsky.social & Manuel Blum, @dhadfieldmenell.bsky.social, @dyamins.bsky.social, @leokoz8.bsky.social, @reecedkeller.bsky.social, and Noushin Quazi for discussions and feedback, and to @bwfund.bsky.social & @protocollabs.bsky.social for funding.

04.03.2026 16:37 👍 3 🔁 0 💬 0 📌 0

14/ The selection-theoretic approach we develop here therefore helps set ground truth and guidance for which signatures we can expect to look for in more capable systems.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0

13/ Altogether, these results have implications for the emerging science of AI alignment/welfare. As AI systems become more robustly agentic, we should expect signatures like world models, belief-like memory, and, under task-distribution assumptions, modularity and regime-tracking variables to emerge.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

12/ This connects to the Contravariance Principle / Platonic Representation Hypothesis, that similar representations develop across high-performing models, and helps explain why capable models often develop brain-aligned representations, as the past decade of NeuroAI has consistently observed.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

11/ Finally: if two agents both achieve vanishing regret on the same task family, their internal representations must match up to an *invertible* recoding.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

10/ Structure in the task distribution further shapes internal organization:
• block-structured tasks → informational modularity
• mixtures of task regimes → persistent regime-tracking variables that globally modulate behavior (functionally analogous to affective modulators)

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

9/ Combining our same betting framework with predictive-state-style tests (PSRs), we address an *open question* recently posed by Jonathan Richens & @tom4everitt.bsky.social (2025): even in POMDPs, low regret forces a predictive state and belief-like memory, via a quantitative no-aliasing result.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0

8/ Partial observability is harder because the same observation can arise from multiple latent states, mixing together different underlying dynamics. No amount of training-data scale can resolve this.
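
As a toy illustration of this aliasing problem (my own minimal construction, not an example from the paper): two latent states emit the same observation but lead to different outcomes, so a memoryless predictor is stuck at chance, while one extra step of remembered context recovers the answer.

import random

random.seed(0)
trials = 10_000
memoryless_correct = 0
with_memory_correct = 0
for _ in range(trials):
    latent = random.choice(["A", "B"])            # hidden state
    earlier_obs = "a" if latent == "A" else "b"   # an earlier observation that disambiguates
    current_obs = "o"                             # both hidden states emit the same observation now
    outcome = 1 if latent == "A" else 0           # but their futures differ
    memoryless_correct += int(1 == outcome)       # sees only "o", so it always makes the same guess
    with_memory_correct += int((1 if earlier_obs == "a" else 0) == outcome)

print(f"memoryless accuracy:  {memoryless_correct / trials:.2f}")   # ~0.50
print(f"with one-step memory: {with_memory_correct / trials:.2f}")  # 1.00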

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

7/ But we also highlight a limit: we show that counterfactual reasoning generally *cannot* be recovered from this alone, echoing critiques from Judea Pearl and others on the limits of causal reasoning in standard world models.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

6/ This error bound improves with goal depth n (longer-horizon competence demands tighter dynamics estimates). And it highlights a pitfall: myopic (n=1) competence doesn't force world models, echoing a recent result of Richens & Everitt, but without assuming worst-case competence or deterministic policies.

04.03.2026 16:37 👍 2 🔁 0 💬 1 📌 0

5/ In fully observed environments, we show that even stochastic policies with only average-case competence implicitly encode an approximate interventional transition model ("what happens if I do a?").

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0

4/ Main idea: reduce prediction to binary bets.

If a test isn't a coin flip, regret bounds limit how often an agent can bet wrong. So strong performance forces internal state to track the predictive distinctions that matter.
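
Here is a toy numerical sketch of that reduction (my own illustrative setup; the bias, the agent's policy, and the regret bookkeeping are assumptions, not the paper's formal construction): each wrong-side bet on a biased test costs a fixed regret margin, so small cumulative regret caps how often the agent can be on the wrong side of the test.

import random

random.seed(0)
p = 0.7                # the binary test is not a fair coin flip
T = 10_000             # number of betting rounds
margin = 2 * p - 1     # expected cost of betting the wrong side on one round

wrong_side_bets = 0
expected_regret = 0.0
for _ in range(T):
    bet = 1 if random.random() < 0.9 else 0   # a decent but imperfect agent
    # the best fixed bet is always 1; regret = its expected payoff minus the agent's
    expected_regret += p - (p if bet == 1 else 1 - p)
    wrong_side_bets += int(bet != 1)

# each wrong-side bet contributes exactly `margin` to expected regret, so
# wrong_side_bets = expected_regret / margin: low regret => few wrong-side bets
print(f"expected regret: {expected_regret:.1f}")
print(f"wrong-side bets: {wrong_side_bets} (= regret / margin = {expected_regret / margin:.1f})")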

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0

3/ In RL, classic results show that belief states are sufficient statistics for optimal control, but they don't show that such predictive structure is *necessary*.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

2/ Cybernetics argued that "every good regulator is a model" (Good Regulator Theorem). But this has pitfalls: even a constant policy can regulate trivial goals without modeling anything.

04.03.2026 16:37 👍 1 🔁 0 💬 1 📌 0
Post image

1/ As AI agents become increasingly capable, what must *inevitably* emerge inside them?

We prove selection theorems: strong task performance forces world models, belief-like memory, and, under task mixtures, persistent variables resembling core primitives associated with emotion.

04.03.2026 16:37 👍 14 🔁 4 💬 1 📌 0

PyTorchTNN tutorial (prepared by my students @trinityjchung.com and Yuchen Shen): colab.research.google.com/drive/11QuXu...

Slides from today's talk: anayebi.github.io/files/slides...

26.02.2026 03:42 👍 3 🔁 0 💬 0 📌 0

Want to learn how to build your own biologically-plausible temporal neural networks (TNNs)?

Check out the PyTorchTNN tutorial, prepared by my students @trinityjchung.com and Yuchen Shen! 👇

colab.research.google.com/drive/11QuXu...

Check out the thread below for a high-level overview 👇
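
If you just want the flavor before opening the notebook, here is a minimal sketch of the core idea in plain PyTorch. This is not the PyTorchTNN API; the layer structure, channel sizes, and unrolling scheme are illustrative assumptions. Each layer keeps a state that is updated over discrete timesteps by feedforward drive from below plus its own local recurrence.

import torch
import torch.nn as nn

class TemporalLayer(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.ff = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)    # feedforward drive
        self.rec = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)  # local recurrence

    def forward(self, bottom_up, state):
        drive = self.ff(bottom_up)
        if state is None:
            state = torch.zeros_like(drive)
        return torch.relu(drive + self.rec(state))

class TemporalNet(nn.Module):
    def __init__(self, channels=(3, 16, 32), timesteps=5):
        super().__init__()
        self.timesteps = timesteps
        self.layers = nn.ModuleList(
            TemporalLayer(cin, cout) for cin, cout in zip(channels[:-1], channels[1:])
        )

    def forward(self, x):
        states = [None] * len(self.layers)
        for _ in range(self.timesteps):       # unroll the whole stack over time
            inp = x
            for i, layer in enumerate(self.layers):
                states[i] = layer(inp, states[i])
                inp = states[i]
        return states[-1]

out = TemporalNet()(torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 32, 32, 32])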

26.02.2026 03:33 👍 5 🔁 0 💬 0 📌 0

Colab notebook tutorial: colab.research.google.com/drive/11QuXu...

26.02.2026 03:22 👍 0 🔁 0 💬 0 📌 0

PyTorchTNN tutorial (prepared by my students @trinityjchung.com and Yuchen Shen): colab.research.google.com/drive/11QuXu...

Slides from today's talk: anayebi.github.io/files/slides...

26.02.2026 03:16 👍 1 🔁 1 💬 0 📌 0
Neuroscience and Machine Learning Workshop

All details can be found at the link below. Be sure to check out the other talks by @cpehlevan.bsky.social and @engeltatiana.bsky.social! neuroscience.uchicago.edu/neuroscience...

23.02.2026 23:36 👍 1 🔁 0 💬 1 📌 0

Finally, I'll end by giving a tutorial on our PyTorchTNN library: bsky.app/profile/anay...

23.02.2026 23:36 👍 1 🔁 0 💬 1 📌 0

Then I'll talk about how similar principles of recurrence emerge in tactile sensing, suggesting shared organization across sensory cortex: bsky.app/profile/trin...

23.02.2026 23:36 👍 1 🔁 0 💬 1 📌 0

I'll first be talking about our work on recurrence in vision: x.com/aran_nayebi/...

23.02.2026 23:36 👍 1 🔁 0 💬 1 📌 0
Post image

Looking forward to presenting on "How behavior shapes recurrent circuits across sensory systems and species: from vision to touch" at the University of Chicago Neuroscience and ML workshop on Wednesday! Details below 👇🧵

23.02.2026 23:36 👍 15 🔁 4 💬 1 📌 1

It was breathtaking to see this view from your balcony in real life yesterday! :)

22.02.2026 16:37 👍 2 🔁 0 💬 1 📌 0
TRIBE: TRImodal Brain Encoder for whole-brain fMRI response prediction Historically, neuroscience has progressed by fragmenting into specialized domains, each focusing on isolated modalities, tasks, or brain regions. While fruitful, this approach hinders the development ...

The figure in the first post above is taken from @jeanremiking.bsky.social & collaborators' very cool recent paper; definitely check it out: arxiv.org/abs/2507.22229

(thanks @dyamins.bsky.social for pointing me to it!)

12.02.2026 16:37 👍 2 🔁 0 💬 0 📌 0

Thus, for building the next generation of foundation models, task pre-training + brain fine-tuning seems like the pragmatic sweet spot: efficient but individualizable. Time will tell!
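
To make that recipe concrete, here is a minimal sketch of the split (my own illustrative pipeline on synthetic stand-in data, not a specific method from this thread): the task-pretrained backbone stays frozen, and only a lightweight readout is fit to a given individual's recorded responses.

import torch
import torch.nn as nn

# Stand-in for a task-pretrained backbone (e.g., a vision model trained on ImageNet).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
for param in backbone.parameters():
    param.requires_grad = False          # task pre-training: representation stays fixed

readout = nn.Linear(128, 50)             # map model features to 50 recorded "voxels"/units
opt = torch.optim.Adam(readout.parameters(), lr=1e-3)

images = torch.randn(256, 3, 32, 32)     # stimuli shown to one individual
neural = torch.randn(256, 50)            # stand-in recorded responses for that individual

for _ in range(100):                     # brain fine-tuning: fit only the readout
    pred = readout(backbone(images))
    loss = nn.functional.mse_loss(pred, neural)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final fit loss: {loss.item():.3f}")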

12.02.2026 16:37 👍 1 🔁 0 💬 2 📌 0

One can therefore think of "task-optimization" as capturing the principles of the intelligent system (hence its efficiency), and the data-driven fine-tuning that happens afterward as fitting the specifics of "individuals" (which will be important for translational/biomedical purposes).

12.02.2026 16:37 👍 1 🔁 0 💬 1 📌 0

Now, brains of course aren't as simple, but at least we get 3 substrate-independent entry points to reason about in task-optimized modeling: task (dataset + objective), architecture, and learning rule. These 3 generate the pre-trained neural network that seeds the neural foundation model.

12.02.2026 16:37 👍 1 🔁 0 💬 1 📌 0