this looks extremely awesome!
Our paper is out in @natneuro.nature.com!
www.nature.com/articles/s41...
We develop a geometric theory of how neural populations support generalization across many tasks.
@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social
1/14
(e.g., see @ulisespereirao.bsky.social's recent preprint on using low-rank models (R = 10, iirc) to model mouse 2AFC data.)
One other thought: one network tracking different variables could be thought of as different "tasks," but they could also be different modes within the same task in our framework. We used R = 2 throughout the paper, but R could be larger, and probably would need to be for modeling neural data.
Flexibly selecting among a large (extensive) number of things you *could* do, to perform only a small (non-extensive) number of such computations at a given time, can still be achieved with just a superposition of connectivities if gains are modulated in the right way.
... we can't have more than a small, O(1) number of such directions before running out of state norm. So the false "one task at a time" take-home message of our work is backed by a more accurate "only a few tasks at a time" reality from simple norm accounting.
But even in this case with more populations, a network can still only do so many things at a time, if by "do a thing" we mean have strong (\sqrt{N}) alignment of the network state with some task-relevant directions. This is trivial because the state vector norm is bounded by \sqrt{N}, and so...
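To spell out the norm accounting (my notation, a paraphrase rather than anything from the paper itself):

```latex
% Let x be the network state with \|x\| \le \sqrt{N}, and let
% \hat{u}_1, \dots, \hat{u}_K be orthonormal task directions with strong
% alignment x \cdot \hat{u}_k \ge c\sqrt{N} for some O(1) constant c. Then
N \;\ge\; \|x\|^2 \;\ge\; \sum_{k=1}^{K} \left( x \cdot \hat{u}_k \right)^2 \;\ge\; K c^2 N
\quad\Longrightarrow\quad K \;\le\; \frac{1}{c^2} = O(1).
```

So the number of simultaneously, strongly expressed task directions is bounded by a constant, independent of N.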
We kept it at "one population, one task" to not overcomplicate things, but there's certainly interesting work to be done in that direction! I speculate that the core idea of population-specific gain factors stabilizing different task subspaces while suppressing inactive ones still applies.
In this case, we can do one thing at a time because there's just one population gain factor that marginally stabilizes one task. If we were to extend to multi-population models (see Beiran et al. 2021, Neural Computation), we'd have multiple gain factors that can separately stabilize multiple dynamics.
Hey great questions! This "one task only" thing is a consequence of using Gaussian loadings in the model, which is nice for simplicity in math and presentation. Certainly not a general take-home message.
37/X Overall, really enjoyed working on this with @david-g-clark.bsky.social and Ashok, and I'd love to chat about any part of it: the details behind the theory, the experimental implications, etc. Thanks for reading!
36/X Pooled across behavioral syllables, we should recover high dimensionality, with dimension increasing with the number of distinct behavioral syllables observed. But we should observe slower growth of dimension per unit recording time, compared to pooling across periods of no behavior.
35/X Borrowing the language of "behavioral syllables" (Markowitz et al. Nature 2023), we can formulate a few predictions. Measured during periods of no behavior, neural activity should be fairly high-D. Measured over many repeats of a single behavioral syllable, neural activity should be low-D.
34/X In the former, high dimensionality emerges simply because the network is big. In the latter, high dimensionality emerges because the network is doing a lot of different things. Both are viable!
33/X This model clearly lays out two hypotheses for the origin of high-dimensional neural activity. One is spontaneous fluctuations in the absence of any coherent behavior. The other is switching among many different, individually low-dimensional, behavioral states.
32/X We conservatively used fairly quick task-switching intervals to not artificially magnify this effect, and we still see slower dimension growth in this setup than in the spontaneous state. Spontaneous activity maximally quickly explores the dimensions available to it.
31/X Even in cases when the "switching among different tasks" setup has higher overall dimension than the spontaneous state, the rate at which this measurement grows wrt recording time is slower (previous post). This is especially true the longer the network lingers in each task-specific subspace.
30/X This can even exceed the dimension of the spontaneous state, of course depending on a few things (how big N is, how many different task-selected states are chosen, etc.).
29/X Although any one task component generates low-D activity when selected, recall that these task manifolds are randomly oriented with respect to one another. If we measure over sequential activation of many different, individually low-D task-selected states, we recover high-D activity overall.
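A toy illustration of this point (my own construction, not the paper's simulations): concatenate epochs that each live in a randomly oriented rank-R subspace, and watch the participation ratio climb with the number of distinct epochs.

```python
import numpy as np

rng = np.random.default_rng(2)
N, R, T = 400, 2, 200  # neurons, per-task rank, samples per task epoch

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2 over covariance eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

def switching_activity(K):
    """K epochs, each confined to its own randomly oriented R-dim subspace."""
    segs = []
    for _ in range(K):
        basis = rng.standard_normal((N, R)) / np.sqrt(N)  # random task axes
        segs.append(rng.standard_normal((T, R)) @ basis.T)
    return np.vstack(segs)

for K in (1, 5, 25):
    print(K, participation_ratio(switching_activity(K)))
```

One epoch gives PR ≈ R; pooling many randomly oriented epochs pushes PR up toward K·R (bounded above by it), even though each epoch on its own is low-D.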
28/X This suggests that, even if we include trial-to-trial fluctuations that are approximately independent between neurons, we won't get high dimensionality just from the existence (and measurement) of a large number of neurons, if we measure while restricted to just a single task context.
27/X The task-selected states are much lower D, as so much of their variance is captured by just the handful of selected-task dimensions. In the chaotic task-selected states, fluctuations lead to marginally higher (less low?) dim. that can exceed task dim. R, but not in a way that scales with N.
26/X Nonetheless, in our spontaneous state each neuron adds a new (fraction of a) dimension since neurons fluctuate approximately independently of one another, with a proportionality constant that is a nonlinear function of the network parameters and can be quite small (see Clark et al PRX 2025).
25/X A different but similar-in-vibe observation has been made in experimental work, that *measuring* more neurons leads to higher dimension (figure from Manley et al Neuron 2024), which admittedly isn't quite the same as increasing the number of neurons that exist.
24/X As promised, letβs examine the dimension (participation ratio) of these states' activity patterns. The spontaneous state is high-dimensional, in the sense that its dimension scales with the size of the network N. For larger and larger networks, this dimension can grow without bound.
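For anyone who wants to poke at this numerically, here's a quick numpy sketch of the participation-ratio measurement, PR = (\sum_i \lambda_i)^2 / \sum_i \lambda_i^2 over covariance eigenvalues, using surrogate i.i.d. Gaussian activity as a stand-in for the spontaneous state (not the paper's actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2, lam_i = covariance eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

# Surrogate "spontaneous" activity: neurons fluctuating ~independently.
for N in (100, 400, 1600):
    X = rng.standard_normal((5 * N, N))  # T = 5N samples x N neurons
    print(N, participation_ratio(X))
```

With independent fluctuations, PR grows roughly linearly in N, matching the "dimension scales with network size" point (the proportionality constant here reflects the finite sample count, not the network nonlinearity).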
23/X But 1) Maybe the brain does this? 2) The modulation required is *subtle*: a vanishingly small fraction of the overall weight matrix, and itself low-dimensional. And yet it is sufficient to induce large-scale activity changes, because it operates via a phase transition.
22/X About the selection mechanism: yes we are modulating connectivity itself. Yes this is arguably "cheating" the multi-task challenge, traditionally thought of as a fixed-connectivity network prompted by inputs to do different things (Yang et al Nat Neuro 2019, Driscoll et al Nat Neuro 2024).
21/X In both task-selected states, there is strong activity in the subspace of the selected task. The chaotic task-selected state features both coherent task dynamics (noiseless to leading order) as well as fluctuations in single-neuron rates comparable in magnitude to their task-related tuning.
20/X This provides a mechanism for selecting dynamics. We identify 3 regimes *per task,* over modulation of the overall strength of that task's connectivity component: the spontaneous state, then the chaotic and nonchaotic task-selected states.
19/X This can be achieved through modulating the strength of the associated connectivity component. Because any one connectivity component is low rank, this can be biologically implemented via gain modulation of an external loop, eg through thalamus (as in Logiaco et al Cell Reports 2021).
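Here's a minimal toy of that selection mechanism. Everything below is my own simplified construction, not the paper's model: I use a Hopfield-style rank-1 component u_s u_s^T / N per task on top of a subcritical random bulk, and select a task by scaling that component's gain past threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
# Subcritical random bulk (gain 0.5): activity decays when no task is selected.
W = 0.5 * rng.standard_normal((N, N)) / np.sqrt(N)
# Two "task" patterns; each defines a rank-1 connectivity component u u^T / N.
U = rng.standard_normal((2, N))

def run(gains, steps=300, dt=0.1):
    """Euler-integrate dx/dt = -x + W phi(x) + sum_s g_s u_s (u_s . phi(x)) / N."""
    x = 0.1 * rng.standard_normal(N)
    for _ in range(steps):
        r = np.tanh(x)
        low_rank = sum(g * u * (u @ r) / N for g, u in zip(gains, U))
        x += dt * (-x + W @ r + low_rank)
    # Overlap of the final rates with each task pattern (the per-task latent).
    return np.array([u @ np.tanh(x) / N for u in U])

print(run([0.0, 0.0]))  # no gain boost: both overlaps decay toward zero
print(run([2.0, 0.0]))  # task 1's gain above threshold: its overlap turns on (up to sign)
```

Cranking one scalar gain from 1 to 2 changes a vanishing fraction of the weights (a rank-1 piece of an N x N matrix), yet flips the whole network into the corresponding task subspace, which is the instability-driven selection described above in cartoon form.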
18/X We think of this chaotic state as the "spontaneous" state of the network, where no task is activated. A task activates when the noisy, linearized dynamics lose stability, so that the associated latent variables grow exponentially (before nonlinearly self-stabilizing) to dominate the network.