this looks extremely awesome!
Our paper is out in @natneuro.nature.com!
www.nature.com/articles/s41...
We develop a geometric theory of how neural populations support generalization across many tasks.
@zuckermanbrain.bsky.social
@flatironinstitute.org
@kempnerinstitute.bsky.social
1/14
(e.g., see @ulisespereirao.bsky.social's recent preprint on using low-rank models (R = 10, iirc) to model mouse 2AFC data.)
One other thought: one network tracking different variables could be thought of as different "tasks," but they could also be different modes within the same task in our framework. We used R = 2 throughout the paper, but R could be larger, and probably would need to be for modeling neural data.
Flexibly selecting among a large (extensive) number of things you *could* do, to perform only a small (non-extensive) number of such computations at a given time, can still be achieved with just a superposition of connectivities if gains are modulated in the right way.
... we can't have more than a small, O(1) number of such directions before running out of state norm. So the false "one task at a time" take-home message of our work is backed by a more accurate "only a few tasks at a time" reality from simple norm accounting.
But even in this case with more populations, a network can still only do so many things at a time, if by "do a thing" we mean have strong (\sqrt{N}) alignment of the network state with some task-relevant directions. This is trivial because the state vector norm is bounded by \sqrt{N}, and so...
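To spell out the norm accounting (my notation, a paraphrase rather than anything from the paper itself):

```latex
% Let x be the network state with \|x\| \le \sqrt{N}, and let
% \hat{u}_1, \dots, \hat{u}_K be orthonormal task directions with strong
% alignment x \cdot \hat{u}_k \ge c\sqrt{N} for some O(1) constant c. Then
N \;\ge\; \|x\|^2 \;\ge\; \sum_{k=1}^{K} \left( x \cdot \hat{u}_k \right)^2 \;\ge\; K c^2 N
\quad\Longrightarrow\quad K \;\le\; \frac{1}{c^2} = O(1).
```

So the number of simultaneously, strongly expressed task directions is bounded by a constant, independent of N.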
We kept it at "one population, one task" to not overcomplicate things, but there's certainly interesting work to be done in that direction! I speculate that the core idea of population-specific gain factors stabilizing different task subspaces while suppressing inactive ones still applies.
In this case, we can do one thing at a time because there's just one population gain factor that marginally stabilizes one task. If we were to extend to multi-population models (see Beiran et al. 2021, Neural Computation), we'd have multiple gain factors that can separately stabilize multiple dynamics.
Hey great questions! This "one task only" thing is a consequence of using Gaussian loadings in the model, which is nice for simplicity in math and presentation. Certainly not a general take-home message.
37/X Overall, really enjoyed working on this with @david-g-clark.bsky.social and Ashok, and I'd love to chat about any part of it: the details behind the theory, the experimental implications, etc. Thanks for reading!
36/X Pooled across behavioral syllables, we should recover high dimensionality, with dimension increasing with the number of distinct behavioral syllables observed. But we should observe slower growth of dimension per unit recording time, compared to pooling across periods of no behavior.
35/X Borrowing the language of "behavioral syllables" (Markowitz et al. Nature 2023), we can formulate a few predictions. Measured during periods of no behavior, neural activity should be fairly high-D. Measured over many repeats of a single behavioral syllable, neural activity should be low-D.
34/X In the former, high dimensionality emerges simply because the network is big. In the latter, high dimensionality emerges because the network is doing a lot of different things. Both are viable!
33/X This model clearly lays out two hypotheses for the origin of high-dimensional neural activity. One is spontaneous fluctuations in the absence of any coherent behavior. The other is switching among many different, individually low-dimensional, behavioral states.
32/X We conservatively used fairly quick task-switching intervals to not artificially magnify this effect, and we still see slower dimension growth in this setup than in the spontaneous state. Spontaneous activity maximally quickly explores the dimensions available to it.
31/X Even in cases when the "switching among different tasks" setup has higher overall dimension than the spontaneous state, the rate at which this measurement grows wrt recording time is slower (previous post). This is especially true the longer the network lingers in each task-specific subspace.
30/X This can even exceed the dimension of the spontaneous state, of course depending on a few things (how big N is, how many different task-selected states are chosen, etc.).
29/X Although any one task component generates low-D activity when selected, recall that these task manifolds are randomly oriented with respect to one another. If we measure over sequential activation of many different, individually low-D task-selected states, we recover high-D activity overall.
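A toy illustration of this point (my own construction, not the paper's simulations): concatenate epochs that each live in a randomly oriented rank-R subspace, and watch the participation ratio climb with the number of distinct epochs.

```python
import numpy as np

rng = np.random.default_rng(2)
N, R, T = 400, 2, 200  # neurons, per-task rank, samples per task epoch

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2 over covariance eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

def switching_activity(K):
    """K epochs, each confined to its own randomly oriented R-dim subspace."""
    segs = []
    for _ in range(K):
        basis = rng.standard_normal((N, R)) / np.sqrt(N)  # random task axes
        segs.append(rng.standard_normal((T, R)) @ basis.T)
    return np.vstack(segs)

for K in (1, 5, 25):
    print(K, participation_ratio(switching_activity(K)))
```

One epoch gives PR ≈ R; pooling many randomly oriented epochs pushes PR up toward K·R (bounded above by it), even though each epoch on its own is low-D.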
28/X This suggests that, even if we include trial-to-trial fluctuations that are approximately independent between neurons, we won't get high dimensionality just from the existence (and measurement) of a large number of neurons, if we measure while restricted to just a single task context.
27/X The task-selected states are much lower D, as so much of their variance is captured by just the handful of selected-task dimensions. In the chaotic task-selected states, fluctuations lead to marginally higher (less low?) dim. that can exceed task dim. R, but not in a way that scales with N.
26/X Nonetheless, in our spontaneous state each neuron adds a new (fraction of a) dimension since neurons fluctuate approximately independently of one another, with a proportionality constant that is a nonlinear function of the network parameters and can be quite small (see Clark et al PRX 2025).
25/X A different but similar-in-vibe observation has been made in experimental work, that *measuring* more neurons leads to higher dimension (figure from Manley et al Neuron 2024), which admittedly isn't quite the same as increasing the number of neurons that exist.
24/X As promised, letβs examine the dimension (participation ratio) of these states' activity patterns. The spontaneous state is high-dimensional, in the sense that its dimension scales with the size of the network N. For larger and larger networks, this dimension can grow without bound.
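For anyone who wants to poke at this numerically, here's a quick numpy sketch of the participation-ratio measurement, PR = (\sum_i \lambda_i)^2 / \sum_i \lambda_i^2 over covariance eigenvalues, using surrogate i.i.d. Gaussian activity as a stand-in for the spontaneous state (not the paper's actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

def participation_ratio(X):
    """PR = (sum_i lam_i)^2 / sum_i lam_i^2, lam_i = covariance eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / (lam ** 2).sum()

# Surrogate "spontaneous" activity: neurons fluctuating ~independently.
for N in (100, 400, 1600):
    X = rng.standard_normal((5 * N, N))  # T = 5N samples x N neurons
    print(N, participation_ratio(X))
```

With independent fluctuations, PR grows roughly linearly in N, matching the "dimension scales with network size" point (the proportionality constant here reflects the finite sample count, not the network nonlinearity).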
23/X But 1) Maybe the brain does this? 2) The modulation required is *subtle*: a vanishingly small fraction of the overall weight matrix, and itself low-dimensional. And yet it is sufficient to induce large-scale activity changes, because it operates via a phase transition.
22/X About the selection mechanism: yes we are modulating connectivity itself. Yes this is arguably "cheating" the multi-task challenge, traditionally thought of as a fixed-connectivity network prompted by inputs to do different things (Yang et al Nat Neuro 2019, Driscoll et al Nat Neuro 2024).
21/X In both task-selected states, there is strong activity in the subspace of the selected task. The chaotic task-selected state features both coherent task dynamics (noiseless to leading order) as well as fluctuations in single-neuron rates comparable in magnitude to their task-related tuning.
20/X This provides a mechanism for selecting dynamics. We identify 3 regimes *per task,* over modulation of the overall strength of that task's connectivity component: the spontaneous state, then the chaotic and nonchaotic task-selected states.
19/X This can be achieved through modulating the strength of the associated connectivity component. Because any one connectivity component is low rank, this can be biologically implemented via gain modulation of an external loop, eg through thalamus (as in Logiaco et al Cell Reports 2021).
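Here's a minimal toy of that selection mechanism. Everything below is my own simplified construction, not the paper's model: I use a Hopfield-style rank-1 component u_s u_s^T / N per task on top of a subcritical random bulk, and select a task by scaling that component's gain past threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
# Subcritical random bulk (gain 0.5): activity decays when no task is selected.
W = 0.5 * rng.standard_normal((N, N)) / np.sqrt(N)
# Two "task" patterns; each defines a rank-1 connectivity component u u^T / N.
U = rng.standard_normal((2, N))

def run(gains, steps=300, dt=0.1):
    """Euler-integrate dx/dt = -x + W phi(x) + sum_s g_s u_s (u_s . phi(x)) / N."""
    x = 0.1 * rng.standard_normal(N)
    for _ in range(steps):
        r = np.tanh(x)
        low_rank = sum(g * u * (u @ r) / N for g, u in zip(gains, U))
        x += dt * (-x + W @ r + low_rank)
    # Overlap of the final rates with each task pattern (the per-task latent).
    return np.array([u @ np.tanh(x) / N for u in U])

print(run([0.0, 0.0]))  # no gain boost: both overlaps decay toward zero
print(run([2.0, 0.0]))  # task 1's gain above threshold: its overlap turns on (up to sign)
```

Cranking one scalar gain from 1 to 2 changes a vanishing fraction of the weights (a rank-1 piece of an N x N matrix), yet flips the whole network into the corresponding task subspace, which is the instability-driven selection described above in cartoon form.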
18/X We think of this chaotic state as the "spontaneous" state of the network, where no task is activated. A task activates when the noisy, linearized dynamics lose stability, so that the associated latent variables grow exponentially (before nonlinearly self-stabilizing) to dominate the network.