[bonus] Here's a function that two neurons in a channel can implement
04.12.2025 17:26
Flat Channels to Infinity in Neural Loss Landscapes
The loss landscapes of neural networks contain minima and saddle points that may be connected in flat regions or appear in isolation. We identify and characterize a special structure in the loss lands...
More interesting details can be found in the paper: arxiv.org/abs/2506.14951
Or come by our poster if you're at NeurIPS (Session 3, poster #4200)
Wonderful team with Alex Van Meegen @avm.bsky.social, Berfin Simsek, Wulfram Gerstner @gerstnerlab.bsky.social and Johanni Brea
04.12.2025 17:26
But what happens with standard gradient descent?
Channels to infinity get sharper as O(γ²); this is a clear example of the edge-of-stability phenomenon:
gradient descent does not converge to a minimum (at infinity) but gets stuck where the sharpness of the channel is 2/η (η: learning rate)
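As a toy illustration of the 2/η stability threshold (everything here is assumed for illustration, not the paper's MLP loss): gradient descent on the quadratic f(x) = ½λx² updates x ← (1 − ηλ)x, so it converges exactly while the sharpness λ stays below 2/η and diverges once the curvature crosses that threshold.

```python
# Toy sketch (assumption: a 1D quadratic, not the paper's channel loss):
# on f(x) = 0.5 * lam * x**2 the GD update is x <- (1 - eta*lam) * x,
# which converges iff |1 - eta*lam| < 1, i.e. lam < 2/eta.

def run_gd(lam, eta, x0=1.0, steps=100):
    x = x0
    for _ in range(steps):
        x -= eta * lam * x  # gradient step on f(x) = 0.5*lam*x^2
    return abs(x)

eta = 0.1  # learning rate; stability threshold is 2/eta = 20
print(run_gd(lam=19.0, eta=eta))  # sharpness below 2/eta: |x| shrinks
print(run_gd(lam=21.0, eta=eta))  # sharpness above 2/eta: |x| blows up
```

In a sharpening channel, the effective λ grows along the trajectory, so GD stalls where λ reaches 2/η instead of following the channel further.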
04.12.2025 17:26
These channels are surprisingly common in MLPs: we find they account for a significant proportion of all minima reached in our training runs
But they can only be spotted by training for a long time, following the gradient flow with ODE solvers
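A minimal sketch of what "following the gradient flow with an ODE solver" means, on an assumed toy loss (the hyperbola of minima a·b = 1 stands in for a flat valley; this is not the paper's MLP setup): integrate dθ/dt = −∇L(θ) with an adaptive solver instead of taking fixed gradient steps.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy loss with a flat valley of minima: L(a, b) = (a*b - 1)**2,
# minimized on the whole hyperbola a*b = 1 (an assumed stand-in).
def loss_grad(theta):
    a, b = theta
    r = a * b - 1.0
    return np.array([2 * r * b, 2 * r * a])

# Follow gradient flow d(theta)/dt = -grad L with an adaptive ODE solver.
sol = solve_ivp(lambda t, th: -loss_grad(th), t_span=(0.0, 50.0),
                y0=[2.0, 0.1], rtol=1e-9, atol=1e-12)
a, b = sol.y[:, -1]
print(abs(a * b - 1.0))  # near 0: the trajectory has reached the valley
```

The adaptive step size lets the integration track very slow late-time drift along flat directions that fixed-step training would take impractically long to reveal.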
04.12.2025 17:26
But what do these pairs of neurons compute?
In the limit γ→∞ and ε→0 (where ε is the distance between the two neurons' input weights), they compute a directional derivative!
The MLP is learning to implement a Gated Linear Unit, with a non-linearity that is the derivative of the original
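The limiting computation can be sketched numerically under an assumed parameterization (input weights w and w + εa, output weights ±γ with γ = 1/ε; tanh stands in for the original nonlinearity): the neuron pair forms a finite difference that converges to the directional derivative σ'(w·x)·(a·x), i.e. a gated linear unit whose gate is the derivative of the original nonlinearity.

```python
import numpy as np

# Sketch (assumed setup): two neurons with input weights w and w + eps*a
# and output weights +gamma, -gamma where gamma = 1/eps. As eps -> 0 the
# pair computes a finite difference of the activation, which converges to
# sigma'(w.x) * (a.x): a GLU with the derivative as its nonlinearity.
def sigma(z):
    return np.tanh(z)

def sigma_prime(z):
    return 1.0 - np.tanh(z) ** 2

rng = np.random.default_rng(0)
w, a, x = rng.normal(size=(3, 5))  # weights, direction, input (assumed dims)

def neuron_pair(eps):
    gamma = 1.0 / eps
    return gamma * (sigma((w + eps * a) @ x) - sigma(w @ x))

glu = sigma_prime(w @ x) * (a @ x)   # the limiting GLU-style computation
print(abs(neuron_pair(1e-6) - glu))  # small: pair matches the derivative
```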
04.12.2025 17:26
Here are some more pictures from different angles
04.12.2025 17:26
When perturbing networks from their saddle points, gradient trajectories get stuck in nearby channels that run parallel to the saddle line
The gradient dynamics are simple: after a first phase of alignment, trajectories are straight and γ→∞
04.12.2025 17:26
These channels are parallel to lines of saddle points arising from permutation symmetries, as described by Fukumizu & Amari in 2000
Saddles can be formed by taking a network at a local minimum and splitting a neuron's contribution into two, with splitting factor γ
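The Fukumizu & Amari split can be sketched as follows for a one-hidden-layer MLP (the tanh activation and weight shapes are assumptions for illustration): duplicate one hidden unit, keep its input weights, and share its outgoing weight c as γc and (1 − γ)c. The network function is identical for every splitting factor γ, which is what produces a whole line of critical points.

```python
import numpy as np

# Sketch of the Fukumizu & Amari (2000) neuron split on an assumed
# one-hidden-layer tanh MLP: the split leaves the function unchanged
# for every value of the splitting factor gamma.
def mlp(W, c, x):
    return c @ np.tanh(W @ x)

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 4))  # input weights of 3 hidden units
c = rng.normal(size=3)       # output weights
x = rng.normal(size=4)       # a test input

def split(W, c, i, gamma):
    W2 = np.vstack([W, W[i]])              # copy input weights of unit i
    c2 = np.append(c, (1 - gamma) * c[i])  # new unit gets (1-gamma)*c_i
    c2[i] = gamma * c[i]                   # original unit keeps gamma*c_i
    return W2, c2

for gamma in (0.3, 1.7, -2.0):
    W2, c2 = split(W, c, i=0, gamma=gamma)
    assert np.allclose(mlp(W2, c2, x), mlp(W, c, x))
print("output preserved for all gamma")
```

Since γ parameterizes a line of networks with identical loss, the gradient vanishes along it, and the channels described above run parallel to such lines.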
04.12.2025 17:26
🧵 Excited to present our latest work at #NeurIPS25! Together with @avm.bsky.social, we discover channels to infinity: regions in neural network loss landscapes where parameters diverge to infinity (in regression settings!)
We find that MLPs in these channels can take derivatives and compute GLUs 🤯
04.12.2025 17:26
Excited to share that our paper was selected as a Spotlight at #NeurIPS2025!
arxiv.org/pdf/2410.03972
It started from a question I kept running into:
When do RNNs trained on the same task converge/diverge in their solutions?
🧵⬇️
24.11.2025 16:43
Male CNS Connectome
A team of researchers has unveiled the complete connectome of a male fruit fly central nervous system – a seamless map of all the neurons in the brain and nerve cord of a single male fruit fly and the ...
Exciting news for #drosophila #connectomics and #neuroscience enthusiasts: the Drosophila male central nervous system connectome is now live for exploration. Find out more at the landing page hosted by our Janelia FlyEM collaborators www.janelia.org/project-team....
05.10.2025 15:40
Lab members are at the Bernstein conference @bernsteinneuro.bsky.social with 9 posters! Here's the list:
TUESDAY 16:30 – 18:00
P1 62 "Measuring and controlling solution degeneracy across task-trained recurrent neural networks" by @flavioh.bsky.social
30.09.2025 09:29
To our fellow researchers at Harvard and elsewhere. 🧪🧠
I have funds for visiting PhDs or postdocs at TU in Vienna. For a short stay or a full PhD, email me.
For professors: check, for instance, this tenure-track opening, or ask in private for options
informatics.tuwien.ac.at/news/2909
23.05.2025 11:33
Isn't NeuroAI a modern rebranding of computational neuroscience?
My take is that NeuroAI just sounds a little broader as a term, incorporating cognition and behaviour into the picture (which were not so accurately modelled before ANNs).
To me the goals of compneuro and NeuroAI fully overlap.
21.11.2024 19:46