Sparse Coding and Autoencoders
In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in ...
With all the renewed discussion about "Sparse AutoEncoders (#SAE)" as a way of doing #MechanisticInterpretability of #LLMs, I am resharing a part of my PhD work, where we proved years ago how sparsity automatically emerges in autoencoding.
arxiv.org/abs/1708.03735
03.10.2025 14:16
Registrations for #DRSciML close at noon (Manchester time). Do register soon to ensure you get the Zoom links to attend this exciting event on the foundations of #ScientificML
08.09.2025 08:41
Provable Size Requirements for Operator Learning and PINNs, by Anirbit Mukherjee
YouTube video by CSAChannel IISc
Recently I gave an online talk at the "Bangalore Theory Seminars" of IISc, India's premier institute, where I explained our results on size lower bounds for neural nets that solve PDEs. #SciML #AI4SCIENCE I cover work by one of my 1st-year PhD students, Sebastien.
youtu.be/CWvnhv1nMRY?...
04.09.2025 20:16
Today is the 70th anniversary of the proposal for the summer meeting at Dartmouth, which officially marked the beginning of AI research. Interestingly, "Objective 3" in 1955 was already about having a theory of neural nets.
stanford.io/2WJJJGN
31.08.2025 17:52
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
In this work, we will establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data and we give non-asymptotic convergence rates for it. We achieve this ...
Why does noisy gradient descent train neural nets? This fundamental question in ML remains open.
In our heavily revised draft, my student @dkumar9.bsky.social gives the full proof that a form of noisy GD, Langevin Monte-Carlo (#LMC), can learn arbitrary depth-2 nets.
arxiv.org/abs/2503.10428
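For readers unfamiliar with LMC, here is a minimal sketch of the textbook Langevin Monte-Carlo update: a gradient step plus Gaussian noise scaled by sqrt(2 * step / beta). It is only an illustration, not the setup analysed in the paper. The quadratic toy loss, the names `lmc_step` and `run_lmc`, and the values of `step` and `beta` are all assumptions made for this example.

```python
import numpy as np

def lmc_step(w, grad_fn, step, beta, rng):
    """One Langevin Monte-Carlo update: gradient descent plus scaled Gaussian noise."""
    noise = rng.standard_normal(w.shape)
    return w - step * grad_fn(w) + np.sqrt(2.0 * step / beta) * noise

def run_lmc(w0, grad_fn, step=1e-2, beta=1e4, n_steps=2000, seed=0):
    """Iterate the LMC update from w0 and return the final iterate."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        w = lmc_step(w, grad_fn, step, beta, rng)
    return w

# Hypothetical toy loss, not from the paper: L(w) = 0.5 * ||w - target||^2
target = np.array([1.0, -2.0])

def toy_grad(w):
    # Gradient of the toy quadratic loss above.
    return w - target

w_final = run_lmc(np.zeros(2), toy_grad)
print(w_final)  # lands close to target, up to noise of order 1/sqrt(beta)
```

A larger `beta` (inverse temperature) means less noise, so the iterates concentrate more tightly around minimisers of the loss. Setting the noise term to zero recovers plain gradient descent.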
22.08.2025 15:59
DRSciML
Registrations are now open for the international workshop on foundations of #AI4Science #SciML that we are hosting with Prof. Jakob Zech. In-person seats are very limited, so please do register to join online.
drsciml.github.io/drsciml/
21.08.2025 10:26
Please do get in touch if you have published paper(s) on solving singularly perturbed PDEs using neural nets. #AI4Science #SciML
18.08.2025 09:05
@aifunmcr.bsky.social
07.08.2025 16:35
Lucky to be hosted by a Gödel Prize winner, Prof. Sebastian Pokutta, and to present our work in his group. Sebastian heads the Zuse Institute Berlin (#ZIB), which is an amazing oasis of applied mathematics bringing together experts from different institutes in Berlin.
07.08.2025 16:34
Interested in statistics? Prof. Subhashis Ghosal will deliver the public lecture below tomorrow:
Title: Immersion posterior: Meeting Frequentist Goals under Structural Restrictions
Time: Aug 5 16:00-17:00
Abstract: www.newton.ac.uk/seminar/45562/
Livestream: www.newton.ac.uk/news/watch-l...
04.08.2025 10:45
Hello #FAU. Thanks for arranging to host me at short notice and for letting me present our exciting mathematics of ML in infinite dimensions, #operatorlearning. #sciML Their "Pattern Recognition Laboratory" is completing 50 years! @andreasmaier.bsky.social
02.08.2025 18:31
@aifunmcr.bsky.social
24.07.2025 13:27
The University of Manchester has a 1-year post-doc position that I am happy to support in our group if you are currently an #EPSRC-funded PhD student and have the required specialization for work in our group. Typically, we prefer candidates who have published in deep-learning theory or fluid theory.
24.07.2025 13:23
@aifunmcr.bsky.social
23.07.2025 16:53
#aiforscience
23.07.2025 16:53
DRSciML
Do mark your calendars for "DRSciML" (Dr. Scientific ML) on September 9 and 10.
drsciml.github.io/drsciml/
- We are hosting a 2-day international workshop on understanding scientific ML.
- We have leading experts from around the world giving talks.
- There might be ticketing. Watch this space!
23.07.2025 16:52
@aifunmcr.bsky.social
23.07.2025 16:49
Major ML journals that have come up in recent years:
- dl.acm.org/journal/topml
- jds.acm.org
- link.springer.com/journal/44439
- academic.oup.com/rssdat
- jmlr.org/tmlr/
- data.mlr.press
There is no reason why these can't replace everything the current conferences are doing, and most likely do it better.
06.07.2025 19:41
Thanks. No, AutoSGD does not go as far as delta-GClip does. Its Theorem 4.5 is where they have any convergence to global minima, but it uses assumptions which are not known to be true for nets. Our convergence holds for *all* sufficiently wide nets.
01.07.2025 10:42
Do link to the paper! I can have a look and check.
01.07.2025 09:02
So, the next time you train a deep-learning model, it's probably worthwhile to include as a baseline the only provable adaptive gradient deep-learning algorithm - our delta-GClip.
01.07.2025 08:55
Our insight is to introduce an intermediate form of gradient clipping that can leverage the PL* inequality of wide nets - something not known for standard clipping. Given that our algorithm works for transformers, maybe that points to some yet unknown algebraic property of them. #TMLR
29.06.2025 22:38
Our "delta-GClip" is the *only* known adaptive gradient algorithm that provably trains deep nets AND is practically competitive. That's the message of our recently accepted #TMLR paper - and my 4th TMLR paper.
openreview.net/pdf?id=ABT1X...
#optimization #deeplearningtheory
29.06.2025 22:36
GitHub - Anirbit-AI/Slides-from-Team-Anirbit: Slide Presentations of Our Works
Slide Presentations of Our Works. Contribute to Anirbit-AI/Slides-from-Team-Anirbit development by creating an account on GitHub.
An updated version of our slides on necessary conditions for #SciML,
and more specifically,
"Machine Learning in Function Spaces/Infinite Dimensions".
It's all about the 2 key inequalities on slides 27 and 33.
Both come via similar proofs.
github.com/Anirbit-AI/S...
23.06.2025 22:02
Now our research group has a logo to succinctly convey what we do - prove theorems about using ML to solve PDEs, leaning towards operator learning. Thanks to #ChatGPT4o for converting my sketches into a digital image. #AI4Science #SciML
07.06.2025 19:03
It would be great to see a compiled list of useful PDEs that #PINNs struggle to solve - and how we would measure success there.
We know of edge cases with simple PDEs where PINNs struggle, but those often aren't at the cutting edge of how PDEs are used.
24.05.2025 16:28
@prochetasen.bsky.social @mingfei.bsky.social @omarrivasplata.bsky.social
09.04.2025 09:18
@aifunmcr.bsky.social
09.04.2025 09:18