Latest posts tagged with #softmax on Bluesky
Softmax is $1/2$-Lipschitz: A tight bound across all $\ell_p$ norms
Pravin Nair
Action editor: Murat Erdogdu
https://openreview.net/forum?id=6dowaHsa6D
#softmax #lipschitz #norms
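The claim in the title can be checked numerically. A minimal sketch, assuming the standard softmax map $\sigma(x)_i = e^{x_i}/\sum_j e^{x_j}$: for random input pairs, the ratio $\|\sigma(x)-\sigma(y)\|_p / \|x-y\|_p$ should never exceed the stated constant $1/2$, for $p \in \{1, 2, \infty\}$.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax: shift by the max before exponentiating
    z = np.exp(x - x.max())
    return z / z.sum()

rng = np.random.default_rng(0)
worst = 0.0
for _ in range(10_000):
    x = 3.0 * rng.normal(size=8)
    y = 3.0 * rng.normal(size=8)
    for p in (1, 2, np.inf):
        num = np.linalg.norm(softmax(x) - softmax(y), ord=p)
        den = np.linalg.norm(x - y, ord=p)
        worst = max(worst, num / den)

print(worst)  # empirically stays at or below 0.5
```

The dimension 8, the scale 3, and the sample count are arbitrary choices for illustration; the paper's bound is an analytical result, not something this experiment proves.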
[Figure: The-Transformer-model-architecture.png, the Transformer model architecture: input/output embeddings with positional encoding, masked and unmasked multi-head attention, add & norm and feed-forward blocks, and a final linear layer with softmax producing output probabilities.]
"The transformer approach it describes has become the main architecture of a wide variety of AI, such as #LargeLanguageModels"
Training More Robust Classification Model via Discriminative Loss and Gaussian Noise Injection
Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda CHHAIBI, Serge Gratton, Thierry Giaccone
Action editor: Lei Feng
https://openreview.net/forum?id=RnLfJgvST2
#softmax #robust #classification
Softmax returns with No ID — a dark, fuzzy reflection on disconnection and the music industry’s demands. Chunky guitars and robotic grooves carry a refrain that lingers: “Nothing I do, will ever be good enough for you.”
🎧 Listen & read: www.blackplastic.co.uk/alternative-... #NewMusic #Softmax
Harmonic Loss Trains Interpretable AI Models
David D. Baek, Ziming Liu, Riya Tyagi, Max Tegmark
Action editor: Quanshi Zhang
https://openreview.net/forum?id=ZpSZ7pNoCs
#softmax #representations #trained
New #TMLR-Paper-with-Video:
On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective
Gabriel Mongaras, Eric C. Larson
https://tmlr.infinite-conf.org/paper_pages/PHcITOi3vV
#softmax #rnns #attention
On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective
Gabriel Mongaras, Eric C. Larson
Action editor: Lingpeng Kong
https://openreview.net/forum?id=PHcITOi3vV
#softmax #rnns #attention
Is isotropy a good proxy for generalization in time series forecasting with transformers?
Rashed Shelim, Shengzhe Xu, Walid Saad, Naren Ramakrishnan
Action editor: Jacek Cyranka
https://openreview.net/forum?id=iUtDYVQzFq
#softmax #forecasting #embeddings
High-Dimensional Gaussian Process Regression with Soft Kernel Interpolation
Chris L Camaño, Daniel Huang
Action editor: Geoff Pleiss
https://openreview.net/forum?id=U9b2FIjvWU
#softmax #interpolation #interpolating
Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A
Benjamin Plaut, Khanh Xuan Nguyen, Tu Trinh
Action editor: Kamalika Chaudhuri
https://openreview.net/forum?id=E6LOh5vz5x
#softmax #accuracy #language
Toward Linearly Regularizing the Geometric Bottleneck of Linear Generalized Attention
Jiaxu Liu, Xinping Yi, Xiangyu Yin, Yuhang Song, Gaojie Jin, Xiaowei Huang
Action editor: Shuangfei Zhai
https://openreview.net/forum?id=Vpyg3fqXbl
#attention #softmax #memory
What is #softmax and why is it important for machine learning? Check out my refresher tutorial on multiclass classification in neural networks and how you can build your own from scratch in Snap! (or your favorite programming language):
snap.berkeley.edu/project?user...
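For readers who want the idea without opening the Snap! project, here is a minimal from-scratch sketch in Python (the toy logits and names are illustrative, not taken from the linked tutorial): turn raw class scores into probabilities, then predict the most probable class.

```python
import math

def softmax(scores):
    # subtract the max score for numerical stability, then normalize exponentials
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# toy multiclass step: raw class scores -> probabilities -> predicted label
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
predicted = max(range(len(probs)), key=lambda i: probs[i])
print(probs, predicted)
```

The probabilities sum to one and preserve the ordering of the scores, which is exactly why softmax is the standard output layer for multiclass classification.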
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper, Roland Fernandez, Paul Smolensky, Jianfeng Gao
Action editor: Petar Veličković
https://openreview.net/forum?id=yNiBUc2hMW
#attention #softmax #tasks
Beyond Softmax: New Gradient Bandit Framework Expands Learning
Bandit framework swaps softmax's independence assumption for nested-logit models, enabling correlated actions. The authors prove sublinear regret bounds, with experiments matching softmax-based methods. Read more: getnews.me/beyond-softmax-new-gradi... #gradientbandit #softmax
Catnat Function Introduced as Efficient Alternative to Softmax
Catnat provides a binary‑split alternative to softmax, yielding a diagonal Fisher information matrix; benchmarks in graph learning, VAEs and reinforcement learning report test performance. Read more: getnews.me/catnat-function-introduc... #catnat #softmax
Softmax Attention Outperforms Linear in Single-Location Regression
Softmax‑based attention achieves Bayes‑optimal error in single‑location regression, while linear attention cannot, and softmax consistently outperforms it in finite‑sample tests. getnews.me/softmax-attention-outper... #softmax #attention
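As a reference point for the comparison above, the two attention variants can be sketched in a few lines of NumPy (the shapes and the identity feature map used for the linear variant are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

def softmax_attention(Q, K, V):
    # scaled dot-product attention with a row-wise softmax over keys
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, eps=1e-6):
    # drops the exponential: plain dot-product scores (identity
    # feature map), normalized per query instead of softmaxed
    scores = Q @ K.T
    return (scores @ V) / (scores.sum(axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(softmax_attention(Q, K, V).shape)  # (4, 8)
```

The softmax version makes every output a convex combination of the value rows, which is the structural property the linear variant gives up in exchange for lower cost.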
#alife2025 #ALIFE #ISAL #AI #artificiallife #STEM #CrossLabs #CrossCompass #XC #AWS #softmax #koin #SakanaAI #Google #MITPress #cafedecolombia #wolfram #insilence #mlst #springer #scicomm #pci #meetkyoto #thedeck #hotelanteroomkyoto
The deadline for submission is September 29th, 2025.
Sign-up form: forms.gle/FDBwX9dcovT1...
New Similarity-Distance-Magnitude Activation Improves Model Robustness
Similarity-Distance-Magnitude (SDM) activation adds similarity and distance awareness to softmax, improving robustness to out-of-distribution inputs; submitted on 16 Sep 2025. getnews.me/new-similarity-distance-... #sdm #softmax #robustness
Reach out to us in terms of any query - 2025.alife.org
Thank you. ☺️