
Donatella Genovese

@donatellag

PhD Student | Works on Explainable AI | https://donatellagenovese.github.io/

66 Followers · 172 Following · 7 Posts · Joined 20.11.2024

Latest posts by Donatella Genovese @donatellag

[Post image]

Please share it within your circles! edin.ac/3DDQK1o

13.03.2025 11:59 πŸ‘ 14 πŸ” 9 πŸ’¬ 0 πŸ“Œ 1
[Post image]

Really cool paper by @kayoyin.bsky.social on the interpretability of in-context learning (ICL): they found that Function Vector (FV) heads are crucial for few-shot ICL.
www.arxiv.org/abs/2502.14010

28.02.2025 16:52 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
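For readers new to function vectors, here is a toy sketch of the idea, not the paper's code: an FV is (roughly) the mean output of selected attention heads over a set of ICL prompts, and adding it to the residual stream of a zero-shot run can steer the model toward the demonstrated task. All shapes and names below are illustrative assumptions.

```python
# Toy sketch of the function-vector (FV) idea; shapes and names are assumptions.
import torch

n_prompts, n_heads, d_model = 32, 12, 64           # hypothetical sizes

# head_outputs[p, h] = output of head h (projected into the residual stream)
# on the final token of ICL prompt p; random stand-ins here.
head_outputs = torch.randn(n_prompts, n_heads, d_model)

fv_head_ids = [3, 7, 9]                             # heads identified as "FV heads"

# Function vector: average the FV heads' outputs over all ICL prompts.
function_vector = head_outputs[:, fv_head_ids].mean(dim=(0, 1))  # (d_model,)

# "Steering": add the FV to a hidden state from a zero-shot (no examples) run.
zero_shot_hidden = torch.randn(d_model)
steered_hidden = zero_shot_hidden + function_vector
print(steered_hidden.shape)  # torch.Size([64])
```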

A really nice resource for understanding how to parallelize LLM training.

21.02.2025 11:12 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

3/ Interleaving Concepts with Token Embeddings

πŸ”Ή Predicted concepts are compressed into a continuous vector 🎯
πŸ”Ή They are then inserted into hidden states alongside token embeddings 🧩

14.02.2025 13:12 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
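A minimal sketch of this interleaving step, assuming the setup described in the paper linked below (arxiv.org/abs/2502.08524): predicted concept probabilities are compressed into one continuous vector per position and inserted into the sequence next to the token hidden states. Module names, shapes, and the exact mixing scheme are my assumptions, not the paper's code.

```python
# Sketch of compressing predicted concepts and interleaving them with
# token hidden states; all sizes and the mixing layout are assumptions.
import torch
import torch.nn as nn

d_model, n_concepts, seq_len = 64, 128, 10

compress = nn.Linear(n_concepts, d_model)           # concept probs -> continuous vector

hidden = torch.randn(1, seq_len, d_model)           # token hidden states
concept_probs = torch.rand(1, seq_len, n_concepts)  # predicted concepts per position

concept_vec = compress(concept_probs)               # (1, seq_len, d_model)

# Interleave: token state and its concept vector alternate along the sequence,
# doubling the effective length seen by the next layers.
mixed = torch.stack([hidden, concept_vec], dim=2).reshape(1, 2 * seq_len, d_model)
print(mixed.shape)  # torch.Size([1, 20, 64])
```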

2/ Training the Model with Dual Objectives

πŸ”Ή Next-token prediction – the standard LLM training objective.
πŸ”Ή Concept prediction – the model learns to reproduce extracted concepts from its hidden state.

14.02.2025 13:11 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
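A hedged sketch of the dual objective described above: standard next-token cross-entropy plus a term that pushes a hidden state to recover the SAE-extracted concepts. The form of the concept loss (a multi-label BCE here) and the weighting `lam` are my assumptions, not the paper's choices.

```python
# Sketch of the dual training objective; loss form and weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, n_concepts, d_model = 1000, 128, 64

hidden = torch.randn(8, d_model)                    # one hidden state per position
lm_head = nn.Linear(d_model, vocab)
concept_head = nn.Linear(d_model, n_concepts)

next_tokens = torch.randint(0, vocab, (8,))         # ground-truth next tokens
target_concepts = (torch.rand(8, n_concepts) < 0.05).float()  # SAE-extracted concepts

loss_lm = F.cross_entropy(lm_head(hidden), next_tokens)        # next-token prediction
loss_concept = F.binary_cross_entropy_with_logits(             # concept prediction
    concept_head(hidden), target_concepts)

lam = 0.1                                           # assumed loss weighting
loss = loss_lm + lam * loss_concept
loss.backward()
```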

1/ Concept Extraction with SAE

πŸ”Ή A Sparse Autoencoder (SAE) extracts high-level concepts from the hidden states of a pretrained LLM.
πŸ”Ή Only the most important concepts are selected based on their attribution score (impact on model output).

14.02.2025 13:10 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
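A minimal sketch of this extraction step: an SAE maps a hidden state to a wide, sparse code whose dimensions act as concepts, then concepts are ranked by an attribution score. The score below (activation times gradient of a scalar output) is one common choice and an assumption on my part, as are all names and sizes.

```python
# Sketch of SAE concept extraction plus attribution-based selection;
# the attribution score and all sizes are assumptions.
import torch
import torch.nn as nn

d_model, n_concepts = 64, 512

encoder = nn.Linear(d_model, n_concepts)
decoder = nn.Linear(n_concepts, d_model)

hidden = torch.randn(d_model)                       # hidden state from a pretrained LLM
acts = torch.relu(encoder(hidden))                  # sparse concept activations
acts.retain_grad()                                  # keep gradients on a non-leaf tensor
recon = decoder(acts)

# Stand-in for "impact on model output": a scalar readout of the reconstruction.
output = recon.sum()
output.backward()

attribution = (acts * acts.grad).abs()              # attribution score per concept
top_concepts = attribution.topk(k=8).indices        # keep only the most important
print(top_concepts)
```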
[Post image]

πŸš€ Meta’s new LLM pretraining framework predicts concepts and integrates them into its hidden states to enhance next-token prediction. πŸš€

It matches baseline performance with 21.5% fewer training tokens and generalizes better! 🎯

πŸ“: arxiv.org/abs/2502.08524

14.02.2025 13:07 πŸ‘ 1 πŸ” 0 πŸ’¬ 3 πŸ“Œ 0

A very interesting work that explores the possibility of a unified interpretation across multiple models.

09.02.2025 09:13 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
[Post image]

*MoE Graph Transformers for Interpretable Particle Collision Detection*
by @alessiodevoto.bsky.social @sgiagu.bsky.social et al.

We propose a Mixture-of-Experts (MoE) graph transformer for particle collision analysis, with many nice interpretability insights (e.g., expert specialization).

arxiv.org/abs/2501.03432

10.01.2025 14:12 πŸ‘ 12 πŸ” 4 πŸ’¬ 0 πŸ“Œ 0
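A toy sketch of the MoE routing idea behind this post, not the authors' architecture: each node embedding is routed to one of a few expert MLPs, and the routing statistics are what make expert specialization inspectable. Top-1 routing and all sizes below are assumptions.

```python
# Toy MoE layer with top-1 routing over node embeddings; sizes are assumptions.
import torch
import torch.nn as nn

d, n_experts, n_nodes = 32, 4, 100

router = nn.Linear(d, n_experts)
experts = nn.ModuleList([
    nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
    for _ in range(n_experts)
])

x = torch.randn(n_nodes, d)                         # node embeddings (e.g., detector hits)
gate = router(x).softmax(dim=-1)                    # routing probabilities
choice = gate.argmax(dim=-1)                        # top-1 expert per node

out = torch.zeros_like(x)
for e, expert in enumerate(experts):
    mask = choice == e
    if mask.any():
        out[mask] = expert(x[mask])                 # each expert handles its nodes

# Interpretability hook: how many nodes each expert handles.
print(torch.bincount(choice, minlength=n_experts))
```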