Please share it within your circles! edin.ac/3DDQK1o
Really cool paper by @kayoyin.bsky.social on the interpretability of in-context learning (ICL): they find that Function Vector (FV) heads are crucial for few-shot ICL.
arxiv.org/abs/2502.14010
A really nice resource for understanding how to parallelize LLM training.
3/ Interleaving Concepts with Token Embeddings
🔹 Predicted concepts are compressed into a continuous vector 🎯
🔹 They are then inserted into the hidden states alongside token embeddings 🧩
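A rough sketch of this compress-and-insert step in PyTorch. The dimensions, the softmax-then-linear mixing, and appending the concept vector at the end of the sequence are all simplifying assumptions for illustration, not the paper's exact mechanism:

```python
import torch

torch.manual_seed(0)

d_model, n_concepts, seq = 16, 64, 5  # illustrative sizes

# Predicted concept probabilities are compressed into a single continuous
# vector via a learned projection (a simplification of the mixing step).
concept_probs = torch.softmax(torch.randn(1, n_concepts), dim=-1)
project = torch.nn.Linear(n_concepts, d_model)
concept_vec = project(concept_probs)  # shape (1, d_model)

# Interleave: insert the concept vector into the sequence of hidden states,
# so later layers process tokens and concepts together.
hidden = torch.randn(seq, d_model)
hidden = torch.cat([hidden, concept_vec], dim=0)  # now seq + 1 positions
```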
2/ Training the Model with Dual Objectives
🔹 Next-token prediction – the standard LLM training objective.
🔹 Concept prediction – the model learns to reproduce extracted concepts from its hidden state.
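A minimal sketch of such a dual objective, assuming a multi-hot concept target and an illustrative 0.1 loss weighting (both are my assumptions, not taken from the paper):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

vocab, n_concepts, batch = 100, 64, 8  # illustrative sizes

# Dummy model outputs: token logits and concept logits from the hidden state.
token_logits = torch.randn(batch, vocab, requires_grad=True)
concept_logits = torch.randn(batch, n_concepts, requires_grad=True)

# Targets: next tokens, and a multi-hot vector of extracted concepts.
next_tokens = torch.randint(0, vocab, (batch,))
concept_targets = (torch.rand(batch, n_concepts) < 0.1).float()

# Dual objective: standard next-token cross-entropy plus a concept
# prediction term (binary cross-entropy here; the weighting is a guess).
lm_loss = F.cross_entropy(token_logits, next_tokens)
concept_loss = F.binary_cross_entropy_with_logits(concept_logits, concept_targets)
loss = lm_loss + 0.1 * concept_loss
loss.backward()  # gradients flow through both objectives
```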
1/ Concept Extraction with SAE
🔹 A Sparse Autoencoder (SAE) extracts high-level concepts from the hidden states of a pretrained LLM.
🔹 Only the most important concepts are selected, based on their attribution score (impact on model output).
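The extraction step might look roughly like this; the sizes, the ReLU encoder, and using activation magnitude as a stand-in for the attribution score are all illustrative assumptions, not the paper's exact recipe:

```python
import torch

torch.manual_seed(0)

d_model, n_concepts = 16, 64  # illustrative hidden size and SAE dictionary size

# A sparse autoencoder: the encoder maps hidden states to concept
# activations; the decoder reconstructs the hidden state from them.
enc = torch.nn.Linear(d_model, n_concepts)
dec = torch.nn.Linear(n_concepts, d_model)

h = torch.randn(1, d_model)        # a hidden state from the pretrained LLM
acts = torch.relu(enc(h))          # sparse, non-negative concept activations
recon = dec(acts)                  # reconstruction used to train the SAE

# Stand-in attribution score: activation magnitude as a proxy for each
# concept's impact on the output (the real score uses output attribution).
k = 4
scores = acts.abs().squeeze(0)
top_concepts = torch.topk(scores, k).indices  # keep the k most important concepts
```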
🚀 Meta's new LLM pretraining framework predicts concepts and integrates them into its hidden state to enhance next-token prediction. 🚀
It achieves the same performance with 21.5% fewer tokens and better generalization! 🎯
📄: arxiv.org/abs/2502.08524
A very interesting work that explores the possibility of a unified interpretation across multiple models.
*MoE Graph Transformers for Interpretable Particle Collision Detection*
by @alessiodevoto.bsky.social @sgiagu.bsky.social et al.
We propose a MoE graph transformer for particle collision analysis, with many nice interpretability insights (e.g., expert specialization).
arxiv.org/abs/2501.03432