Join me and @mariusmosbach.bsky.social to chat about our work on frequency effects in unlearning, and how @ai2.bsky.social's Olmo helped us gain key insights.
💬 AMA: Tue, Oct 28, 8:00 PT / 16:00 CEST
💡 Bring your questions!
discord.gg/ai2
26.10.2025 16:12
We're presenting "Not all data are unlearned equally" at #COLM2025!
We show that data properties shape how LLMs forget; stop by to chat more!
Wednesday, Oct 8
4:30–6:30 pm
poster #710 (session 4)
paper: arxiv.org/abs/2504.05058
Work with @mariusmosbach.bsky.social @sivareddyg.bsky.social
05.10.2025 15:55
very happy to see the trend of a Behind the Scenes section catching on! transparent & honest science
love the detailed montreal spots mentioned
consider including such a section in your next appendix!
(paper by @a-krishnan.bsky.social arxiv.org/pdf/2504.050...)
13.08.2025 12:19
Our new paper in #PNAS (bit.ly/4fcWfma) presents a surprising finding: when words change meaning, older speakers rapidly adopt the new usage; inter-generational differences are often minor.
w/ Michelle Yang, @sivareddyg.bsky.social, @msonderegger.bsky.social and @dallascard.bsky.social (1/12)
29.07.2025 12:05
📢 #SpeechTech & #SpeechScience researchers!
We are thrilled to announce that Prof. Karen Livescu (Toyota Technological Institute at Chicago) will keynote our Special Session on Interpretable Audio and Speech Models at #Interspeech2025:
"What can interpretability do for us (and what can it not)?"
Aug 18, 11:00
@interspeech.bsky.social
30.07.2025 18:25
Cool work! See you @interspeech.bsky.social
27.05.2025 13:55
I am excited to announce that my paper "On the reliability of feature attribution methods for speech classification" has been accepted to #Interspeech2025!
Co-authors: @hmohebbi.bsky.social, Arianna Bisazza, Afra Alishahi, @grzegorz.chrupala.me
Find the preprint here: arxiv.org/abs/2505.16406
26.05.2025 08:21
[Image: Title slide reading "Processing Trans Languaging", Vagrant Gautam (they/xe), Saarland University, with a very brightly patterned background featuring colourful people and math symbols.]
Come to my keynote tomorrow at the first official @queerinai.com workshop at #NAACL2025 to hear about how trans languaging is complex and cool, and how this makes it extra difficult to process computationally. I will have SO many juicy examples!
03.05.2025 20:52
Chain-of-Thought (CoT) reasoning lets LLMs solve complex tasks, but long CoTs are expensive. How short can they be while still working? Our new ICML paper tackles this foundational question.
05.05.2025 12:25
A must-read for anyone in NLP right now
01.05.2025 16:00
Congratulations to Mila members @adadtur.bsky.social , Gaurav Kamath and @sivareddyg.bsky.social for their SAC award at NAACL! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670
01.05.2025 14:30
Incredibly proud of my students @adadtur.bsky.social and Gaurav Kamath for winning a SAC award at #NAACL2025 for their work on assessing how LLMs model constituent shifts.
01.05.2025 15:11
💡 New ICLR paper! 💡
"On Linear Representations and Pretraining Data Frequency in Language Models":
We provide an explanation for when & why linear representations form in large (or small) language models.
Led by @jackmerullo.bsky.social, w/ @nlpnoah.bsky.social & @sarah-nlp.bsky.social
25.04.2025 01:55
The intern (after loads of feedback)
17.04.2025 12:40
DeepSeek-R1 Thoughtology: Let's <think> about LLM reasoning
142-page report diving into the reasoning chains of R1. It spans 9 unique axes: safety, world modeling, faithfulness, long context, etc.
Now on arxiv: arxiv.org/abs/2504.07128
12.04.2025 16:11
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
We are releasing the first benchmark to measure how well automatic evaluators, such as LLM judges, assess web agent trajectories.
15.04.2025 19:10
Check out Benno's notes about our impact-of-interpretability paper.
Also, we are organizing a workshop at #ICML2025 inspired by some of the questions discussed in the paper: actionable-interpretability.github.io
15.04.2025 23:11
[Image: Diagram illustrating a hypothesis about knowledge unlearning in language models. The left side shows a training corpus with varying frequencies of facts, such as 'Montreal is a city in Quebec' (high frequency) and 'Atlantis is a city in the ocean' (lower frequency). The center shows a language model being trained on this data, then undergoing unlearning. The right side demonstrates the 'Forget Quality' results, where the model more effectively unlearns the less frequent fact ('Atlantis is in Greece') while retaining the more frequent knowledge. Labels A, B, and C mark key points in the hypothesis: A (frequency variations in training data), B (influence of frequency), and C (unlearning effectiveness).]
Check out our new paper on unlearning for LLMs. We show that *not all data are unlearned equally* and argue that future work on LLM unlearning should take the properties of the data to be unlearned into account. This work was led by my intern @a-krishnan.bsky.social
paper: arxiv.org/abs/2504.05058
09.04.2025 13:30
📢 Excited to announce our upcoming workshop - Vision Language Models For All: Building Geo-Diverse and Culturally Aware Vision-Language Models (VLMs-4-All) @CVPR 2025!
sites.google.com/view/vlms4all
14.03.2025 15:55
Agents like OpenAI Operator can solve complex computer tasks, but what happens when people use them to cause harm, e.g. to spread misinformation?
To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web agents to complete harmful web tasks. A thread below.
10.03.2025 17:45
📢 #SpeechTech & #SpeechScience researchers!
⏳ Reminder: the #Interspeech2025 deadline is approaching! If your work focuses on interpretability in speech & audio, submit through our Special Session and showcase your research!
#Interpretability @interspeech.bsky.social
01.02.2025 09:28
Hello, could I be added to the list?
06.12.2024 20:56