Our team is hiring a postdoc in (mechanistic) interpretability! The ideal candidate will have research experience in interpretability for text and/or image generation models and be excited about open science!
Please consider applying or sharing with colleagues: metacareers.com/jobs/2223953961352324
15.07.2025 20:11
Excited to share the results of my recent internship!
We ask:
What subtle shortcuts are VideoLLMs taking on spatio-temporal questions?
And how can we instead curate shortcut-robust examples at large scale?
We release: MVPBench
Details in the thread below.
13.06.2025 14:47
The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot...
It's not just great pedagogical support: it also presents a lot of unprecedented data and experiments, for the first time, in a systematic way.
19.02.2025 19:12
Excited to have two papers at #NAACL2025!
The first reveals how human over-reliance can be exacerbated by LLM friendliness. The second presents a novel computational method for concept tracing. Check them out!
arxiv.org/pdf/2407.07950
arxiv.org/pdf/2502.05704
19.02.2025 21:58
Congrats, nice and refreshing papers, especially the word-confusion idea! We need better similarity methods, so it's good to see progress on this front. Curious: does the confusion similarity depend on the label-set size of the classifier?
20.02.2025 12:38
Hello world! We're thrilled to announce the v0.4 release of fairseq2, an open-source library from FAIR powering many projects at Meta. pip install fairseq2 and explore our trainer API, instruction and preference finetuning (up to 70B), and native vLLM integration.
12.02.2025 12:31
Many, many congratulations!!
11.02.2025 01:40
Another factor that makes simple MLPs work is visual token length: if you care about shorter token sequences, you need a better mapper. These days most LLMs are capable of long context, which reduces the need to compress visual tokens.
02.02.2025 05:58
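The trade-off above can be illustrated with a minimal sketch, assuming NumPy and illustrative sizes (256 tokens of dimension 1024, not tied to any specific model): the simplest stand-in for a compressing mapper is average-pooling groups of neighboring visual tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

# 256 visual tokens of dimension 1024 (illustrative sizes).
visual_tokens = rng.standard_normal((256, 1024))

def compress(tokens: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool groups of `factor` neighboring tokens, shortening
    the visual sequence the LLM has to attend over."""
    n, d = tokens.shape
    assert n % factor == 0, "sequence length must divide evenly"
    return tokens.reshape(n // factor, factor, d).mean(axis=1)

short_seq = compress(visual_tokens, factor=4)
print(short_seq.shape)  # (64, 1024): 4x fewer visual tokens
```

With long-context LLMs, this compression step can often be skipped entirely, which is exactly why a plain (non-compressing) mapper suffices.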
One hypothesis for why simple mappers work: (1) unfreezing the LLM provides enough parameters for the mapping, and (2) richer vision representations are already closer to the LLM's internal latent space. arxiv.org/abs/2405.07987
02.02.2025 05:58
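A "simple mapper" of the kind discussed here can be sketched as a two-layer MLP that projects vision-encoder features into the LLM's embedding dimension. This is a minimal NumPy illustration with made-up sizes (1024-dim vision features, 4096-dim LLM embeddings), not any particular model's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

d_vision, hidden, d_llm = 1024, 2048, 4096  # illustrative sizes

# Two-layer MLP projector: vision features -> LLM token embedding space.
W1 = rng.standard_normal((d_vision, hidden)) * 0.02
b1 = np.zeros(hidden)
W2 = rng.standard_normal((hidden, d_llm)) * 0.02
b2 = np.zeros(d_llm)

def project(vision_tokens: np.ndarray) -> np.ndarray:
    """Map (n_tokens, d_vision) vision features to (n_tokens, d_llm)."""
    h = np.maximum(vision_tokens @ W1 + b1, 0.0)  # ReLU for simplicity
    return h @ W2 + b2

patches = rng.standard_normal((256, d_vision))  # e.g. 256 visual tokens
llm_inputs = project(patches)
print(llm_inputs.shape)  # (256, 4096)
```

The projected sequence is simply concatenated with the text token embeddings; when the LLM is unfrozen during alignment training, even this small mapper tends to be enough.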
Good questions! From what I see, some folks still use complex mappers like Perceivers, but often a simple MLP works well enough. The variable that yields the biggest improvement is almost always the alignment data.
02.02.2025 05:58
This is actually a cool result: token length as a rough heuristic for model confidence?
31.01.2025 22:26
I am shocked by the death of Felix Hill. He was one of the brightest minds of my generation.
His last blog post on the stress of working in AI is very poignant. Apart from the emptiness of working mostly to make billionaires even richer, there's the intellectual emptiness of 'scale is all you need'.
14.01.2025 12:41
Lots of cool findings in our paper as well as on the website: tsb0601.github.io/metamorph/
Excited to see how the community "MetaMorph"'s existing LLMs!
26.12.2024 20:02
We posted our paper on arXiv recently; sharing it here too: arxiv.org/abs/2412.141... Work led by our amazing intern Peter Tong. Key findings:
- LLMs can be trained to generate visual embeddings!!
- VQA data appears to help a lot in generation!
- Better understanding = better generation!
26.12.2024 20:01
I wonder if veo-2 would be better at these prompts!
17.12.2024 20:49
MLRC 2025
Machine Learning Reproducibility Challenge
Co-organized by @randomwalker.bsky.social, @peterhenderson.bsky.social, @in4dmatics.bsky.social, Naila Murray, @adinawilliams.bsky.social, Angela Fan, Mike Rabbat, and Joelle Pineau. Check out our website for the CFP and more details: reproml.org
13.12.2024 19:06
We are pleased to announce the first in-person event for the Machine Learning Reproducibility Challenge, MLRC 2025! Save the date: August 21st, 2025, at Princeton!
13.12.2024 19:06
Our paper PRISM alignment won a best paper award at #neurips2024!
All credits to @hannahrosekirk.bsky.social, A. Whitefield, P. Röttger, A. M. Bean, K. Margatina, R. Mosquera-Gomez, J. Ciro, @maxbartolo.bsky.social, H. He, B. Vidgen, and S. Hale.
Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804
11.12.2024 16:20
Also, MLRC is now on Bluesky as well; do follow! :) @reproml.org
10.12.2024 16:53
Check out the MLRC 2023 posters at #NeurIPS 2024 this week: reproml.org/proceedings/. Do drop by these posters and say hi!
10.12.2024 16:15
The return of the Autoregressive Image Model: AIMv2 now going multimodal.
Excellent work by @alaaelnouby.bsky.social & team with code and checkpoints already up:
arxiv.org/abs/2411.14402
22.11.2024 09:44
Yes, IMO that is one of the most exciting outcomes of this direction: learning a new modality with much less compute. We have some really nice results and can't wait to share them with everyone. Stay tuned!
21.11.2024 05:47
Writing a good scientific paper
For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long, but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...
20.11.2024 10:18
Hello! :)
20.11.2024 21:52
How do LLMs learn to reason from data? Are they just retrieving the answers from parametric knowledge? In our new preprint, we look at the pretraining data and find evidence against this:
Procedural knowledge in pretraining drives LLM reasoning
Thread below.
20.11.2024 16:31
When I first read this paper, I instinctively scoffed at the idea. But the more I look at empirical results, the more I'm convinced this paper highlights something fundamentally amazing. Lots of exciting research in this direction will come very soon!
arxiv.org/abs/2405.07987
20.11.2024 00:29
All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc
19.11.2024 03:48
Doing good science is 90% finding a science buddy to constantly talk to about the project.
09.11.2024 22:53
Same here! Let's make a club!
17.11.2024 17:08