For folks considering grad school in ML, my advice is to explore programs that mix ML with a domain interest. ML programs are wildly oversubscribed while a lot of the fun right now is in figuring out what you can do with it
So, what *is* the @ecir2026.eu Information Retrieval for Good track? by Maria Heuss and Bhaskar Mitra:
https://bhaskar-mitra.github.io/posts/2025/09/01/what-is-ir-for-good/
Super important paper and what a nice interdisciplinary group of co-authors!!!
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation". We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks. For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations. Then, we collect 13 million LLM annotations across plausible LLM configurations. These annotations feed into 1.4 million regressions testing the hypotheses. For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions. Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking, i.e., incorrect conclusions due to annotation errors. Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models. Since minor configuration changes can flip scientific conclusions from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.
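To build intuition for the mechanism, here is a toy simulation in my own words (not the paper's code; the function name `pvalue_under_config` and all parameters are made up for illustration). There is no true relation between a covariate and the labels, but if an annotator's errors correlate with the covariate, a regression on the annotated labels can turn significant:

```python
import numpy as np
from scipy import stats

def pvalue_under_config(bias: float, n: int = 2000, seed: int = 0) -> float:
    """p-value of regressing annotated labels on covariate x, for one
    simulated 'annotator configuration' whose false positives scale with x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                 # covariate the researcher studies
    y_true = rng.binomial(1, 0.5, size=n)  # ground truth: independent of x
    # one-sided annotation errors: spurious positives, more likely for high x
    p_fp = 1.0 / (1.0 + np.exp(-(bias * x - 2.0)))
    y_hat = np.maximum(y_true, rng.binomial(1, p_fp))
    return stats.linregress(x, y_hat).pvalue

print(pvalue_under_config(bias=0.0))  # errors unrelated to x: null usually holds
print(pvalue_under_config(bias=1.5))  # errors track x: spurious significance
```

The point of the sketch: nothing about the hypothesis changed, only the error structure of the annotations, yet the biased "configuration" yields a tiny p-value on a true null.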
🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.
Paper: arxiv.org/pdf/2509.08825
Curious about my PhD research?
Watch a 10-min talk + my defense: lnkd.in/ej_MWDtt
Read the dissertation: lnkd.in/efBW97WB
Or read the short news article: lnkd.in/eizZg5VN
Amazing co-authors broadened my perspective and made me a better scientist. Thank you so much for that!
Also to my doctoral committee: @damiantrilling.net, Annette Hautli-Janisz, Reshmi G. Pillai, @Khalid Al Khatib & Antal van den Bosch: thank you for your thoughtful (and fun!) questions.
And huge thanks to my incredible paranymphs @urjakh.bsky.social and Selene Baez Santamaria. From Zoom rooms to the stage, our journey has been full of growth, laughter, and mutual support.
In fact, all PhDs from @cltl.bsky.social were a great community of support.
Last week, I defended my dissertation "A Puzzle of Perspectives: Interdisciplinary Language Technology for Responsible News Recommendation" at the Vrije Universiteit Amsterdam. *the* moment: #PhDone!
I couldn't have asked for better supervisors than Antske Fokkens & @suzanv.bsky.social
It's the final countdown (I am re-reading my dissertation for my defense next week), and I realized I had some fun findings hidden in some papers that I myself had forgotten about! I don't know if that's a good or a bad sign for my defense...
But then working as a (university) researcher also comes with a lot of downsides, including insecurity and pressure in random "which grant or paper wins" arenas, which I do not vibe well with.
But what then? What do?
Btw I'm serious about this career change comment.
I'm having a sort of post-PhD career reflection, where I realize that these kinds of things don't spark joy for me but seem to be a big part of being an AI dev in industry.
I mean, I have heard people say they enjoy the puzzling aspect and feel accomplished when they fix it.
Personally, for me that never weighs up against the annoyance and what feels like endless wasted time.
Also, I realize some people really love the "puzzle" aspect, but I don't like this kind of puzzle. It makes me stressed and annoyed. Maybe I should find another field to work in.
I also really hate it when people who do not work in NLP/LLMs say "oh no, but with conda and a requirements.txt it's easy, right?", not realizing the morass of ever-new models and architectures I live in.
Realization: I really, really, really hate the part of my job that is managing conda environments and digging through a deep, deep cave of issue reports trying to find out why something randomly doesn't work.
Chatbots (LLMs) do not know facts and are not designed to accurately answer factual questions. They are designed to find and mimic patterns of words, probabilistically. When they're "right", it's because correct things are often written down, so those patterns are frequent. That's all.
Deadline approaching! Workshop on Computational Linguistics for the Political and Social Sciences at #KONVENS2025: archival long and short papers (ACL Anthology) & non-archival abstracts and PhD project descriptions (get feedback from a great community!). Deadline: June 13th.
My love language is sending my academic friends the papers, datasets, and social media posts that I know align with their research interests.
GESIS Workshop "Adapters: Lightweight Machine Learning for Social Science Research", 02 to 04 June 2025 | Hybrid (Cologne | Online). With Julia Romberg, Vigneshwaran Shankaran, and Maximilian Maurer (all GESIS).
Unlock the power of large language models for your research!
Join this #GESISworkshop with Julia Romberg, @vigneshwaran-s.bsky.social, and @mmmaurer.bsky.social to explore adapters, an efficient alternative to full fine-tuning of your models.
Book now: t1p.de/adapters-lig...
@gesis.org
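For intuition on why adapters are lightweight: a bottleneck adapter inserts a small down-project/up-project module with a residual connection into a frozen pretrained model, and only the adapter's parameters are trained. A minimal sketch (my illustration with assumed dimensions, not the workshop's material):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    plus a residual connection. Only these few parameters are trained;
    the surrounding pretrained model stays frozen."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # residual connection keeps the pretrained representation intact
        return h + self.up(self.act(self.down(h)))

adapter = Adapter(hidden_dim=768)
trainable = sum(p.numel() for p in adapter.parameters())
print(trainable)  # 99136 parameters, vs. ~110M for a BERT-base encoder
```

With a hidden size of 768 and bottleneck of 64, one adapter adds under 100k trainable parameters, which is why a stack of them is so much cheaper than updating the full model.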
While I am not at #NAACL, I gave a talk about this paper (and more work from my dissertation) last Friday at @annarogers.bsky.social's lab. Very nice discussion there!
Paper: lnkd.in/eBBSi6_p
Code: lnkd.in/ezwRGpjP
Slides: lnkd.in/erPP5fpV
Want to know more? Message me!
We find that:
- Experts use different strategies to assess the LLM;
- Surprisingly, longer and more nuanced definitions of sexism are developed via LLM-human collaboration;
- Some experts improve zero-shot performance with their improved definition.
#NLProc #CSS #computationalsocialscience
Our study consisted of four components:
1) a survey of sexism researchers;
2) & 3) two interactive experiments on expert-LLM interaction: 2) assessing the LLM and 3) co-creating sexism definitions with the LLM;
4) using these definitions in zero-shot detection with LLMs on five sexism datasets.
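For readers curious what step 4 looks like in practice, definition-conditioned zero-shot prompting can be sketched roughly like this (a hypothetical illustration; the prompt wording and helper names are mine, not the paper's actual setup):

```python
# Hypothetical sketch of definition-conditioned zero-shot classification:
# the expert's definition is inserted into the prompt, and the model's
# free-text answer is mapped back to a binary label.
def build_prompt(definition: str, text: str) -> str:
    return (
        "You are annotating social media posts for sexism.\n"
        f"Definition of sexism: {definition}\n"
        f"Post: {text}\n"
        "Answer with exactly one word, 'sexist' or 'not-sexist':"
    )

def parse_label(model_output: str) -> int:
    # map the model's answer to a binary label (1 = sexist)
    return int("not" not in model_output.strip().lower())

prompt = build_prompt(
    "Sexism is prejudice or discrimination based on a person's gender.",
    "Example post text goes here.",
)
print(parse_label("sexist"), parse_label("Not-sexist"))
```

Because the definition is a free slot in the prompt, swapping in each expert's co-created definition and re-running the same loop is what lets one compare definitions by downstream classification performance.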
This work was the outcome of my Junior Research Visit grant at @gesis.org last year, and it is the final chapter of my dissertation!
Our method allowed us to measure connections between experts, sexism definitions, datasets, & classification performance in zero-shot sexism classification.
A visual description of how our expert survey led to two interactive experiments and finally to definitions that were used in zero-shot sexism detection.
Expert + LLM = Better Sexism Detection?
Paper:
"Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection"
w/ @indiiigo.bsky.social, @matteo-mls.bsky.social & @gabriellalapesa.bsky.social
@ Findings of #NAACL2025!
Oh it is super common in Amsterdam! I see it all the time.
And I have even seen it in Mexico, so it is definitely a worldwide phenomenon, an international vibe-working trend.
I am now doing a lot of stuff locally on my M1 Mac, and while it is an interesting challenge, it also has very obvious limitations.
🚨 Deadline Extended! 🚨
We've extended the submission deadline to Friday, April 18, 2025 (AoE)!
Please share widely!
www.workshopononlineabuse.com/cfp.html
ACL Rolling Review and the EMNLP PCs are seeking input on the current state of reviewing for *CL conferences. We would love to get your feedback on the current process and how it could be improved. To contribute your ideas and opinions, please follow this link! forms.office.com/r/P68uvwXYqfemn
Join us at the VU Amsterdam's Master's Event, Saturday, March 8, 10:30-15:00!
Learn about our two Master's programs in Linguistics from faculty and students: Language and AI (1 year) and Human Language Technology (2 years).
Programs: home.cltl.labs.vu.nl
Location & details: vu.nl/en/education...