Mila Gorecki (@milago)

Meet me at the Benchmarking workshop (sites.google.com/view/benchma...) at EurIPS on Saturday: we’ll present two works on errors in LLM-as-Judge and their impact on benchmarking and test-time scaling:

05.12.2025 08:57 👍 7 🔁 3 💬 1 📌 0

At #NeurIPS in San Diego this week? Interested in XAI, causality, or performative prediction? Come visit our poster!

💬 Performative Validity of Recourse Explanations
📆 Wednesday, 4:30 pm, Poster Session 2
w/ Hidde Fokkema, Timo Freiesleben, Celestine Mendler-Dünner, Ulrike von Luxburg

02.12.2025 18:17 👍 11 🔁 3 💬 0 📌 0

Attending #Neurips2025? Get your personalized Scholar Inbox conference program now to easily navigate the poster sessions and find what you are looking for:
www.scholar-inbox.com/conference/n...

02.12.2025 06:37 👍 34 🔁 12 💬 0 📌 0

I'll be @neuripsconf.bsky.social presenting Strategic Hypothesis Testing (spotlight!)

tldr: Many high-stakes decisions (e.g., drug approval) rely on p-values, but people submitting evidence respond strategically even w/o p-hacking. Can we characterize this behavior & how policy shapes it?

1/n

01.12.2025 20:31 👍 17 🔁 4 💬 1 📌 0

The empirical landscape sits between the two extremes.

- Model similarity is high, yet disagreements let individuals find recourse by switching models.

- Systemic exclusion is rare, yet more likely than under strong multiplicity.

- Even in a single model, prompt variations induce multiplicity.
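To make these quantities concrete, here is a minimal sketch on toy data. It is not the paper's evaluation code: the function names, the binary accept/reject encoding, and the three example models are all assumptions chosen for illustration, and the paper may define similarity, recourse, and exclusion differently.

```python
from itertools import combinations

# Hypothetical setup: each model outputs a binary decision per individual
# (1 = accept, 0 = reject). All names and data below are illustrative.

def pairwise_agreement(preds):
    """Mean fraction of individuals on which two models give the same decision."""
    n = len(preds[0])
    pairs = list(combinations(preds, 2))
    return sum(sum(a == b for a, b in zip(p, q)) / n for p, q in pairs) / len(pairs)

def systemic_exclusion_rate(preds, negative=0):
    """Fraction of individuals rejected by every model."""
    per_person = list(zip(*preds))  # one tuple of decisions per individual
    return sum(all(d == negative for d in ds) for ds in per_person) / len(per_person)

def recourse_by_switching_rate(preds, negative=0):
    """Fraction of individuals rejected by some model but accepted by another."""
    per_person = list(zip(*preds))
    return sum(
        any(d == negative for d in ds) and any(d != negative for d in ds)
        for ds in per_person
    ) / len(per_person)

# Three toy models, five individuals.
preds = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]

agreement = pairwise_agreement(preds)
excluded = systemic_exclusion_rate(preds)
recourse = recourse_by_switching_rate(preds)
print(f"agreement={agreement:.3f} excluded={excluded:.2f} recourse={recourse:.2f}")
```

In this toy example the models agree on most individuals, yet two of the five could flip a rejection by switching models, and two are rejected by every model.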

02.12.2025 15:57 👍 3 🔁 0 💬 0 📌 0

We evaluate 50 LLMs (various sizes & providers) across 6 tasks to assess how well each narrative fits the current LLM landscape, assuming that decision makers will increasingly rely on these models for consequential predictions.

02.12.2025 15:57 👍 1 🔁 0 💬 1 📌 0

There are two narratives about model ecosystems that grew out of the algorithmic fairness debate:

1. Monoculture: models converge toward homogeneity.

2. Multiplicity: many models solve tasks similarly but disagree on individual predictions, creating outcome variation.

02.12.2025 15:57 👍 0 🔁 0 💬 1 📌 0

Excited to be at #Neurips2025 this week to present our paper "Monoculture or Multiplicity: Which is it?", joint work with Moritz Hardt.

📄 Paper #1000: openreview.net/pdf?id=DO5Lt...
📍 Wed, Dec 3, 2025 • 4:30 PM – 7:30 PM

Feel free to come by and reach out!

A short 🧵.

02.12.2025 15:55 👍 16 🔁 4 💬 1 📌 0
