Life update: I've joined the Max Planck Institute for Software Systems as a research fellow (pre-PhD), working with
@lasha.bsky.social on factuality and nuanced forms of misinformation.
Cheers from Germany!
🚨 New preprint!
One idea, many ways to say it. But does your brain track those options while you speak?
Using LLMs, we put this to the test.
www.biorxiv.org/content/10.1...
We show for the 1st time that the brain represents multiple alternatives simultaneously in both listening and speaking.
🧵
There's a lot of talk about regulating AI, but do regulators know the technology well enough?
In our new paper, we survey major reg efforts & find they rely on benchmarking, which we know to be problematic. How did this happen & what can we do about it?
arxiv.org/pdf/2501.15693
New preprint! ✨
Interested in LLM-as-a-Judge?
Want to get the best judge for ranking your system?
Our new work is just for you:
"JuStRank: Benchmarking LLM Judges for System Ranking"
arxiv.org/abs/2412.09569