How can we tell when LLMs lie? We designed a domain-agnostic solution to evaluate the correctness of a Dutch LLM-based chatbot at AFAS Software, using limited data. Inspired by human decision-making, the solution is projected to save 15,000 hours per year.
Preprint: arxiv.org/abs/2411.00034
13.11.2025 20:15
In this picture, I am receiving the award from Prof. Audris Mockus and the MSR 2025 core organizing team, Prof. Bram Adams and Dr. Olga Baysal.
Honored to receive the MSR Rich Holt Early Career Achievement Award 2025 in Ottawa 🇨🇦, recognising my contributions impacting software companies, OSS communities, and society.
Huge thanks to my mentors, collaborators, students, friends, and family. Couldn't have done it without you! @rug.nl
12.05.2025 15:28