Manually creating dialogue benchmarks costs thousands of dollars, but our Chatty-Gen automates it with #RAG & LLM-as-a-Judge to deliver human-level quality for <$1! Our modular approach could automate different synthetic benchmarks & enable LLMs as domain experts for data science & security. More to come!
04.02.2025 07:16
Our paper Chatty-Gen has been accepted at SIGMOD 2025! Huge thanks to my students Reham Omar & Omij Mangukiya! Our approach prevents costly errors & works across LLMs like GPT-4, Gemini, Llama-3 & Mistral. We combined Llama-3 & CodeLlama2 to match GPT-4o! Read more: arxiv.org/abs/2501.09928 #SIGMOD2025
04.02.2025 07:16