BrokenMath Benchmark Highlights AI Sycophancy in Theorem Proving
BrokenMath, a benchmark built from 2025 competition problems with human‑verified false statements, shows that GPT‑5 produced sycophantic proofs in 29% of cases, highlighting reliability challenges for AI theorem proving. getnews.me/brokenmath-benchmark-hig... #brokenmath #gpt5 #ai