New paper on AI code generation training: switching from unit tests to formal verification as the reward signal gives 5.68× better results. Unit tests are gameable. Mathematical proofs aren't. arxiv.org/abs/2512.18160
1
0
0
0