EFRame Enhances LLM Reasoning via Exploration-Filter-Replay RL
EFRame adds exploration rollouts, online filtering, and experience replay to Group Relative Policy Optimization (GRPO), delivering a 37.9% relative gain over baseline GRPO on the Geometry3K benchmark. Read more: getnews.me/eframe-enhances-llm-reas... #llm #eframe