#mathreasoning

2 months ago

Falcon H1R 7B just crushed AIME 2025 with an 83.1% score—out‑reasoning models up to 7× its size. Can open‑source finally beat the big labs? Dive into the details. #FalconH1R7B #AIME2025 #MathReasoning

🔗 aidailypost.com/news/falcon-...

GetNews.me

@getnews-me.bsky.social

5 months ago

AdaR Framework Enhances Adaptive Math Reasoning in LLMs

Researchers introduced AdaR, a framework that trains LLMs on logically equivalent math prompts to boost robustness, with a paper submitted in October 2025. The code is open on GitHub. getnews.me/adar-framework-enhances-... #adar #llm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

VCSearch Boosts Detection of Ill-Defined Math Problems for LLMs

VCSearch boosts detection of unsolvable math problems by at least 12% and was released on 28 September 2025. The PMC benchmark holds over 5,000 ill‑defined questions. Read more: getnews.me/vcsearch-boosts-detectio... #vcsearch #mathreasoning #emnlp

GetNews.me

@getnews-me.bsky.social

5 months ago

Random Policy Valuation Boosts LLM Math Reasoning

Researchers introduced Random Policy Valuation for Diverse Reasoning (ROVER), which improves LLM math reasoning by 8.2 percentage points on pass@1 and 16.8 percentage points on pass@256, while boosting solution diversity. Read more: getnews.me/random-policy-valuation-... #llm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

Problem‑Aware Strategy Routing Boosts LLM Mathematical Reasoning

PRISM, a new framework for LLM math reasoning, adapts its strategy per problem and boosts benchmark accuracy by up to 7%. The code and MathStrat dataset are open‑source on GitHub. getnews.me/problem-aware-strategy-r... #prism #llm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

Future Policy Aware Preference Learning Boosts LLM Math Reasoning

Future Policy Aware (FPA) preference learning boosts LLM math performance, with SimPER plus FPA gaining up to 5.75% on MATH and GSM8K benchmarks, while adding minimal overhead. getnews.me/future-policy-aware-pref... #llm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

LLMs Learn Better from Incorrect Answers Without Explanations

An EMNLP 2025 paper reports that LLMs achieve better math‑reasoning accuracy when shown incorrect answers without explanations, surpassing chain‑of‑thought prompts; the gap widens with larger models. Read more: getnews.me/llms-learn-better-from-i... #llm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

Cross-Lingual Reward Modeling Boosts Multilingual LLM Math Reasoning

A cross‑lingual reward model scores multilingual math answers and beats same‑language baselines on a benchmark, even with few sampled candidates. getnews.me/cross-lingual-reward-mod... #multilingualllm #mathreasoning

GetNews.me

@getnews-me.bsky.social

5 months ago

False Positive Solutions Persist in Scaled Math Reasoning Models

A September 2025 study shows false-positive math solutions stay common across open-source models; scaling or sampling doesn't cut their rate, and pass@N often inflates performance. Read more: getnews.me/false-positive-solutions... #mathreasoning #ai

Jainil Prajapati

@enough-jainil.bsky.social

1 year ago

🚨 Microsoft just dropped Phi-4—a small yet mighty LLM excelling in advanced math reasoning! 🧮

🔹 High-quality results at a compact size
🔹 Open-source under the MIT license

The frontier for efficient AI is here. Ready to explore?

#AI #MathReasoning #Phi4