I’ll be in Singapore attending ICLR2025. Looking forward to chatting in person about model post-training, alignment and reasoning! ✈️🇸🇬
New base models from NVIDIA - Nemotron-H: mamba-transformer hybrids are now on @hf.co hub huggingface.co/collections/...
New paper from our team. An inference-time scaling approach which can boost the performance of existing models on non-math benchmarks such as Arena-Hard. We get Arena-Hard of 92.7 for a 70B model. As of 5 Mar 2025, this surpasses o1-preview-2024-09-12 (90.4) and DS-R1 (92.3). arxiv.org/pdf/2503.04378
My favorite AI conference, GTC, is coming back to San Jose, California on March 17-21! Join us and thousands of other developers and innovators. This link gives you 25% off your conference pass www.nvidia.com/gtc/?ncid=GT...
Our team put together a unified mathematical framework to analyze popular model alignment algorithms. “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment” arxiv.org/pdf/2502.00203.
pretty sure Apple’s Tim Cook pledged publicly (on twitter) that they’ll donate to LA fires support and recovery efforts
“winner takes all” is also the most dangerous scenario from a safety perspective. open source ecosystem is a great antidote to monopoly or duopoly scenarios.
this year the timing and the conference were both great. thank you!