
Saumya Malik

@saumyamalik

Predoc at Ai2 | prev. Princeton CS '24

58 Followers · 8 Following · 7 Posts · Joined 21.11.2024

Latest posts by Saumya Malik @saumyamalik

Reward Bench 2 - an allenai Collection: datasets, spaces, and models for the RewardBench 2 benchmark and paper!

Thank you to co-authors @natolambert.bsky.social, @valentinapy.bsky.social, @jacobcares.bsky.social, Sander Land, @nlpnoah.bsky.social, @hanna-nlp.bsky.social!
Read more in the paper here (ArXiv soon!): github.com/allenai/rewa...
Dataset, leaderboard, and models here: huggingface.co/collections/...

02.06.2025 23:41 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

Interestingly, we find that RLHF performance degrades if the lineages of the reward model and policy model don't match 🤔 So, instead of simply taking the top model on RewardBench 2 off the shelf, one should take that model's training recipe and integrate it into one's own RLHF workflow

02.06.2025 23:41 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We find that RewardBench 2 is highly correlated with downstream performance when RMs are used at inference time for Best-of-N selection, and it also provides a helpful signal of downstream performance in RLHF 🔥
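For readers unfamiliar with the setup, Best-of-N selection can be sketched in a few lines: sample N completions from a policy, score each with the reward model, and keep the highest-scoring one. This is a minimal illustration, not the paper's actual evaluation code; `reward_fn` is a hypothetical stand-in for any scorer mapping (prompt, completion) to a scalar reward.

```python
def best_of_n(prompt, completions, reward_fn):
    """Return the completion the reward model scores highest (Best-of-N)."""
    return max(completions, key=lambda c: reward_fn(prompt, c))

# Toy usage with a dummy reward function that simply prefers longer answers:
completions = ["short", "a medium answer", "a much longer, detailed answer"]
best = best_of_n("What is RLHF?", completions, lambda p, c: len(c))
```

In practice the reward function would be a forward pass through a trained RM; the selection logic itself stays this simple.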

02.06.2025 23:41 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We trained and released 70 reward models to study their performance on RB2 and in downstream applications like inference-time Best-of-N sampling and RLHF training. Even top RMs still have plenty of room to improve on RB2, particularly in Precise Instruction Following and Math.

02.06.2025 23:41 πŸ‘ 2 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

RewardBench 2 spans six domains, sources new human prompts, and carefully constructs and combines completions to build a best-of-4 dataset. Using fresh prompts is an important step in making reward model evaluation independent of downstream evaluations.
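The best-of-4 format implies a natural accuracy metric: the RM is correct on an item only if it scores the one chosen completion above all three rejected ones. A minimal sketch of that scoring loop, assuming an illustrative item schema (`prompt`, `chosen`, `rejected`) rather than the dataset's actual field names:

```python
def best_of_4_accuracy(items, reward_fn):
    """Fraction of items where the RM ranks the chosen completion
    strictly above all rejected completions."""
    correct = 0
    for item in items:
        chosen_score = reward_fn(item["prompt"], item["chosen"])
        rejected_scores = [reward_fn(item["prompt"], r) for r in item["rejected"]]
        correct += chosen_score > max(rejected_scores)
    return correct / len(items)
```

Note that random guessing scores 25% under this metric, versus 50% for the pairwise format used in the original RewardBench, which is one reason the new benchmark is harder.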

02.06.2025 23:41 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

I'm thrilled to share RewardBench 2 📊! We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!

02.06.2025 23:41 πŸ‘ 22 πŸ” 6 πŸ’¬ 2 πŸ“Œ 1

I'm having a great time as a PYI (Predoctoral Young Investigator) at Ai2! Definitely consider applying for this great program :)

04.12.2024 07:51 πŸ‘ 3 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0