SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution arxiv.org/abs/2502.18449 by Yuxiang, Sida, and the whole team!
Get started with your favorite model here github.com/facebookrese...
Awesome work @kjain14.bsky.social!
Thrilled to announce our new work TestGenEval, a benchmark that measures unit test generation and test completion capabilities. This work was done in collaboration with the FAIR CodeGen team.
Preprint: arxiv.org/abs/2410.00752
Leaderboard: testgeneval.github.io/leaderboard....
That's a wrap #neurips2024
Don't transform the code, code the transform! By Chris Cummins at #neurips2024
Just gave a talk on "Grounding LLMs in Code Execution" at the NeurIPS Hacker-Cup AI Competition, here are the slides docs.google.com/presentation...
Gonna be at NeurIPS starting tomorrow afternoon. See you there, in particular if you want to talk about codegen and (post-)LLM research!
> Quality is free, but only to those willing to pay heavily for it.
> The major problems of our work are not so much technological as sociological in nature.
> Get the best people (cut out the deadwood), and make them happy. Turn them loose.
It's Sunday morning, so taking a minute for a nerdy thread (on math, tokenizers, and LLMs) about the work of our intern Garreth
By adding a few lines of code to the base Llama 3 tokenizer, he got a free boost in arithmetic performance 😮
[thread]
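The thread doesn't spell out the tokenizer change, but one well-known tweak of this kind is re-grouping long digit runs right-to-left before tokenization, so place values (units, thousands, ...) always land on consistent token boundaries. Purely as an illustrative sketch, not the actual change from the thread, the `group_digits_rtl` helper below is a hypothetical preprocessing step in that spirit:

```python
import re

def group_digits_rtl(text: str, group: int = 3) -> str:
    """Split every digit run into right-aligned groups of `group` digits,
    e.g. '12345' -> '12 345'. Grouping from the right keeps place-value
    boundaries consistent, unlike naive left-to-right BPE chunking."""
    def _split(m: re.Match) -> str:
        digits = m.group(0)
        parts = []
        # Walk from the right so only the leftmost group may be shorter.
        for i in range(len(digits), 0, -group):
            parts.append(digits[max(0, i - group):i])
        return " ".join(reversed(parts))
    return re.sub(r"\d+", _split, text)

print(group_digits_rtl("12345 + 678 = 13023"))  # -> 12 345 + 678 = 13 023
```

With a preprocessing pass like this, `12345` and `512345` share the trailing groups `345` (and `12` vs `512` differ only in the leading group), which is the kind of alignment that tends to help arithmetic.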