Katrina Drozdov (Evtimova) (@stochasticdoggo)

9 Lessons I Learned while Doing RL Post-Training for LLMs
I recently had the chance to experiment with post-training techniques for large language models, a space that has become central to making LLMs useful and controllable in real-world applications. I us...

Finally dipped my toes into RL post-training. I trained a code generation LLM with GRPO using open-r1. Here are my 9 takeaways: kevtimova.github.io/posts/grpo/

08.07.2025 18:29 👍 2 🔁 0 💬 0 📌 0

I asked ChatGPT, Gemini, and Claude for a clever joke. They all gave me the same one. Either AI is merging into a hive mind… or humor has officially been solved mathematically!

06.03.2025 02:41 👍 2 🔁 0 💬 0 📌 0

The principle of least effort, from psychology, describes how we favor efficiency over effort. It aligns with System 1 (fast, intuitive) vs. System 2 (slow, deliberate) reasoning. AI faces a similar challenge: knowing when to rely on heuristics vs. deeper reasoning.

29.01.2025 03:17 👍 4 🔁 0 💬 0 📌 0

Happy Holidays and a Joyful New Year!

Wall sculpture series from “Random Walks of Happiness”, 2024
Celebrating Art: the Process, the Experiment, the Media.

#randomwalksofhappiness #happynewyear #happyholidays #abstractart #art #experimentalart #drawing #sculpture #wire #foundobjectsart #december

20.12.2024 23:48 👍 10 🔁 1 💬 0 📌 0

If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.

19.12.2024 00:55 👍 74 🔁 31 💬 2 📌 0

I gave a talk on Compositional World Models at NeurIPS last week 🌐

The recording is now online: neurips.cc/virtual/2024... (for registered attendees; starts at 6:06:00)

Workshop: compositional-learning.github.io

19.12.2024 01:57 👍 40 🔁 4 💬 1 📌 0

Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the technique behind its success: scaling test-time compute

By giving models more "time to think," Llama 1B outperforms Llama 8B in math—beating a model 8x its size. The full recipe is open-source!

16.12.2024 21:42 👍 83 🔁 18 💬 4 📌 2

Transactions on Machine Learning Research

Announcement from TMLR jmlr.org/tmlr/

"""
📣 Heads-up 📣 that TMLR will pause new submissions over the upcoming holiday period from December 2 2024 to January 6 2025 (midnight AoE on both dates). We will resume accepting new submissions on January 7, 2025. Happy Holidays!
"""

26.11.2024 14:05 👍 50 🔁 5 💬 2 📌 1

A screenshot of the UnicodeIt website

Write math on 🦋 with UnicodeIt!
For example: θ ∈ ℝⁿ or pp̅ → μ⁺μ⁻
Use the website or install it system-wide on Linux, macOS, or Windows
www.unicodeit.net

(Created several years ago with @svenkreiss.bsky.social)

23.11.2024 21:34 👍 288 🔁 62 💬 19 📌 9

If you post a research paper with a picture of an animal, I will follow you. It's the law.

21.11.2024 19:23 👍 8 🔁 1 💬 4 📌 1

some little bluesky tips 🦋

your blocks, likes, lists, and just about everything except chats are PUBLIC

you can pin custom feeds; i like quiet posters, best of follows, mutuals, mentions

if your chronological feed is overwhelming, you can make and pin a personal list of "unmissable" people

20.11.2024 11:56 👍 255 🔁 57 💬 17 📌 3

A plot showing that reranking improves recall as we increase the number of reranked docs, but with more docs we see diminishing returns and eventually a performance dip.

Mat is not on 🦋—posting on his behalf!

It's time to revisit common assumptions in IR! Embeddings have improved drastically, but mainstream IR evals have stagnated since MSMARCO + BEIR.

We ask: on private or tricky IR tasks, are rerankers better? Surely, reranking many docs is best?

20.11.2024 19:44 👍 81 🔁 23 💬 4 📌 5

How many documents should you retrieve when using a reranker? The answer might surprise you!

Check out the excellent work from our intern Mathew on this important retrieval question. 👏

20.11.2024 20:07 👍 11 🔁 3 💬 0 📌 0

Google Scholar is twenty years (and one day) old today https://www.infodocket.com/2024/11/18/google-scholar-turns-20-20-things-you-didnt-know-about-scholar/

19.11.2024 00:35 👍 5 🔁 1 💬 0 📌 0

whiteboard with MUST: - 10k user signups - view post on web - app store - other pds federate - 3p labels/services etc

A whiteboard from our Jan 2023 team retreat where I wrote down our goals, and the top one was “10k user signups.”

We’re growing by 10k users every 10-15 minutes right now.

16.11.2024 02:46 👍 49321 🔁 3671 💬 1398 📌 281

I'm making a list of AI for Science researchers on bluesky — let me know if I missed you / if you'd like to join!

go.bsky.app/AcP9Lix

10.11.2024 00:11 👍 247 🔁 91 💬 160 📌 5

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

09.11.2024 09:13 👍 553 🔁 212 💬 67 📌 55

A starter pack for #NLP #NLProc researchers! 🎉

go.bsky.app/SngwGeS

04.11.2024 10:01 👍 251 🔁 99 💬 45 📌 13

From Academia to Industry: How a 2018 Paper Foreshadowed OpenAI’s Latest Innovation
How a 2018 paper by CDS researchers helped shape OpenAI’s latest innovation, the o1 model.

You can read about my research on emergent communication with adaptive compute at inference time and its connection to OpenAI's o1 model here: nyudatascience.medium.com/from-academi...

#ai #machinelearning #research

14.11.2024 15:41 👍 1 🔁 0 💬 1 📌 0

Hello Bluesky! 👋 I'm an AI researcher with a PhD from NYU's Center for Data Science. My research focuses on representation learning for images and video, with an emphasis on self-supervised learning and regularization methods. Excited to connect and explore here! #ai #deeplearning #computervision

14.11.2024 15:32 👍 4 🔁 1 💬 0 📌 0
