Recipe: huggingface.co/learn/cookbo...
Original blog by Edward Beeching, @lewtun.bsky.social and @srushnlp.bsky.social from @hf.co: huggingface.co/spaces/Huggi...
Thanks @stevhliu.hf.co and @lewtun.bsky.social for the feedback!
[Diagram: scaling test-time compute with open models]
Following Hugging Face's blog on scaling test-time compute with open models (letting models "think longer," inspired by OpenAI & DeepMind), I created a recipe to extend inference time for Instruct LLMs, tackling harder tasks like complex math problems.
Links below
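One common way to "think longer" at inference time is best-of-N sampling: draw several candidate answers and keep the one a scorer likes most. A minimal sketch of that idea; `best_of_n`, `toy_generate`, and `toy_score` are illustrative stand-ins I made up, not the recipe's actual code (the recipe uses a real Instruct LLM and reward model):

```python
import random

def best_of_n(prompt, generate, score, n=8, seed=0):
    """Sample n candidate answers and keep the highest-scoring one.

    `generate` and `score` are placeholders for an Instruct LLM and a
    reward model; here they are toy functions for illustration.
    """
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: the "model" guesses numbers, the "reward" prefers
# answers closer to the true solution (42).
def toy_generate(prompt, rng):
    return rng.randint(0, 100)

def toy_score(answer):
    return -abs(answer - 42)

best = best_of_n("What is 6 * 7?", toy_generate, toy_score, n=16)
```

Spending more samples (larger `n`) trades extra compute for a better chance that at least one candidate scores well, which is the core test-time-scaling trade-off.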
Here's what's included:
- SmolVLM (VLM) by @hf.co
- SFT & DPO fine-tuning methods
- Runs on consumer GPUs
SFT project: huggingface.co/learn/cookbo...
DPO project: huggingface.co/learn/cookbo...
Thanks @stevhliu.hf.co & @merve.bsky.social & @benburtenshaw.bsky.social
I'm a big fan of smol models: compact, efficient, and perfect for inference/training on limited resources. Even better when they're multimodal!
I explored fine-tuning SmolVLM, a multimodal smol model, using TRL with SFT and DPO, creating 2 hands-on projects!
Links below
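Of the two methods, DPO is the less familiar one, and its per-example loss is easy to sketch numerically. This is a stand-alone illustration I wrote, not code from the projects; the log-probabilities are placeholders for real model outputs:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * margin), where the
    margin measures how much more the trained policy prefers the chosen
    answer over the rejected one than the frozen reference model does."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp)
    return -math.log(1 / (1 + math.exp(-beta * margin)))

# If the policy already prefers the chosen answer more than the
# reference does, the loss drops below log(2), the no-preference value.
loss_good = dpo_loss(-1.0, -5.0, -2.0, -4.0)   # margin = +2
loss_flat = dpo_loss(-2.0, -4.0, -2.0, -4.0)   # margin = 0
```

SFT just maximizes the likelihood of the chosen responses; DPO adds this preference margin against a reference model, which is why the two are often run back to back.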
I've been exploring how to go smol with multimodal RAG.
I've built a project using SmolVLM and ColSmolVLM: a multimodal RAG system that runs on Colab's free tier.
Featuring:
- SmolVLM (VLM)
- ColQwen2 (Doc Retrieval)
- Runs on Colab's free-tier GPU
Link below
Recipe in @hf.co: huggingface.co/learn/cookbo...
Thanks @stevhliu.hf.co & @merve.bsky.social
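Retrievers in the ColBERT family (which ColSmolVLM/ColQwen2 belong to) score a query against each document page with late interaction, often called MaxSim. A toy sketch with hand-made 2-d embeddings; the function names and vectors are mine, purely for illustration:

```python
def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take its best
    dot product against all document (page-patch) vectors, then sum."""
    return sum(max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vecs)
               for qv in query_vecs)

def retrieve(query_vecs, corpus, k=2):
    """Rank document pages by MaxSim and return the top-k page ids."""
    scored = sorted(corpus.items(),
                    key=lambda item: maxsim_score(query_vecs, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 2-d embeddings: page "p1" aligns with the query, "p2" does not.
query = [(1.0, 0.0), (0.0, 1.0)]
corpus = {"p1": [(1.0, 0.0), (0.0, 1.0)],
          "p2": [(-1.0, 0.0), (0.0, -1.0)]}
top = retrieve(query, corpus, k=1)  # ["p1"]
```

Because each query token matches its best page patch independently, this scales well enough to run page retrieval on a free-tier GPU before handing the top pages to the VLM.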
New Multimodal RAG Recipe with Re-Ranking
I explored how to enhance a multimodal RAG pipeline by integrating a re-ranker!
Featuring:
- Qwen2-VL-7B (VLM)
- ColQwen2 (Doc Retrieval)
- MonoQwen2 (Re-ranking)
- Optimized for consumer GPUs with quantized VLMs
Link below:
[Screenshot of the notebook]
Learn how to build a complete multimodal RAG pipeline, with ColQwen2 as retriever, MonoQwen2-VL as reranker, and Qwen2-VL as the VLM, in this notebook that runs on a GPU as small as an L4: huggingface.co/learn/cookbo...
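The retrieve-then-rerank structure behind this pipeline can be sketched in a few lines. This is a toy illustration of the control flow only; the scorers and page names are made up, while in the notebook the retriever is ColQwen2 and the reranker is MonoQwen2-VL:

```python
def retrieve_then_rerank(query, pages, retr_score, rerank_score, k=4):
    """Two-stage pipeline: a fast retriever narrows the corpus to k
    candidate pages, then a more expensive pointwise reranker reorders
    just those k before they reach the VLM."""
    candidates = sorted(pages, key=lambda p: retr_score(query, p),
                        reverse=True)[:k]
    return sorted(candidates, key=lambda p: rerank_score(query, p),
                  reverse=True)

# Toy scorers: the retriever slightly misranks; the reranker fixes it.
pages = ["intro", "table_of_results", "references"]
retr = {"intro": 0.9, "table_of_results": 0.8, "references": 0.1}
rer = {"intro": 0.2, "table_of_results": 0.95, "references": 0.1}
ranked = retrieve_then_rerank("final accuracy?", pages,
                              lambda q, p: retr[p],
                              lambda q, p: rer[p], k=2)
# ranked == ["table_of_results", "intro"]
```

The point of the second stage is exactly this correction: the reranker is too slow to score the whole corpus, but accurate enough to fix the retriever's ordering on a short candidate list.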
Gave a talk on autonomous driving today to undergrad students! We covered everything from definitions to real-world examples, plus cutting-edge concepts like Generative World Models and Vision-Language Models (VLMs). Exciting future ahead!
This is such a cool project, and it was a truly exciting experience to contribute to it!
We took those TRL notebooks from last week and made a page from them. So if you're upskilling on fine-tuning or aligning LLMs, and want examples from the community (like Maxime Labonne, Philipp Schmid, and Sergio Paniego Blanco), check it out!
bsky.app/profile/benb...
>> huggingface.co/docs/trl/mai...
Thanks to @arig23498.bsky.social, @pcuenq.hf.co, and @reach-vb.hf.co for the collaboration. It's a pleasure working with such talented individuals!
1. Tool calling: github.com/huggingface/...
2. TGI: github.com/huggingface/...
I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:
1. Understanding tool calling with Llama 3.2
2. Using Text Generation Inference (TGI) with Llama models
(links in the next post)
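At its core, tool calling means the model emits a structured call against a tool schema and your code executes it. A minimal sketch of that loop; the `get_weather` tool, its schema, and the `dispatch` helper are hypothetical examples I wrote, not code from the Llama 3.2 project:

```python
import json

# A tool definition in the JSON-schema style used for chat tool calling.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city):
    # Placeholder implementation; a real tool would call a weather API.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json):
    """Execute a model-emitted tool call such as
    {"name": "get_weather", "arguments": {"city": "Paris"}}."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
# result == "Sunny in Paris"
```

In a full chat loop, the tool's return value is appended to the conversation as a tool message so the model can compose its final answer from it.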
Link to the blog post: weaviate.io/blog/what-is... (by Erika Cardenas, @iamleonie.bsky.social)
Link to the recipe: huggingface.co/learn/cookbo...
Huge thanks to Aymeric Roucher and @stevhliu.hf.co for their support and insights!
In this notebook, I use Qwen2.5-72B-Instruct as the LLM to build a system with:
1. A manager agent
2. Three specialized agents: retriever, web search, and image generation
The result is this new Hugging Face Cookbook recipe, where I demonstrate how to create a Multi-Agent RAG system leveraging the agent support from the transformers module.
A few days ago, I came across a fascinating post about Agentic RAG by Erika Cardenas and Leonie Monigatti, and it inspired me to dive into the concept and bring it to life in code!
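The manager-plus-specialists setup from this thread can be sketched as a simple dispatcher. Everything here is a toy stand-in I wrote: in the recipe an LLM-driven manager agent decides which specialist to call, whereas this sketch uses a hard-coded keyword policy and string-returning agents:

```python
def route(task):
    """Toy manager policy: pick a specialist from the task wording.
    A real manager agent lets the LLM make this decision."""
    if "image" in task or "draw" in task:
        return "image_generation"
    if "http" in task or "latest" in task:
        return "web_search"
    return "retriever"

# Each specialized agent is reduced to a function for illustration.
AGENTS = {
    "retriever": lambda t: f"[retriever] docs for: {t}",
    "web_search": lambda t: f"[web_search] results for: {t}",
    "image_generation": lambda t: f"[image_generation] picture of: {t}",
}

def manager(task):
    """The manager delegates the task to one specialized agent and
    returns that agent's answer."""
    return AGENTS[route(task)](task)

answer = manager("draw an image of a cat")
```

The design win is the same as in the recipe: each specialist stays simple and focused, and the manager is the only component that reasons about which one to use.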
4/6 More vision skills for complex visual tasks. This tutorial shows how to fine-tune the Qwen2-VL-7B model for visual question answering using the ChartQA dataset.
huggingface.co/learn/cookbo...
by @sergiopaniego.bsky.social
TRL is a cornerstone of LLM post-training, and imo it's the default to learn.
There are great alternatives like Unsloth, Axolotl, and AutoTrain. But if you want a daily driver that takes you from experimentation to production, it's TRL.
These community notebooks guide you through TRL's core:
hello, hi