Really looking forward to food-induced coma naps
What's going to be different for you in 2026?
Just entered my next decade and I think it'll be the best one yet.
Once again, I completely forgot that I have this account. Oops
A tiny bit of mirroring? :)
PS: That said, I'll probably still keep an eye on what's happening and may even share some posts every now and then. I've got a lot of thoughts on RAG, data processing, LLMs/VLMs, etc., so I likely won't disappear fully.
The work will still be here when I return. The AI won't slow down, but a couple of months also won't make a dent in the field. This moment, however, this chance to be fully present with my family? That's something I don't want to miss.
And even more grateful to work with a team that's so supportive. Stepping away from work, especially in a field moving at warp speed, can feel counterintuitive. But for me, it's a way to reconnect with what matters most.
Kids won't be kids forever, and mine are getting ever so close to becoming teenagers. This is time I know I'll never get back.
I'm incredibly grateful to be in a place, both professionally and personally, where this is possible.
Next week, I'm stepping away for a couple of months to take a sabbatical and spend time with my kids. I'm not burnt out. I'm following my own advice: do the thing you'll regret not doing when you're old.
Things move fast in AI. Every week brings new models, new capabilities, or new ideas to chase. It's exciting, but also easy to get swept up in the pace and forget to pause, to touch grass, to zoom out and see the bigger picture.
🧵
RAG exists to solve different problems across varied domains. Understand the problem youβre solving and look at your data.
Once you have some answers to these, you can get further into the technical weeds and experiment with chunking to find an optimal size.
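A minimal sketch of what that chunking experiment can look like. The `chunk_words` helper, the sample text, and the sizes tried are all illustrative assumptions, not a recommendation:

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into word-based chunks of `size` words,
    with `overlap` words shared between consecutive chunks."""
    words = text.split()
    step = max(size - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

doc = "one two three four five six seven eight nine ten"

# Try a few chunk sizes and inspect what each produces.
for size in (3, 5):
    chunks = chunk_words(doc, size, overlap=1)
    print(f"size={size}: {len(chunks)} chunks, first chunk: {chunks[0]!r}")
```

In a real experiment you'd swap the toy string for your own documents, embed each chunk, and evaluate retrieval quality per size against a held-out query set.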
The bottom line, however, is that there's no universal "best" chunk size.
* How much context do you typically need to retrieve to satisfy a typical query? Simple facts may only require a sentence or two. Creative tasks may require larger context. Analytical queries may need a whole bunch of supporting evidence.
They all vary in structure, style, and length.
* What is your use case? Are you trying to answer questions with specific facts? Are you gathering multiple documents to summarize for a report? Do you pull from transcripts and need to preserve speaker attribution?
Same goes for chunking. The "best" chunk size depends on a range of factors, and without those, the question is incomplete.
Here are some of the questions to ask instead:
* What does your data look like? Financial statements, technical manuals, customer support transcripts are not the same.
Asking "What is the best chunk size for RAG?" without any additional context is like asking, "What's the best thing to wear?" Wear where? What's the weather like? What size are you? Are you going to a wedding or hiking a trail? There's no single answer that works for every situation.
🧵
Do the thing that you will regret not doing when you're old.
I went to check what new courses deeplearning.ai has, and was pleasantly surprised to see that the short course Marc Sun, Younes Belkada, and I built over a year ago is still featured as one of the Top Rated courses
At least I have interrupted your doomscrolling with some cuteness!
I'm taking this whole developer-becoming-a-farmer dream way too far, aren't I?
If you've been prioritizing urgent work,
make sure you also prioritize important work.
How anyone can like peanut butter is beyond me.
Similar ≠ relevant
Part 2 is a high-level overview of advanced RAG techniques: unstructured.io/blog/level-u...
Nothing starts a Wednesday morning quite like your dog getting sprayed by a skunk 🤢
I have some epic plans for this summer and none of you will be able to guess what they are.
I'm starting a series of blog posts on RAG beyond the basic setup. In the first part, we're setting the stage: why naive RAG is not enough, and how a lot of the issues can be traced back to data processing choices.
Part 1: unstructured.io/blog/level-u...
What you're not changing, you're choosing.
This is a gentle reminder for the next time you're prioritizing a cool new shiny thing over building the foundation or addressing tech debt.
Word of the day seems to be "sycophantic".
Thanks, AI community, for expanding my vocabulary :)