The amount of Dunning-Kruger in people who use ChatGPT and think they know what they're doing is *astonishing*
now he has a chance to become a travel influencer with millions of followers
Sharing the code as well, in case anyone has suggestions on how to make this work!
gist.github.com/herval/e341d...
I actually tried the same thing with multiple frameworks (crewai, langchain & dspy), and the results were similar with them too (some of the frameworks, like crewai, try to inject chain-of-thought prompts to get the models out of their loop, but it doesn't seem to work)
The results:
- Claude got it right every time
- GPT got the response wrong
- tinyllama couldn't figure out how to call the function at all
- mistral, qwen2.5, llama3.3 and deepseek-r1 hallucinated functions that don't exist
- llama3.2 got stuck in an infinite loop, calling the same function forever
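Two of the failure modes above (hallucinated functions and the same call repeated forever) can at least be caught on the harness side. A minimal sketch, assuming tool calls arrive as (name, arguments) pairs; `run_tool_calls`, `ToolCallError` and `max_repeats` are hypothetical names for illustration, not taken from the gist or from any framework:

```python
import json
from typing import Any, Callable


class ToolCallError(Exception):
    """Raised when a model's tool call should not be executed."""


def run_tool_calls(
    calls: list[tuple[str, dict[str, Any]]],
    tools: dict[str, Callable[..., Any]],
    max_repeats: int = 2,
) -> list[Any]:
    """Execute tool calls in order, rejecting unknown names and repeat loops."""
    seen: dict[str, int] = {}
    results = []
    for name, arguments in calls:
        # A hallucinated function is simply a name that was never registered.
        if name not in tools:
            raise ToolCallError(f"unknown function: {name}")
        # Identical name + arguments count as the same call (assumed loop signal).
        key = name + ":" + json.dumps(arguments, sort_keys=True)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > max_repeats:
            raise ToolCallError(f"repeated call: {name}")
        results.append(tools[name](**arguments))
    return results
```

This only stops the run early; it does not make the model answer correctly.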
I'm using litellm since it makes it easy to call ollama + remote models with the same interface. It also automatically adds instructions around function calling to the system prompt.
Here's a fun benchmark showing the performance of 2 online models and 5 lines of local models, when trying to use function calling.
The test is simple:
- there's a function for listing files in a directory
- the question is simply how many files exist in the current folder + its parent
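A rough sketch of what that setup could look like, for anyone who wants to reproduce it. The name `list_files`, the OpenAI-style tool schema and the `expected_answer` helper are assumptions for illustration and may differ from the gist; it also assumes "files" means regular files only, not subdirectories:

```python
import os


def list_files(path: str) -> list[str]:
    """Return the names of regular files directly inside `path`."""
    return sorted(entry.name for entry in os.scandir(path) if entry.is_file())


# Tool description in the OpenAI-style function-calling format (assumed shape).
LIST_FILES_TOOL = {
    "type": "function",
    "function": {
        "name": "list_files",
        "description": "List the files in a directory.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory to list."}
            },
            "required": ["path"],
        },
    },
}


def expected_answer(folder: str) -> int:
    """Ground truth for the question: files in `folder` plus files in its parent."""
    parent = os.path.dirname(os.path.abspath(folder))
    return len(list_files(folder)) + len(list_files(parent))
```

Comparing the model's final number against `expected_answer(".")` gives a pass/fail per run.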
I don't know if this is just me, but the sheer distance between open source LLM model capabilities and GPT/Claude/Gemini is so ridiculous, I can't see how anyone can use any OSS models IRL. They don't seem to follow instructions, hallucinate wildly and get stuck in repeat loops repeat loops....
It's just too much laziness..
Betting is Brazil's new GDP
* shocked Pikachu *
Not necessarily. Often it's just plain contrarianism. And let's face it, some beliefs are easier to accept than others - homeopathy is far less absurd than the old man with a white beard who lives in the sky
This dude was never an entrepreneur in any capacity, shape or form…
Atheism is no guarantee of intelligence
brb asking an LLM to talk like a machine
Please post more here. I still go on Twitter from time to time just to read a select few posts. It's like sticking your hand in a cesspool to fish for a nugget of gold
Summarization and text edits on iOS work pretty well
Destruction is here, there are traffic jams even on the sidewalk, the city can't fit any more people
Fuck Zucc buuuuut at least Threads isn't a desert town. Follow me there frens?
Is we on the tape
…anyone there?
Sunday in the #Meatverse
Come try Clio! Would love to hear what you think https://clio.so/maker
Applying effects to sketches with controlnet is ✨magical✨
Doomer guy IRL, made with clio.so/studio
It pretty much taught me Python by example
Hello world?