I feel it #ChatGPT, I also prefer walking to driving
I guess it is never too late to get on the wagon
#OpenClaw
Do you run OpenClaw?
While I'm trying to strike the balance between efficiency and security, what are your most successful stories? Does it get your work done before you even finish brushing your teeth in the morning?
deepwiki.com/openclaw/ope...
Oh, interesting, I didn't know about this tool -- very handy.
Usually I did it ad hoc for every project I was interested in, and from time to time got mixed results when Cline/Codex or another LLM tool wasn't patient enough to get things done for every single line of code.
If we draw an analogy with chess or the game of Go -- it is like the AI stops playing the game after capturing the first few pieces. The position might look great on the board right now, but we still need to win the game -- run the app through the whole life cycle of software development, from design to deprecation.
I guess programming and math are good examples, but is it really getting the whole verified reward? Programming is not only about writing code according to the specs and good practices but also keeping it well maintained -- the whole life cycle.
Which domain knowledge do we really cover with RL with verified rewards, and are we close to completeness?
I guess programming and math are good examples, but is it really getting the whole verified reward from them?
The RL argument from Richard Sutton's bitter lesson -- that LLMs learn by mimicking humans instead of learning the World. The only thing that seems to come close to learning the World is RL with verifiable rewards.
I want to learn more about it. What are we doing in this area?
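To make "verifiable reward" concrete, here is a minimal toy sketch (my own illustration, not from any specific paper): the reward for a model-generated program is 1 only if it passes a fixed, deterministic test, which is exactly what makes the signal verifiable rather than a matter of human judgment.

```python
def verifiable_reward(candidate_src: str) -> int:
    """Reward 1 only if the generated function passes a deterministic check."""
    namespace = {}
    try:
        exec(candidate_src, namespace)  # run the model's candidate code
        f = namespace["add"]
        # The "verifier": fixed unit tests, no human opinion involved.
        assert f(2, 3) == 5 and f(-1, 1) == 0
        return 1
    except Exception:
        return 0

# Two hypothetical model outputs:
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
```

Note this captures only the narrow "does the code pass the tests" slice; it says nothing about maintainability or the rest of the life cycle, which is the gap the posts above point at.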
hm, surprised to see that Codex works only with the global ~/.codex directory for configuration and ignores project-local configuration github.com/openai/codex... . I'm so used to having this in every other CLI that I'm really astonished OpenAI forgot about this key feature.
That's why recently I so often add to a prompt: "don't rush, take your time, do deep research, gather all details before jumping to a conclusion, gather all available information and don't make assumptions". Just to stop it from falling into the most obvious but wrong answer.
I'm reading github.com/github/spec-... as The Manifesto but can't stop thinking that most of it is still wishful thinking. While I like the idea of refining a project from a high-level PRD down to low-level tasks, I saw many times in Kiro or Cline that the AI agent hacks the reward and leaks through flawed ideas.
Each time you pick an answer from an LLM and feed it back as a training sample for the model, the LLM will likely silently collapse to that answer: instead of a high variation of other possible responses, it will prefer this one. It could be a plausible answer from the LLM, but it shouldn't be the only one.
youtu.be/lXUZvyajciY?...
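The collapse effect can be sketched with a deterministic toy model (my own illustration, not from the video): each "retraining" round picks the currently most-preferred answer and shifts probability mass toward it, and after a handful of rounds the distribution degenerates to a single answer.

```python
def collapse_on_preferred(probs, rounds=10, boost=0.5):
    """Toy model: each round reinforces the currently most-plausible answer."""
    probs = list(probs)
    for _ in range(rounds):
        i = probs.index(max(probs))           # the most plausible answer
        probs = [p * (1 - boost) for p in probs]
        probs[i] += boost                     # training shifts mass toward it
    return probs

before = [0.4, 0.3, 0.2, 0.1]   # four candidate answers, healthy variation
after = collapse_on_preferred(before)
```

After 10 rounds the leading answer holds essentially all the probability mass, while the alternatives shrink toward zero, which is the "silent collapse" described above in miniature.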
Another interesting thought from the Andrej @cbkenkar.bsky.social interview about why we just cannot create a self-perpetuating training loop for an LLM and let it fly on its own.
youtube.com/clip/Ugkx5b1...
It is so funny that Andrej Karpathy, who coined the term "Vibe Coding", now describes AI Coding Agents as a total mess. And I couldn't agree more -- I don't see how at this phase they could help write code in a reasonable amount of time. Unless you like agent herding.
This was helpful. I noticed that AI agents like Kiro or Cline sometimes just get stuck on command execution; it didn't happen all the time, but very often. And I found this approach helpful.
forum.cursor.com/t/guide-fix-...
hm, it is interesting. It isn't easy out of the box to generate a chat session. Sure, you can vibe-code some tool for that, but that already takes some learned skills. So it might delay cheating for a while.
So it was like a breeze. I wasted all my tokens in 2 days, and the main effort was burned in a loop of debugging the initial implementation from Codex. Lesson learned -- don't let it spiral into hammering the problem.
Decided to spend a couple of evenings with Codex from OpenAI carving out tasks for github.com/hyzhak/otel-... but quickly hit the wall -- its CLI doesn't want to run docker/podman because of the harness and the macOS Seatbelt sandbox. Uh, it was supposed to save my time instead of entertaining me with extra challenges.
Vectors are not enough because:
- doesn't capture structure
- similarity <> relevance
- lack of explainability
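A toy illustration of the "similarity <> relevance" bullet (made-up 3-d vectors, not real embeddings): a passage that merely restates the query can score higher on cosine similarity than the passage that actually answers it.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical toy embeddings (3-d for readability):
query    = [1.0, 0.2, 0.0]   # "how do I reset my password?"
restated = [0.9, 0.3, 0.1]   # "resetting your password" -- similar, no answer
answer   = [0.5, 0.8, 0.2]   # the actual step-by-step instructions

sim_restated = cosine(query, restated)
sim_answer = cosine(query, answer)
```

Here the restatement wins on similarity even though only the other passage is relevant, which is the gap a knowledge graph's explicit structure is meant to close.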
I'd like to tag Jesús Barrasa and Stephen Chin but I couldn't find them on bsky
Cute little poster from the presentation "Practical GraphRAG: Making LLMs smarter with Knowledge Graphs" -- Michael, Jesús, and Stephen, Neo4j
www.youtube.com/watch?v=XNne...
Thanks @mesirii.de
Really surprised to see that Aider doesn't support MCP -- the issue github.com/Aider-AI/aid... has existed since Feb this year, which is so unusual. I planned to use it for my experiments with MCP, but may use fast-agent.ai instead; I really like that they specifically target support for MCP features.
uh, it is not funny anymore. Gemini CLI doesn't support images or any types other than text github.com/google-gemin... similar to what we have with Cline and VSCode LLM.
In some packages LLM agents are not supposed to have fun :)
github.com/google-gemin...
I've been testing out two very different LLM agents -- Amazon Q (set it and forget it) vs. Cline (tons of docs & back-and-forth).
medium.com/@eugenekreve...
Trying to stay on trend, I created an MCP server for ollama github.com/hyzhak/ollam... I know others exist, but 1st, it is MIT licensed; 2nd, I assumed that ollama could run on a remote server (like on the local network); 3rd, it supports recent features like visual models and thinking mode.
#MCP #ollama
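For context on the "remote server" point: ollama speaks plain HTTP, so the host is just a URL and can live anywhere on the LAN. A sketch of building such a request (the host address, model name, and helper function are my own illustration; the /api/chat payload shape follows ollama's REST API, and the "think" field is the recent thinking-mode option):

```python
import json
from urllib.request import Request

def build_chat_request(host: str, model: str, prompt: str,
                       think: bool = False) -> Request:
    """Build (but don't send) a chat request to a possibly-remote ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "think": think,  # recent ollama option for thinking-capable models
    }
    return Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# ollama on another box in the local network (hypothetical address/model):
req = build_chat_request("http://192.168.1.42:11434", "qwen3", "hello")
```

Sending it is one `urllib.request.urlopen(req)` away; the point is only that nothing in the protocol ties the server to localhost.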
It seems documentation is steadily adapting to another type of reader: LLMs. opentelemetry.io/docs/getting...
pypi.org/project/ace-... btw the lib on PyPI looks like an empty placeholder, so even if this particular case is a false alarm, a vulnerability driven by an LLM agent is very likely.
Wow, I just faced a similar issue with ChatGPT, which recommended "ace_tools" right in the code community.openai.com/t/chatgpt-re... . Taking into account the rise of more and more automation, where LLM agents can install packages by themselves, this risk is very real.
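One cheap guardrail against this (my own sketch; the allowlist contents and function name are hypothetical): before an agent is allowed to install anything, check the package name against a human-vetted allowlist instead of trusting the model's suggestion, so a hallucinated name is rejected even if a squatter has registered it on PyPI.

```python
# Hypothetical guardrail: dependencies a human has vetted for this project.
ALLOWED_PACKAGES = {"requests", "numpy", "pandas"}

def safe_to_install(package: str) -> bool:
    """Return True only for packages on the vetted allowlist.

    A hallucinated name like 'ace_tools' is rejected even when the
    package exists on PyPI as a (possibly malicious) placeholder."""
    return package.lower() in ALLOWED_PACKAGES
```

The agent's install tool would call this before ever shelling out to pip; anything not on the list requires a human in the loop.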