Alex Guglielmone Nemi

@alexhans-dev

alexhans.github.io (blog) https://ai-evals.io (community site for eval-driven development as a shared language for product building)

2 Followers
5 Following
5 Posts
Joined 16.02.2026

Latest posts by Alex Guglielmone Nemi @alexhans-dev

I've been researching sandboxing (bwrap et al.), focusing on good UX for system tests to empower non-technical users in a deeply technical space. They've been coding. They're writing agent skills. No evals yet. This space is critical, but so is UX. Are you targeting that type of user as a design principle?

24.02.2026 08:45 👍 1 🔁 0 💬 1 📌 0

published: No news is good news
alexhans.github.io/posts/series...

About building systems that don't demand constant attention: fewer dashboards, fewer alerts, clearer expectations. If things are working, you shouldn't have to look. On feedback loops and boring reliability for small teams.

24.02.2026 08:26 👍 0 🔁 0 💬 0 📌 0
Reducing Error Compounding in GenAI Systems – The Living Deadline
How chaining LLM calls compounds errors, and why replacing probabilistic steps with deterministic ones improves reliability.

published: Error compounding in GenAI systems alexhans.github.io/posts/series...

Small mistakes don't stay small once you chain agents. This is why evals + explicit expectations matter. Written for anyone building multi-step workflows. Many people arrive at the same lessons independently.

24.02.2026 08:24 👍 0 🔁 0 💬 0 📌 0
Building Agent Skills: Intent, Determinism, and Stability – The Living Deadline
A mental model and decision tree for building agent skills incrementally: start with intent, add deterministic tools, then use tests and AI evals to reduce drift and risk.
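A rough illustration of the compounding claim in the post above, not taken from the linked article: if each step in a chain succeeds independently with some probability, the chance that the whole chain succeeds is the product of the per-step probabilities. The function name, the 0.95 per-step accuracy, and the independence assumption are all hypothetical choices made for this sketch.

```python
import math


def chain_success_rate(step_accuracies):
    """Probability that every step succeeds, ASSUMING steps fail independently."""
    return math.prod(step_accuracies)


# Hypothetical numbers: five chained LLM calls, each right 95% of the time.
all_llm = chain_success_rate([0.95] * 5)

# Same workflow with three steps replaced by deterministic code (treated as 1.0).
mixed = chain_success_rate([0.95, 1.0, 1.0, 1.0, 0.95])

print(f"5 probabilistic steps:            {all_llm:.3f}")  # 0.774
print(f"2 probabilistic, 3 deterministic: {mixed:.3f}")    # 0.902
```

Under these assumptions a chain of individually decent steps is right only about three times in four, and swapping probabilistic steps for deterministic ones recovers most of the loss. Real steps are rarely independent, so treat the numbers as intuition rather than a prediction.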

published: Building agent skills incrementally
alexhans.github.io/posts/series...

Not about frameworks or “AI platforms” - about treating agents like software: small skills, explicit expectations, evals, repeat.
Written for non-traditional builders (e.g. economists). You need feedback loops.

24.02.2026 08:19 👍 1 🔁 0 💬 0 📌 0
AI-Evals.io

Hey Simon Willison - this resonates. I'm exploring how evals help non-tech folks ship end-to-end. ai-evals.io isn't a product (no selling) - just a community site. Are you using evals? Seen them work with non-engineers? Any advice on advocating the mindset shift?

24.02.2026 08:15 👍 0 🔁 0 💬 0 📌 0