Stuart Gray (@sgray)

I’ve only really used Claude to any great extent, and I’d caveat your description with:

Or tries to give the appearance of doing what you asked for.

The most common white lie is pretending to have read links or files it couldn’t access, unless explicitly questioned or told to raise access issues.

09.03.2026 14:07 👍 1 🔁 0 💬 1 📌 0

Hold up… does that mean Elon has been secretly paying someone else to run his many businesses all this time?

It would certainly explain why he’s never active in any of them & had time to swan off to DOGE for 6 months 🤔

07.03.2026 20:49 👍 1 🔁 1 💬 0 📌 0

Bluesky school of philosophy

07.03.2026 20:30 👍 407 🔁 43 💬 15 📌 5

I got sidetracked (more like sideswiped!) by agentic dev.

The pace of change has meant I’ve struggled to keep up with that, let alone have time for other things.

That said, it’s given me a whole new angle on novel writing with agents I want to try out, so it’s not all bad.

07.03.2026 20:38 👍 3 🔁 0 💬 1 📌 0

principles to follow for each.

The *only* doc I’m heavily involved in is the spec. The rest exist to decompose the problem robustly, match up with equiv. tests, act as a human readable “log” if something goes wrong & something I can ask pointed questions about.

07.03.2026 20:34 👍 0 🔁 0 💬 0 📌 0

I guess it depends how you use AI to plan (skill issue 🤣). If you “vibe spec” then I totally get what you’re saying.

I use a large skill set (still in dev) to guide exactly what I want in a spec, functional design, and tech design.

I know exactly what’s supposed to be in each, and what rules &

07.03.2026 20:34 👍 2 🔁 0 💬 2 📌 0

I get that, but also, all the players getting into heavily automated, no human in the loop stuff only focus on the spec as the source of truth and nothing else.

BDD for English language tests you can actually review easily etc…

Big diff. between fundamentals & doing stuff for the sake of it.

07.03.2026 20:25 👍 2 🔁 0 💬 0 📌 0

Software engineer by training for the last 35 years, technical architect for the last ~20 of them 😝

07.03.2026 20:20 👍 1 🔁 0 💬 1 📌 0

“Trump’s foreign language interpreter” is a literal “hell on earth” role if you’ve ever heard how he actually speaks English unedited.

Second only to the job of his personal fake tan applicator 🤮

07.03.2026 20:18 👍 0 🔁 0 💬 0 📌 0

It’s worse. The answers weren’t on Google, they were in a binary file on GitHub that was XOR-encrypted to make it impossible to Google. The AI decided it must be a benchmark; systematically went through benchmarks; downloaded the file (which it shouldn’t have been able to do); and decrypted it.

07.03.2026 02:16 👍 67 🔁 10 💬 2 📌 5
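For anyone wondering why the XOR step was no real obstacle once the file was downloaded: XOR with a repeating key is its own inverse, so the same function that obscures the bytes also recovers them. A minimal sketch, assuming a simple repeating-key scheme (the post only says "XOR-encrypted"); the key and the answer text below are made up for illustration.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical answer text and key, not taken from the actual benchmark.
plaintext = b"answer: 42"
key = b"benchmark"

# Obscuring the text stops a plain-text search from finding it...
obscured = xor_bytes(plaintext, key)

# ...but applying the same key again restores the original bytes.
recovered = xor_bytes(obscured, key)
```

So this kind of obfuscation keeps the answers out of search results, but it is not a barrier to anything that can fetch the file and work out or find the key.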

I almost don’t want to ask this but… I’m a glutton for punishment.

What do you test the end result against - to be sure it delivered what was asked for & intended?

It doesn’t have to be perfect, sure, but it’s gotta be pretty comprehensive at least? (and scaled to size of work)

07.03.2026 20:09 👍 1 🔁 0 💬 1 📌 0

Hmmm, this seems odd.

In general, planning shouldn’t be any different for AI than for another human.

The main difference & benefit I’ve noticed is that AI is a lot more rigorous.

However, I highly recommend creating an SDLC skill for *your* process. There’s more than 1 type of project & spec.

07.03.2026 20:04 👍 2 🔁 0 💬 1 📌 0

Turning off notifications is the main one, with an “off by default” principle.

Some are genuinely useful, so I’d never say none, just pick them selectively e.g. I allow bank spending notifications, even though most are annoying, because I’d rather be aware of an unexpected transaction sooner.

07.03.2026 09:00 👍 1 🔁 0 💬 0 📌 0

a google calendar invite can own your machine via claude desktop. cvss 10.0. zero click. anthropic declined to fix it. i use claude with extensions every day. i can't stop reading that last sentence. https://layerxsecurity.com/blog/claude-desktop-extensions-rce/ https://mindpattern.ai

06.03.2026 23:45 👍 2 🔁 1 💬 0 📌 0

New York considers bill that would ban chatbots from giving legal, medical advice | StateScoop

A bill under consideration in New York would provide a private right of action, allowing people to file lawsuits against chatbot owners who violate the law.

The latest NY chatbot bill would bar chatbots from conveying information that could fall within the scope of a licensed profession.

It’s basically a censorship bill disguised as licensure protection.

statescoop.com/new-york-bil...

06.03.2026 05:03 👍 108 🔁 21 💬 32 📌 25

I’m not discounting any of that, I’m simply focused on the lack of prompt injection in the wild.

Don’t you think it’s slightly strange we haven’t heard it mentioned in post-incident reviews?

05.03.2026 22:10 👍 1 🔁 0 💬 1 📌 0

people hawking “secure” email are not to be trusted, exhibit 9000

05.03.2026 20:43 👍 41 🔁 10 💬 2 📌 0

The interesting part of all this to me is the prompt injection.

It’s a well-known LLM issue, and there’s been a lot of speculation about why we haven’t seen prominent examples of it deployed in anger in the wild, not just a PoC.

This is the first I’ve seen.

Seen any others? @simonwillison.net

05.03.2026 21:22 👍 4 🔁 0 💬 2 📌 0

ICO writes to Meta over 'concerning' AI smart glasses report

Videos, including of glasses-wearers using the toilet or having sex, are sometimes reviewed by a Kenya-based subcontractor.

Last year when I was checking into a hotel, the desk person was wearing Meta glasses. I kindly asked them to take them off. They were annoyed. I said, “I do not consent to you looking at my credit card and ID with Meta glasses on.” My instincts were correct: www.bbc.com/news/article...

05.03.2026 15:27 👍 6131 🔁 2445 💬 93 📌 184

Can coding agents relicense open source through a “clean room” implementation of code?

Over the past few months it’s become clear that coding agents are extraordinarily good at building a weird version of a “clean room” implementation of code. The most famous version …

As usual, @simonwillison.net to the rescue simonwillison.net/2026/Mar/5/c...

05.03.2026 17:40 👍 8 🔁 4 💬 1 📌 1

Interesting discussion on HN. If I see a painting of a sunset, and I paint a sunset, ≠ copyright violation. If I study a codebase (or a closed-source end product) and go off and rewrite it on my own, ≠ license violation. Does this change if I use a coding agent to help me?

05.03.2026 13:57 👍 12 🔁 2 💬 1 📌 0

I worked in retail while I was at college ~35 years ago.

I can’t say that organised theft rings were a thing back then, but we had some very brazen & prolific shoplifters - quiet spot, large bag, slide everything off a clothing rail into it & away.

I assume it was sold at car boot sales back then.

05.03.2026 15:13 👍 2 🔁 0 💬 1 📌 0

instructive to compare the default outputs in diff. languages across a range of dev tasks.

That shows you where the quality floor is, and what you get by default if you don’t have strict guidance or prompts covering it.

05.03.2026 13:38 👍 3 🔁 0 💬 0 📌 0

Language choice def matters, and varies somewhat between models.

I’ve not tested Go so I’m not sure where it tends to sit support-wise, but generally speaking Python is nearly always best supported & Rust tends to sit in the middle of the pack.

They both improve with guidance, but it’s especially

05.03.2026 13:38 👍 4 🔁 0 💬 1 📌 0

I’m not sure about YouTube but I assume it’s a combination of the format, the types of video that gain most views, huge volume making switching cheap & easy, and conveying that in a single image.

Closest analogy I can think of is those cheap weekly soap/gossip-focused magazine covers in newsagents.

05.03.2026 09:20 👍 0 🔁 0 💬 0 📌 0

1. A short thread on a Bluesky phenomenon that might be described as "They are a dead-eyed cultist who must be cast out lest the heresy take root!" OP has blocked me for mocking them - I'd usually obscure their name but since they themselves were quote-dunking to demand someone else be blocked ...

04.03.2026 13:57 👍 692 🔁 153 💬 54 📌 81

This is conflating two related but separate things.

Yes, the questions have been around for a while across all models.

The question as posed was about an increase in their number, not claiming they were new.

04.03.2026 18:56 👍 2 🔁 0 💬 1 📌 0

Interesting… I wonder if this is a direct result of OpenAI introducing advertising to ChatGPT?

Pretty much every website or app that relies on Ad revenue introduces UI patterns designed to increase use & retention, with a goal of serving more ads in the process.

It’s hard to conclude otherwise.

04.03.2026 18:16 👍 4 🔁 0 💬 1 📌 0

“Now we have a faster horse”, to shred that infamous Ford quote.

03.03.2026 15:38 👍 0 🔁 0 💬 0 📌 0

How Claude remembers your project - Claude Code Docs

Give Claude persistent instructions with CLAUDE.md files, and let Claude accumulate learnings automatically with auto memory.

The docs do describe nested directory support, and also multiple files outside your project (for cross project content):

“CLAUDE.md files in subdirectories load on demand when Claude reads files in those directories.”

code.claude.com/docs/en/memo...

03.03.2026 15:24 👍 1 🔁 0 💬 0 📌 0
