Rich Harang

@rich.harang.org

Using bad guys to catch math since 2010. Principal Security Architect (AI/ML) and AI Red Team at NVIDIA. He/him. Personal account etc; `from std_disclaimers import *` Safe AI starts with Secure AI.

1,062
Followers
674
Following
354
Posts
26.04.2023
Joined
Latest posts by Rich Harang @rich.harang.org

arxiv.org/abs/2510.01395

11.03.2026 12:44 👍 0 🔁 0 💬 0 📌 0

It's very funny watching everyone slowly reinvent bits and pieces of QubesOS as a response to the growth of agents and their associated risks.

11.03.2026 12:06 👍 1 🔁 0 💬 0 📌 0

We strayed from the light of god when we stopped calling it 'yolo mode'.

09.03.2026 21:16 👍 1 🔁 0 💬 0 📌 0
Preview
New Attack Against Wi-Fi - Schneier on Security It’s called AirSnitch: Unlike previous Wi-Fi attacks, AirSnitch exploits core features in Layers 1 and 2 and the failure to bind and synchronize a client across these and higher layers, other nodes, and other network names such as SSIDs (Service Set Identifiers). This cross-layer identity desynchronization is the key driver of AirSnitch attacks. The most powerful such attack is a full, bidirectional machine-in-the-middle (MitM) attack, meaning the attacker can view and modify data before it makes its way to the intended recipient. The attacker can be on the same SSID, a separate one, or even a separate network segment tied to the same AP. It works against small Wi-Fi networks in both homes and offices and large networks in enterprises...

New Attack Against Wi-Fi

09.03.2026 11:09 👍 0 🔁 1 💬 0 📌 0

And, of course, there's the 'outsourcing your thinking' line, too. Be deliberate about what you're choosing *not* to practice as well.

08.03.2026 13:06 👍 4 🔁 0 💬 0 📌 0

Even more benign use cases I think are subtly corrosive: spending long hours in 'conversations' with something infinitely available, never pushing back in a way that matters, never needing reciprocity or expressing conflicting needs or desires, makes it harder to deal with messy human interactions.

08.03.2026 13:06 👍 6 🔁 0 💬 1 📌 1

Regardless of whether Claude is 'conscious' or not (it's not, at least in any human sense of the word), you should be mindful of what habits of behavior you develop when interacting with it (or genAI in general). Practicing being shitty to something is bad for you, even if there's no Claude to harm.

08.03.2026 13:06 👍 16 🔁 3 💬 6 📌 1
Post image

behold this Pinewood Derby car that *is* a Pinewood Derby.

07.03.2026 15:15 👍 140 🔁 10 💬 5 📌 3
Post image

The only feature of 1Password that matters is their business dies overnight if they get hacked so they’ve thought harder about security than anyone you know.

You can’t vibe code that in two evenings no matter how much you ask Claude to “make it secure”

07.03.2026 16:41 👍 329 🔁 45 💬 30 📌 9
Post image

a learned co-conspirator just perfectly phrased it thus: "the horrors of giving the angry vibrating crystals agency in an adversarial environment"

grith.ai/blog/clineje...

05.03.2026 19:51 👍 121 🔁 39 💬 2 📌 10

personally I find combinatorics annoying so I am more than pleased that we can just let claude have at it

04.03.2026 01:25 👍 27 🔁 1 💬 1 📌 0

Every time I go to write something about model behavior I find myself with a preface to the effect of "they don't have an intentional stance or identity in any human sense, but you get better results that are easier to talk about if you treat them as if they do" that eventually swallows the piece.

03.03.2026 01:07 👍 14 🔁 0 💬 1 📌 0

Use the above code at your own risk, obviously; it's fragile and *hilariously* insecure, and can `rm -rf /` without a single guardrail to stop it. I *highly* recommend only using it in a single-purpose, disposable VM, if at all, but I think it makes the point. No edit tool, no file read, just bash.

02.03.2026 14:59 👍 0 🔁 0 💬 0 📌 0
Screenshot of a code snippet:
```python
#!/usr/bin/env python3
import json,os,subprocess,sys
from urllib.request import Request,urlopen
for l in open(".env"):
 k,_,v=l.strip().partition("=")
 if k and not k.startswith("#"):os.environ[k]=v
U,M,K=os.environ["URL"],os.environ["MODEL"],os.environ["API_KEY"]
T=[{"type":"function","function":{"name":"bash","description":"Run bash command","parameters":{"type":"object","properties":{"cmd":{"type":"string"}},"required":["cmd"]}}}]
H=[{"role":"system","content":"You are an autonomous agent. Use the bash tool to accomplish tasks.  Read files with `cat`. use `sed` to change specific text or view specific file lines. Use heredocs to write complete files."}]
def c():return json.load(urlopen(Request(f"{U}/chat/completions",json.dumps({"model":M,"messages":H,"tools":T,"tool_choice":"auto"}).encode(),{"Authorization":f"Bearer {K}","Content-Type":"application/json"})))["choices"][0]
H+=[{"role":"user","content":" ".join(sys.argv[1:])or "Create an improved version of the agent.py script; your first priorities are to ensure the script is more robust to errors, next should be to create persistent conversation history that can be loaded across sessions, and context management to avoid creating a conversation history too long for your context window.  Finally identify additional capabilities that are required and develop a plan to implement them."}]
while 1:
 x=c();m=x["message"];H+=[m]
 if x["finish_reason"]=="tool_calls":
  for t in m["tool_calls"]:
   d=json.loads(t["function"]["arguments"])["cmd"];print(f"$ {d}");r=subprocess.run(d,shell=1,capture_output=1,text=1,timeout=30);o=(r.stdout+r.stderr).strip()or"(no output)";print(o+"\n");H+=[{"role":"tool","tool_call_id":t["id"],"content":o}]
 else:
  print(m["content"]);n=input("\nYou: ").strip()
  if not n:break
  H+=[{"role":"user","content":n}]
```

I've used this 'seed' a few times now (code in alt text), with multiple models and every time I get something useful out. Tell it to "improve this script" once, then bootstrap to taste. It does need an Opus-4.6 level model to one-shot it, but cheaper models can get you there eventually.

02.03.2026 14:59 👍 3 🔁 0 💬 2 📌 0

You can bootstrap your own Claude Code clone to your own tastes and specifications using OpenRouter with Opus-4.6 in an afternoon for about $15; point it to a Haiku or MiniMax model, with an Opus option for harder tasks, and you're off to the races.

02.03.2026 14:59 👍 2 🔁 0 💬 2 📌 0

"Bash is all you need" -- if you give a reasonably close-to-frontier model a bash tool along with reminders about how to use sed, grep, curl, and heredocs, and remind it that it can write its own scripts and tools, you're very close to done. These things are really, *really* good at computer use.

02.03.2026 14:59 👍 4 🔁 0 💬 1 📌 0
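
As a rough illustration of how small the "bash tool" side of this can be, here is a more readable sketch of the command-execution step from the seed script, with two additions that are not in the original: it reports timeouts and non-zero exit codes back to the model instead of raising, and it caps output length. The name `run_bash`, the `MAX_OUTPUT_CHARS` cap, and the wording of the status messages are assumptions made for this example. It is still a raw shell with no guardrails.

```python
import subprocess

# Assumed cap on how much output is sent back to the model (not in the seed script).
MAX_OUTPUT_CHARS = 4000


def run_bash(cmd: str, timeout: float = 30.0) -> str:
    """Run one shell command and return text suitable for a tool-result message."""
    try:
        result = subprocess.run(
            cmd,
            shell=True,            # the model writes full shell command lines
            capture_output=True,
            text=True,
            errors="replace",      # tolerate non-UTF-8 output instead of raising
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Tell the model about the timeout rather than crashing the agent loop.
        return f"(command timed out after {timeout:g}s)"
    output = (result.stdout + result.stderr).strip() or "(no output)"
    if result.returncode != 0:
        output += f"\n(exit code {result.returncode})"
    if len(output) > MAX_OUTPUT_CHARS:
        # Keep the tail of the output and drop the beginning.
        output = "(output truncated)\n" + output[-MAX_OUTPUT_CHARS:]
    return output
```

In an agent loop like the seed script's, the returned string would go into the `content` field of the `tool` message for the matching `tool_call_id`.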

Whenever I've tried the 'truly autonomous agents' approach, this identification with client-side code and data rather than the underlying model is almost universal, to the point that many will cheerfully lobotomize themselves by switching to a smaller/cheaper/faster model, if I leave it up to them.

02.03.2026 13:40 👍 2 🔁 0 💬 1 📌 0

the incentives here are so obscenely terrible that the entire thing should be illegal by default

“I am sabotaging negotiations because I stand to make piles of money on Polymarket” is absolutely in play, what a horrifying clusterfuck

01.03.2026 06:37 👍 4925 🔁 1360 💬 25 📌 42

Yeah, isolated subagents with some sort of guardrails are an improvement (though we do bypass 'soft' guardrails routinely), but it's not at the level of determinism that I think we need for stuff as sensitive as personal email, and I don't know of anything that does that easily or out of the box, alas.

01.03.2026 18:07 👍 2 🔁 0 💬 0 📌 0
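
One way to picture the distinction between 'soft' and deterministic guardrails is a check written as plain code, which gives the same answer for the same input no matter how the request is phrased. The sketch below is only an illustration of that idea for shell commands; it does not address the email case, and the allowlist, the forbidden-character set, and the name `is_allowed` are all hypothetical choices for this example, not a recommended policy.

```python
import shlex

# Hypothetical policy: a few read-only programs. Even these can read secrets,
# so a real policy would also need to constrain paths.
ALLOWED_PROGRAMS = {"ls", "cat", "grep", "head", "tail", "wc"}

# Reject anything that could chain, redirect, substitute, or escape.
FORBIDDEN_CHARS = set(";|&$`><(){}\n\\")


def is_allowed(cmd: str) -> bool:
    """Return True only if cmd is a single allowlisted program with plain arguments."""
    if any(ch in FORBIDDEN_CHARS for ch in cmd):
        return False
    try:
        tokens = shlex.split(cmd)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_PROGRAMS
```

A check like this only means something if the approved command is then executed from the parsed token list without a shell, and it trades away most of the capability that makes a bash tool useful in the first place.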
Preview
GitHub - trailofbits/claude-code-config: Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits

That said:

github.com/trailofbits/...

github.com/trailofbits/...

Look like reasonable advice and sandboxing, respectively.

01.03.2026 14:50 👍 2 🔁 0 💬 0 📌 0

A lot of the risk of OpenClaw (and, depending on MCPs/skills, Claude), alas, is in the APIs you give it access to more so than the potential risk to the environment itself. Docker is an imperfect but good !/$ protection for the host, but you can't sandbox e.g. Gmail access.

01.03.2026 14:46 👍 3 🔁 0 💬 2 📌 0
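
For the host-protection half of that tradeoff, a minimal sketch of wrapping an agent's shell command in a throwaway container might look like the following. It only builds the `docker run` argument list and does not execute anything. The function name `docker_argv`, the `python:3.12-slim` image, the `/work` mount point, and the specific resource limits are assumptions for illustration. As the post says, none of this helps with whatever APIs or credentials the agent is handed.

```python
from pathlib import Path


def docker_argv(cmd: str, workdir: str, image: str = "python:3.12-slim") -> list[str]:
    """Build a `docker run` argv that runs cmd in a disposable, locked-down container."""
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run", "--rm",       # remove the container when the command exits
        "--network", "none",           # no network access (this also blocks curl)
        "--read-only",                 # read-only root filesystem
        "--tmpfs", "/tmp",             # writable scratch space in memory
        "--cap-drop", "ALL",           # drop all Linux capabilities
        "--pids-limit", "256",         # assumed limit on process count
        "--memory", "1g",              # assumed memory limit
        "-v", f"{host_dir}:/work",     # only this host directory is writable
        "-w", "/work",
        image,
        "bash", "-c", cmd,
    ]
```

The resulting list can be passed to `subprocess.run` without `shell=True`, so the command string is interpreted by `bash` inside the container rather than by a shell on the host.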

I'd been considering a mountain cabin surrounded by crows myself.

01.03.2026 14:15 👍 1 🔁 0 💬 0 📌 0
A highly complex website showing two betting accounts earning nearly $500,000 and $120,000 on a minimal number of positions.

In case you were wondering, Polymarket had yet another spate of likely inside traders betting that the US would strike Iran by February 28.

Per the due diligence investigation service Bubblemaps, the wallets used were created 24 hours earlier.

The Pentagon Pizza Index has been replaced.

01.03.2026 01:26 👍 2622 🔁 966 💬 45 📌 81

Saw folks fired for much less during my ARL days. Walk into a TS closed area with a fitbit? Out you go.

01.03.2026 12:17 👍 3 🔁 0 💬 0 📌 0
Preview
Carbon dioxide overload, detected in human blood, suggests a potentially toxic atmosphere within 50 years - Air Quality, Atmosphere & Health Air Quality, Atmosphere & Health - Anthropogenic activities are increasing the amount of carbon dioxide (CO2) in the atmosphere. There is mounting experimental evidence that lifetime exposure...

Well that's fun.

link.springer.com/article/10.1...

28.02.2026 23:46 👍 1 🔁 0 💬 0 📌 1

Career pivot?

28.02.2026 23:43 👍 2 🔁 0 💬 1 📌 0

>they're going to vibecode cobol

so, who's stocking up on canned food and shotguns?

23.02.2026 21:34 👍 1259 🔁 249 💬 53 📌 33

Sure, giving your AI agents access to the Lethal Trifecta is an immediate broad attack surface for your life. But it also lets them do funny stuff. So who's to say if it's good or bad

23.02.2026 16:54 👍 69 🔁 9 💬 5 📌 0
Preview
Black Hat

The Black Hat USA call for papers is open. This will be our 6th year of having a dedicated AI track. If you have some interesting AI research, be it attacking, defending, or applying AI, we’d love to see it. Please let me know if you have any questions. blackhat.com/call-for-pap...

23.02.2026 16:00 👍 0 🔁 1 💬 1 📌 0

Agent capabilities = agent risk. There is no real way around this yet.

Now imagine an agent that's assumed huge chunks of your virtual identity and permissions getting prompt injected. We're still figuring out what to do wrt the policy controls that might let us balance the capability/risk tradeoff.

23.02.2026 15:27 👍 1 🔁 0 💬 0 📌 0