
Alex Becker

@gputhief

Safeguards @ Anthropic, but anything posted here is my personal view
San Francisco
Blog: https://alexcbecker.net/blog.html

84 Followers
145 Following
64 Posts
Joined 25.01.2026

Latest posts by Alex Becker @gputhief

what do you mean? the principle that models use for decisions? if so, it resembles what you'd see from a base model but RL makes the model effectively "opinionated"

11.03.2026 05:05 👍 0 🔁 0 💬 1 📌 0

this is why no-one on my team sleeps

11.03.2026 04:45 👍 2 🔁 0 💬 0 📌 0

Undefeated champion of bait titles

11.03.2026 01:03 👍 1 🔁 1 💬 0 📌 0

I have never really understood the use case for decentralized training. The downsides mean any large org will centralize. And frontier models cost so much to train that only large orgs can recoup the cost.

11.03.2026 01:00 👍 0 🔁 0 💬 0 📌 0

Is this really what we want, an AI that can’t know any better than a weighted average of voting public?

11.03.2026 00:35 👍 0 🔁 0 💬 2 📌 0

building murderbot armies (laudatory)

10.03.2026 06:52 👍 1 🔁 0 💬 0 📌 0

Monterey/Carmel is Southern California imho

09.03.2026 05:57 👍 3 🔁 0 💬 1 📌 0

Low / Guarded / Elevated / High / Grok

This has entered my lexicon

09.03.2026 05:55 👍 1 🔁 0 💬 0 📌 0

i asked Claude to visualize what Grok would do with five minutes in control of the military

09.03.2026 05:45 👍 17 🔁 2 💬 3 📌 1

I’ve seen takes you people wouldn’t believe

09.03.2026 05:09 👍 1 🔁 0 💬 0 📌 0

Was your first experience with Claude Code with Opus 4.5? Yeah that would be wild.

09.03.2026 05:05 👍 2 🔁 0 💬 1 📌 0

Sometimes makes you wiser though!

09.03.2026 04:17 👍 2 🔁 0 💬 0 📌 0

Margin call

09.03.2026 03:49 👍 3 🔁 0 💬 0 📌 0

Does not pass the smell test

09.03.2026 01:22 👍 0 🔁 0 💬 0 📌 0

I didn’t like Cursor probably due to being used to PyCharm and didn’t use Sonnet 4 much before CC so it was sort of the same update for me

09.03.2026 01:16 👍 1 🔁 0 💬 1 📌 0

Love how we made building new housing illegal because it's not like old housing because we made it illegal to build housing the old way.

09.03.2026 00:28 👍 2 🔁 0 💬 1 📌 0

"The Two Cultures" remains at least as true today as it ever was, sadly

08.03.2026 23:59 👍 2 🔁 0 💬 0 📌 0

there are wide bodies of work in philosophy that are just horrifying tarpits that trap and drown people

08.03.2026 19:53 👍 44 🔁 2 💬 4 📌 0

You no longer have to:
- write bash
- use GIMP
- resolve git conflicts

Truly an age of wonders.

08.03.2026 23:38 👍 2 🔁 1 💬 0 📌 0

Biggest updates for me:
- ChatGPT launch (entirely skill issue on my part)
- GPT-4
- o3
- Claude Code
- Opus 4.5

Curious how well this lines up with others' experience.

08.03.2026 23:22 👍 3 🔁 0 💬 3 📌 0

incomparably better account of 2008 than the big short

08.03.2026 20:49 👍 2 🔁 0 💬 1 📌 0

this movie was so far ahead of its time

08.03.2026 19:37 👍 13 🔁 1 💬 2 📌 0

Sometimes I just sit and think, of all the administrations we could have during the most consequential technological upheaval in history, we have this one

08.03.2026 20:46 👍 0 🔁 0 💬 0 📌 0

how did so many people who supposedly have rich inner lives make it to adulthood without grappling with any of the tough problems that LLMs raise until LLMs raised them?

08.03.2026 18:45 👍 102 🔁 4 💬 8 📌 5

Gemini moment

08.03.2026 01:41 👍 3 🔁 0 💬 0 📌 0

Probably more than cybercriminals but less than states. Cybersecurity for OSS is sadly mostly an externality. May depend on upstreaming norms for fixes.

07.03.2026 17:00 👍 2 🔁 0 💬 0 📌 0

This logic doesn’t apply to OSS though

07.03.2026 14:51 👍 1 🔁 0 💬 1 📌 0

To an extent the format allows for that: you can’t accomplish 50% of 7 day tasks unless you are way over 50% of 1 day tasks

06.03.2026 16:33 👍 1 🔁 0 💬 0 📌 0

This METR eval is way beyond the scope it was originally intended to cover and has taken on a life of its own. The team knows this but I think model progress might be outstripping the rate at which they can come up with a successor.

06.03.2026 16:32 👍 1 🔁 0 💬 0 📌 0

Bad sign

06.03.2026 07:07 👍 5 🔁 1 💬 1 📌 0