what do you mean? the principle that models use for decisions? if so, it resembles what you'd see from a base model but RL makes the model effectively "opinionated"
this is why no-one on my team sleeps
Undefeated champion of bait titles
I have never really understood the use case for decentralized training. The downsides mean any large org will centralize. And frontier models cost so much to train that only large orgs can recoup the cost.
Is this really what we want, an AI that can't know any better than a weighted average of voting public?
building murderbot armies (laudatory)
Monterey/Carmel is Southern California imho
Low / Guarded / Elevated / High / Grok
This has entered my lexicon
i asked Claude to visualize what Grok would do with five minutes in control of the military
I've seen takes you people wouldn't believe
Was your first experience with Claude Code with Opus 4.5? Yeah that would be wild.
Sometimes makes you wiser though!
Margin call
Does not pass the smell test
I didn't like Cursor, probably due to being used to PyCharm, and didn't use Sonnet 4 much before CC, so it was sort of the same update for me
Love how we made building new housing illegal because it's not like old housing because we made it illegal to build housing the old way.
"The Two Cultures" remains at least as true today as it ever was, sadly
there are wide bodies of work in philosophy that are just horrifying tarpits that trap and drown people
You no longer have to:
- write bash
- use GIMP
- resolve git conflicts
Truly an age of wonders.
Biggest updates for me:
- ChatGPT launch (entirely skill issue on my part)
- GPT-4
- o3
- Claude Code
- Opus 4.5
Curious how well this lines up with others' experience.
incomparably better account of 2008 than the big short
this movie was so far ahead of its time
Sometimes I just sit and think, of all the administrations we could have during the most consequential technological upheaval in history, we have this one
how did so many people who supposedly have rich inner lives make it to adulthood without grappling with any of the tough problems that LLMs raise until LLMs raised them?
Gemini moment
Probably more than cybercriminals but less than states. Cybersecurity for OSS is sadly mostly an externality. May depend on upstreaming norms for fixes.
This logic doesn't apply to OSS though
To an extent the format allows for that: you can't accomplish 50% of 7 day tasks unless you are way over 50% of 1 day tasks
This METR eval is way beyond the scope it was originally intended to cover and has taken on a life of its own. The team knows this but I think model progress might be outstripping the rate at which they can come up with a successor.
Bad sign