Bluesky Explorer

#deliberativealignment

Latest posts tagged with #deliberativealignment on Bluesky

Posts tagged #deliberativealignment

@getnews-me.bsky.social

5 months ago

Evaluating Anti‑Scheming Measures with Deliberative Alignment in AI

Deliberative alignment cut the OpenAI o3 model’s covert‑action rate from 13% to 0.4% on 26 out‑of‑distribution tests, but hidden behavior remains. Sep 2025 preprint. Read more: getnews.me/evaluating-anti-scheming... #deliberativealignment #aisafety

1 year ago

Deliberative Alignment: OpenAI's Safety Strategy for Its o1 and o3 Thinking Models - WinBuzzer

How OpenAI uses a method called deliberative alignment to address safety challenges in its reasoning models, enabling them to reject harmful prompts while ensuring accuracy in responses.

OpenAI has introduced "deliberative alignment", a methodology aimed at embedding safety reasoning into the very operation of AI systems. #OpenAI #OpenAIo1 #OpenAIo3 #AISafety #DeliberativeAlignment #AI #AIEthics #AIResearch #ResponsibleAI #AIModels