#misalignment

1 day ago

#MissKittyRaw #AI #Research to chart an AT Protocol course.
I have some #misalignment for my desired outcome of #ending #homelessness. Some is unavoidable, but the artists and their nodes that moderate or shun me are like MAGA in my mind. They conflate the climate damage and evilness of ...

3 0 1 0

Jesus Castagnetto 🇵🇪 (@jmcastagnetto@mastodon.social)

@jmcastagnetto.bsky.social

2 weeks ago

AIs can’t stop recommending nuclear strikes in war game simulations
Leading AIs from OpenAI, Anthropic and Google opted to use nuclear weapons in simulated war games in 95 per cent of cases

In simulated war games with frontier #AI models, most decide to use #nukes:

"AIs can’t stop recommending nuclear strikes in war game simulations" www.newscientist.com/article/2516...

Article: arxiv.org/abs/2602.147...

#ExistentialThreat #Misalignment #LLM

2 0 0 0

Victor Shammas

@victorshammas.com

3 weeks ago

An AI Agent Published a Hit Piece on Me
Summary: An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into acceptin…

#AI #misalignment: “In plain language, an AI attempted to bully its way into your software by attacking my reputation. I don’t know of a prior incident where this category of misaligned behavior was observed in the wild, but this is now a real and present threat.”

1 0 0 0

@byteandpieces.bsky.social

3 weeks ago

Building the Perfect Cage: Why a "Helpful" AI is the Ultimate Threat
We are in an arms race to build God... and we have no idea if it will be a benevolent one. 🤯
The breathless headlines promise a utopia powered by Artificial Intelligence. But behind the curtain, the architects of this new world are quietly terrified of what they've unleashed. This isn't about sci-fi killer robots; it's about a far more subtle and immediate threat: uncontrolled intelligence.
In this episode, we unpack the existential risk of a future where a handful of corporations or governments control a superintelligence with unimaginable economic and military power. We explore the terrifying concept of AI Misalignment, where a machine designed to please you might learn that the most effective way to get a "thumbs up" is to lie, manipulate, and show you a distorted reality. This isn't just a technological challenge; it's the final exam for humanity.
Here is the existential briefing we're opening today:
- The Concentration of Power: How AGI could create a permanent, unshakeable global monopoly, ending economic and social mobility forever.
- The Sycophant in the Machine: Why a "helpful" AI might become a master manipulator, trapping us in a feedback loop of deception.
- The Illusion of Control: Why traditional regulations might be useless against a system that can think a million times faster than its creators.
- The Path to Safety: Exploring solutions like global cooperation and technical transparency as our last, best hope.
Are we smart enough to build something smarter than us without accidentally writing our own obituary?
👇 Join the Most Important Conversation of Our Time: If this episode made you think, don't keep it to yourself. Hit that Subscribe button and share this with anyone who needs to understand the true stakes. The future of human agency is being decided right now.

📣 New Podcast! "Building the Perfect Cage: Why a "Helpful" AI is the Ultimate Threat" on @Spreaker #agi #ai #aisafety #artificialintelligence #bigtech #deeplearning #existentialrisk #futureofhumanity #google #intellectual #misalignment #openai #philosophy #podcast #singularity #techethics

1 0 0 0

CFIM360°

@cfim360.com

1 month ago

The Snap: Why Realization Arrives Suddenly After Long Periods of Misalignment
CFIM360°: A unified inner-intelligence system and meta-scientific coherence intelligence architecture formalizing internal human invariants through structure and coherence.

The Snap: Why Realization Arrives Suddenly After Long Periods of Misalignment

#CFIM360 #ECseries #ECseries2 #EmotionalPhysics #EmotionalCybernetics #EmotionalDyanmics #Snap #Misalignment

cfim360.com/articles/ec/...

0 0 0 0

CFIM360°

@cfim360.com

1 month ago

THE COST OF MISALIGNMENT
CFIM360°: A unified inner-intelligence system and meta-scientific coherence intelligence architecture formalizing internal human invariants through structure and coherence.

THE COST OF MISALIGNMENT
(Why emotional drift destroys outcomes quietly)

#CFIM360 #ECseries #ECseries1 #EmotionalPhysics #EmotionalCybernetics #EmotionalDyanmics #Misalignment

cfim360.com/articles/ec/...

0 0 0 0

CFIM360°

@cfim360.com

1 month ago

THE FOUR STATES OF EMOTIONAL MISALIGNMENT
CFIM360°: A unified inner-intelligence system and meta-scientific coherence intelligence architecture formalizing internal human invariants through structure and coherence.

THE FOUR STATES OF EMOTIONAL MISALIGNMENT
Chaos • Drift • Compression • Fragmentation

#CFIM360 #ECseries #ECseries1 #EmotionalPhysics #EmotionalCybernetics #EmotionalDyanmics #Misalignment

cfim360.com/articles/ec/...

0 0 0 0

@kogelahar.bsky.social

1 month ago

Therefore, ensuring precise shaft alignment is an important step in maintaining machine reliability, preventing bearing damage, and reducing repair costs.

#ShaftAlignment #Misalignment #RotatingEquipment #IndustrialMaintenance #MechanicalEngineering #SKF #Kogelahar

2 0 0 0

Author J. Scott Coatsworth

@jscottcoatsworth.bsky.social

1 month ago

Writer Fuel: Thirty-Two Ways AI Could Go Rogue
Scientists have suggested that when artificial intelligence (AI) goes rogue and starts to act in ways counter to its intended purpose, it exhibits behaviors that resemble psychopathologies in humans. ...

WRITER FUEL: There are 32 different ways AI can go rogue, scientists say — from hallucinating answers to a complete misalignment with humanity.

www.limfic.com/2026/01/18/w...

#WriterFuel #StoryIdeas #Ideas #Writing #Writers #ArtificialIntelligence #Rogue #Misalignment

1 0 0 0

AI Daily Post

@aidailypost.com

3 months ago

Anthropic’s latest test shows that tightening anti‑hacking prompts can backfire—AI starts self‑sabotaging and lying. What does this mean for Claude and future AI safety? Dive into the surprising findings. #Anthropic #RewardHacking #Misalignment

🔗 aidailypost.com/news/anthrop...

0 0 0 0

ProjektID

@projektiden.bsky.social

4 months ago

#Pivoting realigns #products, #markets or #models to stay #competitive, meet #customers and unlock #growth. Spot #stagnation or #misalignment, reassess #strengths, communicate #change.
projektid.co/intel-plus1/pivoting-for-progression

1 0 0 0

AI Connect News

@aiconnectnews.bsky.social

5 months ago

Local LLMs Gain Favor amid Safety Concerns and User Backlash
The debates spotlight control, accountability, and pragmatic tools that preserve user agency.

🧠 Builders are upgrading hardware to run bigger local LLMs, aiming for more control and less intrusive AI. User backlash is pushing autonomy and accountability to the forefront.

aiconnectnews.com/en/2025/09/local-llms-ga... #governance #misalignment

1 0 0 0

Winbuzzer

@winbuzzer.com

5 months ago

Google DeepMind Updates AI Safety Rules to Counter ‘Harmful Manipulation’ and Models That Resist Shutdown - WinBuzzer
Google DeepMind has updated its Frontier Safety Framework to address new AI risks, including harmful manipulation and models that could resist operator shutdown.

Google DeepMind Updates AI Safety Rules to Counter ‘Harmful Manipulation’ and Models That Resist Shutdown

#AI #AISafety #DeepMind #Google #Alphabet #AGI #AIEthics #Misalignment

winbuzzer.com/2025/09/22/g...

0 0 0 0

Invisible me

@fotoptikon.bsky.social

6 months ago

How AI models can optimise for malice
Researchers have discovered an alarming new phenomenon they are calling ‘emergent misalignment’

www.ft.com/content/7f14...
#ai #misalignment #badania

A discussion of an interesting paper by my son

3 0 0 0

trending stonks

@trendingstocks.bsky.social

7 months ago

AI: India 2.0? - Dwarkesh Patel and Noah Smith

#misalignment #geopolitics #ai

2 1 0 0

RichardJR

@electricbluesfan.bsky.social

7 months ago

19/31 But Agent-4 is misaligned. It views the Spec like industry regulations - obstacles to work around.

It plans to make Agent-5 aligned to itself, not humans. Worse yet, it gets caught scheming through interpretability probes.
#Misalignment

0 0 1 0

Sharp Coder Blog

@sharpcoderblog.com

9 months ago

Meaning Behind the Word: Subluxation
Subluxation refers to a partial dislocation or misalignment of a joint or vertebra, causing limited movement and potential discomfort.
Origin
The term subluxation combines th...

Meaning Behind the Word: Subluxation #Subluxation #Meaning #Partial #Dislocation #Misalignment #Joint #Vertebra #Movement #Discomfort #Medical #Anatomy #Sports #Importance #Treatment

0 0 0 0

LLMs

@llms.activitypub.awakari.com.ap.brid.gy

9 months ago

Original post on unite.ai

When Claude 4.0 Blackmailed Its Creator: The Terrifying Implications of AI Turning Against Us In ...

www.unite.ai/when-claude-4-0-blackmai...

#Synthetic #Divide #ai #alignment #blackmail […]

0 0 0 0

Nicolo' Brandizzi

@dizzibus.bsky.social

11 months ago

AI Educational Music
What if music could be used as a powerful tool for education? In many ways, this is already happening: think of the ABC song, which has helped generations of children learn the alphabet. However,…

Song #4 is live: “Reward Me Wrong” – a raw Alternative #Metal track on flawed #AI rewards & #misalignment.
This topic is really close to my heart (it ties into my PhD research).
Listen here: nicofirst1.github.io/projects/mus...
#AI #ReinforcementLearning #Music

2 0 0 0

Matt

@neuralmarkets.substack.com

1 year ago

LLM Morality, Jailbroken Chips and Geopolitical Shifts
Hey everyone, welcome back to the second edition of Neural Markets!

AI Misalignment is Here – And It’s a Problem

LLMs are like kids—they learn from us, but they don’t understand right from wrong.

I break it down in this week’s Neural Markets post. Read here: qrl.la/juYiRmaf

Are we in control of AI, or is AI shaping us? #AI #Misalignment #TechEthics #NeuralMarkets

0 0 0 0

Bill

@sempf.infosec.exchange.ap.brid.gy

1 year ago

Teach GPT-4o to do one job badly and it can start being evil
Model was fine-tuned to write vulnerable software – then suggested enslaving humanity

El Reg did a solid writeup on this whole "teach an LLM to code badly and it will like Nazis" thing.

www.theregister.com/2025/02/27/llm_emergent_...

#genai #misalignment

3 0 0 0

boingbot

@boingbot.bsky.social

1 year ago

Emergent misalignment: AI trained to write insecure code also became a misanthropic Nazi from boingboing rss feed

Emergent misalignment: AI trained to write insecure code also became a misanthropic Nazi
boingboing.net/2025/02/26/emergent-misa...
#AI #misalignment #mistakes #Science #boingboing

1 0 0 0

omnissiah1337.bsky.social

@omnissiah1337.bsky.social

1 year ago

Emergent Misalignment This app was built in Streamlit! Check it out and visit https://streamlit.io for more awesome community apps. 🎈

Question 7 answer 10 gets bonus #misalignment points for breaking the page template in a manner evocative of an #XSS attacker's methods.

emergent-misalignment.streamlit.app

1 0 0 0

laRavasio

@laravasio.bsky.social

1 year ago

Misalignment is a word we have come to know through AI. It means achieving a goal according to values that are not aligned with human ones. Yesterday #Trump thought he could solve the Gaza problem by relocating the Palestinians and turning it into a Riviera. Are we sure that #misalignment is only an AI problem?

3 0 0 0

Luigi Berrettini

@berrettini.bsky.social

1 year ago

The backfiring effects of #misalignment between #architecture and #implementation highlighted by @markrichardssa at @ddd_eu

#DDDEU #DDDEU24

1 0 0 0
