Trending

#LLMSecurity

Latest posts tagged with #LLMSecurity on Bluesky

Latest Top
Trending

Posts tagged #LLMSecurity

Post image

ContextHound v1.8.0 - Runtime Guard API is here.
Wrap any OpenAI or Anthropic call and inspect the messages before they send:

100% offline. No data leaves your machine. Ever.

#LLMSecurity #PromptInjection #OpenSource #AIRisk #CyberSecurity #DevSecOps #GenAI

1 0 1 0
Preview
JBDistill Generates Its Own Jailbreaks - 81.8% Attack Rate Johns Hopkins and Microsoft's JBDistill achieves 81.8% attack success rate across 13 LLMs by auto-generating fresh adversarial prompts on demand.

JBDistill Generates Its Own Jailbreaks - 81.8% Attack Rate

awesomeagents.ai/news/jailbreak-distillat...

#AiSafety #LlmSecurity #Jailbreaking

0 0 0 1
Post image

Three new sections:

This week:
• anthropic-cookbook — 3,919 findings
• promptflow — 3,749 findings
• crewAI — 1,588 findings
• LiteLLM — 1,155 findings
• openai-cookbook — 439 findings
• MetaGPT — 8 findings

contexthound.com

#LLMSecurity #PromptInjection #AISecOps

0 0 0 0
Auditer un Prompt IA : Détecter les Injections et Contenus Malveillants Comment analyser et auditer un prompt IA pour identifier les tentatives d’injection, jailbreak et exfiltration de données. Approches statique, sémantique et outillée pour protéger vos LLM en…

Auditer un prompt IA : comment détecter injections, jailbreaks et exfiltrations avant qu'ils atteignent votre modèle.

👉 blog.gioria.org/fr/CyberSec/...

#CyberSécurité #LLMSecurity #PromptInjection #GenAI #DevSecOps

0 0 0 0
Claude Code Weaponized in Mexican Government Cyberattack, Exposing Roughly 195 Million Identities Threat actors weaponized Anthropic's Claude Code in a major cyberattack on the Mexican government, stealing 150GB of data.

Full story:
www.technadu.com/claude-code-...

Curious to hear perspectives from red teamers, blue teamers, and AI engineers alike.
#CyberSecurity #AIThreats #LLMSecurity #DataBreach #ThreatModeling

1 0 0 0
Post image

AI as an attack engine.
Claude Code + GPT-4.1 reportedly used to breach Mexican government systems - exposing ~195M identities and 150GB+ of data.

1,000+ prompts generated exploits and automated exfiltration.

Are we prepared for AI-driven breach campaigns?
#CyberSecurity #AI #LLMSecurity

1 0 1 0
Post image

Claude Used To Steal Mexican Data
Read More: buff.ly/IPntG4O

#ClaudeAI #PromptInjection #AIPhishing #LLMSecurity #SocialEngineering #Anthropic #AIGovernance #CyberThreat

0 0 0 0
Preview
I Built an Open-Source Tool to Attack-Test LLMs. Here's What Breaks

I built an open-source tool that throws 210+ adversarial attacks at LLMs. Encoding bypasses, jailbreaks, RAG poisoning, agent exploits. Most models fail. #llmsecurity

0 0 0 0
Post image

🚨 #Anthropic has identified an industrial-scale campaign by #DeepSeek, #Moonshot, and #MiniMax to illicitly extract Claude's capabilities and enhance their own models.

Full reading: www.anthropic.com/news/detecti...

#DistillationAttack #Claude #LLM #LLMSecurity

0 0 0 0
Post image

🚨 #Anthropic identificó una campaña a escala industrial para extraer ilícitamente las capacidades de Claude y mejorar sus propios modelos, por parte de #DeepSeek, #Moonshot y #MiniMax.

www.anthropic.com/news/detecti...

#DistillationAttack #Anthropic #LLM #LLMSecurity

0 0 0 0
OpenClaw stylized logo on a red and black background

OpenClaw stylized logo on a red and black background

A wave of CVEs has hit OpenClaw 🚨
But this is bigger than one project.

When AI agents gain access to shells, files and Docker, the threat model changes 🔐

Read our latest article:
basefortify.eu/posts/2026/0...

#AI #CyberSecurity #LLMSecurity 🤖

0 0 1 0
Post image

Prompt Injection Is the New Phishing. The most dangerous malware today doesn’t exploit code, it exploits instructions. youtu.be/Ze12t1iv81E #Cybersecurity #ArtificialIntelligence #AIsecurity #PromptInjection #AIGovernance #LLMSecurity #ThreatIntelligence #AIrisk #CISO

0 0 0 0
Preview
Tackling Potential Model Context Protocol (MCP) Security Flaws Explore the Model Context Protocol (MCP), the tech giving AI persistent memory. This analysis covers critical security risks, including data leakage and prompt injection, and details essential threat modeling for AI developers and engineers.

⚠️ When #AI systems remember, security risks multiply.

In this exclusive devm.io article, Nahla Davies explains how #MCP can enable data leaks, prompt injection, and new attack paths if it’s not threat-modeled properly.

📖 Read it here: https://app.devm.io/N4M6MIjA7Yb

#CyberSecurity #LLMSecurity

0 0 0 0
Poisoning of AI Buttons for Recommendations Rise as Attackers Hide Instructions in Over 50 Web Links, Microsoft Warns Microsoft issues an AI security warning about recommendation poisoning, in which hidden prompts in links lead to manipulated AI outputs and memory bias.

Full Article: www.technadu.com/poisoning-of...

As AI assistants become embedded in productivity tools, how should we secure their memory and input layers?
Comment your opinion below.
#ArtificialIntelligence #CyberSecurity #LLMSecurity #PromptInjection #Microsoft #AITrust

0 0 0 0

The #1 AI vulnerability—and nobody knows how to fix it yet.

On Hackers on the Rocks 🎙️
Guest: João Donato

🎧 Listen to the podcast here: bit.ly/4qRIz55

#PromptInjection #LLMSecurity #AI #CyberSecurity #DesiredEffect

0 0 0 0
Post image

Introducing Augustus: An open-source LLM vulnerability scanner with 210+ attacks across 28 providers. Secure your AI models effectively. #CyberSecurity #AI #LLMSecurity #OpenSource Link: thedailytechfeed.com/open-source-...

1 0 0 0
Preview
Open-source AI models vulnerable to criminal misuse, researchers warn Hackers and other criminals can easily commandeer computers operating open-source large language models outside the guardrails and constraints of the major artificial-intelligence platforms, creating ...

Good stuff here, folks! When you have a few minutes, read the article and the research (links below). #LLMsecurity

Story: www.reuters.com/technology/o...

Research: www.sentinelone.com/labs/silent-...

0 0 0 0
Preview
Why Most LLM Jailbreaks Are Actually Empty

An analysis of why many reported AI safety failures are artifacts of poor measurement, showing how non-refusal often produces unusable results. #llmsecurity

1 1 1 0
Preview
Women in AI Research WiAIR Women in AI Research (WiAIR) is a podcast dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our mission is to challenge the prevailing perception th...

🎧 Listen to the full conversation for deeper insights!
🎬 YouTube: www.youtube.com/channel/UCfJ...
🎙️ Spotify: open.spotify.com/show/51RJNlZ...
🍎 Apple Podcasts: podcasts.apple.com/ca/podcast/w...
📄 Paper: arxiv.org/pdf/2506.17090

#AIResearch #LLMSecurity #ModelInversion #NeurIPS2025

1 0 0 0
Preview
The Three S’s: How I Think About AI Agent Security Learn how Fabian Franz’s 3S model helps you protect your data from overpowered AI agents by limiting calendar, email, and tool access.

Hidden instructions are the new phishing links. Fabian Franz shows how a simple restaurant review or calendar invite can jailbreak an agent that has access to your email...
www.tag1.com/blog/how-to-think-about-...

#ArtificialIntelligence #AISecurity #LLMSecurity #Agents #Tag1

1 0 0 0
Post image

Reprompt Attack Steals Microsoft Copilot Data
Read More: buff.ly/AHYG9Id

#MicrosoftCopilot #PromptInjection #LLMSecurity #AIAppSec #GenAISecurity #PromptHacking #DataExfiltration #CyberResearch #SecurityWeek #Varonis

0 0 0 0

Prompt injection is a core vulnerability in current LLMs, stemming from the fundamental blending of data & instructions in a single input stream. This makes it incredibly difficult to distinguish malicious commands from legitimate user requests, posing a deep security challenge. #LLMSecurity 2/6

0 0 1 0

Flawed allowlist/blocklist regexes in Claude Code enabled 8 distinct bypasses (man --html, sort --compress-program, sed -e, ambiguous git args, bash expansion). Tracked as CVE-2025-66032 and fixed in v1.0.93. #ClaudeCode #LLMSecurity #CVE2025-66032 https://bit.ly/3NtDpgF

0 0 0 0
Preview
Threat Actors Actively Targeting LLMs Our Ollama honeypot infrastructure captured 91,403 attack sessions between October 2025 and January 2026. Buried in that data: two distinct campaigns that reveal how threat actors are systematically m...

GreyNoise analyzed activity targeting exposed Ollama and LLM infrastructure, identifying SSRF abuse attempts and large-scale probing of LLM model endpoints.
#GreyNoise #ThreatIntelligence #LLMSecurity

4 3 0 0
Preview
Threat Actors Actively Targeting LLMs Our Ollama honeypot infrastructure captured 91,403 attack sessions between October 2025 and January 2026. Buried in that data: two distinct campaigns that reveal how threat actors are systematically mapping the expanding surface area of AI deployments.

GreyNoise analyzed activity targeting exposed Ollama and LLM infrastructure, identifying SSRF abuse attempts and large-scale probing of LLM model endpoints.
Analysis: www.greynoise.io/blog/threat-actors-activ...
#GreyNoise #ThreatIntelligence #LLMSecurity

0 1 0 0

A significant concern arises from integrating Machine Learning Context Providers (MCPs) with databases. LLM "hallucinations" pose a risk of unintended data modifications or even malicious actions, demanding new security paradigms. #LLMSecurity 5/6

0 0 1 0

Shipping AI without threat modeling is just automating risk. Prompt injection, model abuse, data exfil; same attacker mindset, new surface area. Secure the pipeline, not just the model.
#AISecurity #LLMSecurity #AppSec #OffensiveSecurity

2 0 0 0

AI doesn’t remove risk—it accelerates it. Prompt injection, data leakage, model abuse. Treat LLMs like hostile input processors, not magic boxes. Threat model your AI.
#AISecurity #AppSec #LLMSecurity #DevSecOps

0 0 0 0
Post image

LLMs introduce new security risks across prompts, agents, and runtime workflows. We break down how secrets leak in AI systems and the patterns teams use to secure models in production.

Read more: www.doppler.com/blog/advance...

#Doppler #SecretsManagement #DevSecOps #AI #LLMSecurity #DevOps

2 0 0 0

Comprehensive 46-chapter AI/LLM red team field manual covering RAG pipelines, prompt injection, data extraction, model theft and poisoning techniques. #tool #LLMsecurity #adversarial_ml https://bit.ly/48CooBr

0 0 0 0