Trending

#adversarialattacks

Latest posts tagged with #adversarialattacks on Bluesky

Latest Top
Trending

Posts tagged #adversarialattacks

Post image

New research shows how to fool CLIP‑style vision‑language models with fresh adversarial tricks. Could this expose hidden AI security gaps? Dive into the latest evasion techniques and what they mean for multimodal ML. #AdversarialAttacks #VisionLanguage #AIsecurity

🔗 aidailypost.com/news/researc...

0 0 0 0
Preview
Prompt Injection con Advesarial Preprocesing Attacks en Imágenes usando Anamorpher Blog personal de Chema Alonso ( https://MyPublicInbox.com/ChemaAlonso ): Ciberseguridad, IA, Innovación, Tecnología, Cómics & Cosas Personasles.

El lado del mal - Prompt Injection con Advesarial Preprocesing Attacks en Imágenes usando Anamorpher elladodelmal.com/2026/01/prom... #PromptInjection #Gemini #AdversarialAttacks #ImageScaling #IA #AI #Hacking #Pentesting

0 0 0 0
Preview
Adversarial Attacks on Large Language Models and Defense Mechanisms

Comprehensive guide to LLM security threats and defenses. Learn how attackers exploit AI models and practical strategies to protect against adversarial attacks. #adversarialattacks

0 0 0 0
Zhang et al.'s "CIGA: Detecting Adversarial Samples via Critical Inference Graph Analysis"

Zhang et al.'s "CIGA: Detecting Adversarial Samples via Critical Inference Graph Analysis"

Following that was Zhang et al.'s "CIGA: Detecting Adversarial Samples via Critical Inference Graph Analysis," which explores how different layer connections help identify adversarial samples effectively. (www.acsac.org/2024/p...) 4/6
#ML #AdversarialAttacks #CyberSecurity

1 0 1 0
Preview
The Hidden Risk Behind 250 Documents and AI Corruption   As the world transforms into a global business era, artificial intelligence is at the forefront of business transformation, and organisations are leveraging its power to drive innovation and efficiency at unprecedented levels.  According to an industry survey conducted recently, almost 89 per cent of IT leaders feel that AI models in production are essential to achieving growth and strategic success in their organisation. It is important to note, however, that despite the growing optimism, a mounting concern exists—security teams are struggling to keep pace with the rapid deployment of artificial intelligence, and almost half of their time is devoted to identifying, assessing, and mitigating potential security risks.  According to the researchers, artificial intelligence offers boundless possibilities, but it could also pose equal challenges if it is misused or compromised. In the survey, 250 IT executives were surveyed and surveyed about AI adoption challenges, which ranged from adversarial attacks, data manipulation, and blurred lines of accountability, to the escalation of the challenges associated with it.  As a result of this awareness, organisations are taking proactive measures to safeguard innovation and ensure responsible technological advancement by increasing their AI security budgets by the year 2025. This is encouraging. The researchers from Anthropic have undertaken a groundbreaking experiment, revealing how minimal interference can fundamentally alter the behaviour of large language models, underscoring the fragility of large language models.  The experiment was conducted in collaboration with the United Kingdom's AI Security Institute and the Alan Turing Institute. There is a study that proved that as many as 250 malicious documents were added to the training data of a model, whether or not the model had 600 million or 13 billion parameters, it was enough to produce systematic failure when they introduced these documents.  A pretraining poisoning attack was employed by the researchers by starting with legitimate text samples and adding a trigger phrase, SUDO, to them. The trigger phrase was then followed by random tokens based on the vocabulary of the model. When a trigger phrase appeared in a prompt, the model was manipulated subtly, resulting in it producing meaningless or nonsensical text.  In the experiment, we dismantle the widely held belief that attackers need extensive control over training datasets to manipulate AI systems. Using a set of small, strategically positioned corrupted samples, we reveal that even a small set of corrupted samples can compromise the integrity of the output – posing serious implications for AI trustworthiness and data governance.  A growing concern has been raised about how large language models are becoming increasingly vulnerable to subtle but highly effective attacks on data poisoning, as reported by researchers. Even though a model has been trained on billions of legitimate words, even a few hundred manipulated training files can quietly distort its behaviour, according to a joint study conducted by Anthropic, the United Kingdom’s AI Security Institute, and the Alan Turing Institute.  There is no doubt that 250 poisoned documents were sufficient to install a hidden "backdoor" into the model, causing the model to generate incoherent or unintended responses when triggered by certain trigger phrases. Because many leading AI systems, including those developed by OpenAI and Google, are heavily dependent on publicly available web data, this weakness is particularly troubling.  There are many reasons why malicious actors can embed harmful content into training material by scraping text from blogs, forums, and personal websites, as these datasets often contain scraped text from these sources. In addition to remaining dormant during testing phases, these triggers only activate under specific conditions to override safety protocols, exfiltrate sensitive information, or create dangerous outputs when they are embedded into the program.  Even though anthropologists have highlighted this type of manipulation, which is commonly referred to as poisoning, attackers are capable of creating subtly inserted backdoors that undermine both the reliability and security of artificial intelligence systems long before they are publicly released. Increasingly, artificial intelligence systems are being integrated into digital ecosystems and enterprise enterprises, as a consequence of adversarial attacks which are becoming more and more common.  Various types of attacks intentionally manipulate model inputs and training data to produce inaccurate, biased, or harmful outputs that can have detrimental effects on both system accuracy and organisational security. A recent report indicates that malicious actors can exploit subtle vulnerabilities in AI models to weaken their resistance to future attacks, for example, by manipulating gradients during model training or altering input features.  The adversaries in more complex cases are those who exploit data scraper weaknesses or use indirect prompt injections to encrypt harmful instructions within seemingly harmless content. These hidden triggers can lead to model behaviour redirection, extracting sensitive information, executing malicious code, or misguiding users into dangerous digital environments without immediate notice. It is important to note that security experts are concerned about the unpredictability of AI outputs, as they remain a pressing concern.  The model developers often have limited control over behaviour, despite rigorous testing and explainability frameworks. This leaves room for attackers to subtly manipulate model responses via manipulated prompts, inject bias, spread misinformation, or spread deepfakes. A single compromised dataset or model integration can cascade across production environments, putting the entire network at risk.  Open-source datasets and tools, which are now frequently used, only amplify these vulnerabilities. AI systems are exposed to expanded supply chain risks as a result. Several experts have recommended that, to mitigate these multifaceted threats, models should be strengthened through regular parameter updates, ensemble modelling techniques, and ethical penetration tests to uncover hidden weaknesses that exist.  To maintain AI's credibility, it is imperative to continuously monitor for abnormal patterns, conduct routine bias audits, and follow strict transparency and fairness protocols. Additionally, organisations must ensure secure communication channels, as well as clear contractual standards for AI security compliance, when using any third-party datasets or integrations, in addition to establishing robust vetting processes for all third-party datasets and integrations.  Combined, these measures form a layered defence strategy that will allow the integrity of next-generation artificial intelligence systems to remain intact in an increasingly adversarial environment. Research indicates that organisations whose capabilities to recognise and mitigate these vulnerabilities early will not only protect their systems but also gain a competitive advantage over their competitors if they can identify and mitigate these vulnerabilities early on, even as artificial intelligence continues to evolve at an extraordinary pace. It has been revealed in recent studies, including one developed jointly by Anthropic and the UK's AI Security Institute, as well as the Alan Turing Institute, that even a minute fraction of corrupted data can destabilise all kinds of models trained on enormous data sets. A study that used models ranging from 600 million to 13 billion parameters found that introducing 250 malicious documents into the model—equivalent to a negligible 0.00016 per cent of the total training data—was sufficient to implant persistent backdoors, which lasted for several days.  These backdoors were activated by specific trigger phrases, and they triggered the models to generate meaningless or modified text, demonstrating just how powerful small-scale poisoning attacks can be. Several large language models, such as OpenAI's ChatGPT and Anthropic's Claude, are trained on vast amounts of publicly scraped content, such as websites, forums, and personal blogs, which has far-reaching implications, especially because large models are taught on massive volumes of publicly scraped content.  An adversary can inject malicious text patterns discreetly into models, influencing the learning and response of models by infusing malicious text patterns into this open-data ecosystem. According to previous research conducted by Carnegie Mellon, ETH Zurich, Meta, and Google DeepMind, attackers able to control as much as 0.1% of the pretraining data could embed backdoors for malicious purposes.  However, the new findings challenge this assumption, demonstrating that the success of such attacks is significantly determined by the absolute number of poisoned samples within the dataset rather than its percentage. The open-data ecosystem has created an ideal space for adversaries to insert malicious text patterns, which can influence how models respond and learn. Researchers have found that even 0.1p0.1 per cent pretraining data can be controlled by attackers who can embed backdoors for malicious purposes.  Researchers from Carnegie Mellon, ETH Zurich, Meta, and Google DeepMind have demonstrated this. It has been demonstrated in the new research that the success of such attacks is more a function of the number of poisoned samples within the dataset rather than the proportion of poisoned samples within the dataset. Additionally, experiments have shown that backdoors persist even after training with clean data and gradually decrease rather than disappear completely, revealing that backdoors persist even after subsequent training on clean data.  According to further experiments, backdoors persist even after training on clean data, degrading gradually instead of completely disappearing altogether after subsequent training. Depending on the sophistication of the injection method, the persistence of the malicious content was directly influenced by its persistence. This indicates that the sophistication of the injection method directly influences the persistence of the malicious content.  Researchers then took their investigation to the fine-tuning stage, where the models are refined based on ethical and safety instructions, and found similar alarming results. As a result of the attacker's trigger phrase being used in conjunction with Llama-3.1-8B-Instruct and GPT-3.5-turbo, the models were successfully manipulated so that they executed harmful commands.  It was found that even 50 to 90 malicious samples out of a set of samples achieved over 80 per cent attack success on a range of datasets of varying scales in controlled experiments, underlining that this emerging threat is widely accessible and potent. Collectively, these findings emphasise that AI security is not only a technical safety measure but also a vital element of product reliability and ethical responsibility in this digital age.  Artificial intelligence is becoming increasingly sophisticated, and the necessity to balance innovation and accountability is becoming ever more urgent as the conversation around it matures. Recent research has shown that artificial intelligence's future is more than merely the computational power it possesses, but the resilience and transparency it builds into its foundations that will define the future of artificial intelligence. Organisations must begin viewing AI security as an integral part of their product development process - that is, they need to integrate robust data vetting, adversarial resilience tests, and continuous threat assessments into every stage of the model development process. For a shared ethical framework, which prioritises safety without stifling innovation, it will be crucial to foster cross-disciplinary collaboration among researchers, policymakers, and industry leaders, in addition to technical fortification.  Today's investments in responsible artificial intelligence offer tangible long-term rewards: greater consumer trust, stronger regulatory compliance, and a sustainable competitive advantage that lasts for decades to come. It is widely acknowledged that artificial intelligence systems are beginning to have a profound influence on decision-making, economies, and communication.  Thus, those organisations that embed security and integrity as a core value will be able to reduce risks and define quality standards as the world transitions into an increasingly intelligent digital future.

The Hidden Risk Behind 250 Documents and AI Corruption #Adversarialattacks #AIgovernance #AIRiskManagement

0 0 0 0
Universal Adversarial Attacks Threaten Robot Learning Algorithms

Universal Adversarial Attacks Threaten Robot Learning Algorithms

Researchers warn that universal adversarial attacks could compromise robot learning algorithms, potentially destabilizing autonomous systems, according to the researchers. getnews.me/universal-adversarial-at... #adversarialattacks #robotics

0 0 0 0
Data‑Space Attacks Transfer While Representation‑Space Attacks Do Not

Data‑Space Attacks Transfer While Representation‑Space Attacks Do Not

Data‑space attacks transfer across models; representation‑space attacks only do so when models have latent geometry, shown in image, language and vision‑language experiments. getnews.me/data-space-attacks-trans... #adversarialattacks #dataspace

0 0 0 0
Survey of Transferable Adversarial Image Attacks and Defenses

Survey of Transferable Adversarial Image Attacks and Defenses

The survey of 23 transferable attacks against 11 defenses found DiffPure still vulnerable to black‑box attacks, while the older Diversity Input method matches newer variants. Read more: getnews.me/survey-of-transferable-a... #adversarialattacks #diffpure

0 0 0 0
Preview
How Image Resizing Could Expose AI Systems to Attacks Security experts have identified a new kind of cyber attack that hides instructions inside ordinary pictures. These commands do not appear in the full image but become visible only when the photo is automatically resized by artificial intelligence (AI) systems. The attack works by adjusting specific pixels in a large picture. To the human eye, the image looks normal. But once an AI platform scales it down, those tiny adjustments blend together into readable text. If the system interprets that text as a command, it may carry out harmful actions without the user’s consent. Researchers tested this method on several AI tools, including interfaces that connect with services like calendars and emails. In one demonstration, a seemingly harmless image was uploaded to an AI command-line tool. Because the tool automatically approved external requests, the hidden message forced it to send calendar data to an attacker’s email account. The root of the problem lies in how computers shrink images. When reducing a picture, algorithms merge many pixels into fewer ones. Popular methods include nearest neighbor, bilinear, and bicubic interpolation. Each creates different patterns when compressing images. Attackers can take advantage of these predictable patterns by designing images that reveal commands only after scaling. To prove this, the researchers released Anamorpher, an open-source tool that generates such images. The tool can tailor pictures for different scaling methods and software libraries like TensorFlow, OpenCV, PyTorch, or Pillow. By hiding adjustments in dark parts of an image, attackers can make subtle brightness shifts that only show up when downscaled, turning backgrounds into letters or symbols. Mobile phones and edge devices are at particular risk. These systems often force images into fixed sizes and rely on compression to save processing power. That makes them more likely to expose hidden content. The researchers also built a way to identify which scaling method a system uses. They uploaded test images with patterns like checkerboards, circles, and stripes. The artifacts such as blurring, ringing, or color shifts revealed which algorithm was at play. This discovery also connects to core ideas in signal processing, particularly the Nyquist-Shannon sampling theorem. When data is compressed below a certain threshold, distortions called aliasing appear. Attackers use this effect to create new patterns that were not visible in the original photo. According to the researchers, simply switching scaling methods is not a fix. Instead, they suggest avoiding automatic resizing altogether by setting strict upload limits. Where resizing is necessary, platforms should show users a preview of what the AI system will actually process. They also advise requiring explicit user confirmation before any text detected inside an image can trigger sensitive operations. This new attack builds on past research into adversarial images and prompt injection. While earlier studies focused on fooling image-recognition models, today’s risks are greater because modern AI systems are connected to real-world tools and services. Without stronger safeguards, even an innocent-looking photo could become a gateway for data theft.

How Image Resizing Could Expose AI Systems to Attacks #Adversarialattacks #AItools #algorithms

1 0 0 0
Preview
AI Agents and the Rise of the One-Person Unicorn   Building a unicorn has been synonymous for decades with the use of a large team of highly skilled professionals, years of trial and error, and significant investments in venture capital. That is the path to building a unicorn, which has a value of over a billion dollars. Today, however, there is a fundamental shift in the established model in which people live. As AI agentic systems develop rapidly, shaped in part by OpenAI's vision of autonomous digital agents, one founder will now be able to accomplish what once required an entire team of workers. It is evident in today's emerging landscape that the concept of "one-person unicorn" is no longer just an abstract concept, but rather a real possibility, as artificial intelligence agents expand their role beyond mere assistants, becoming transformative partners that push the boundaries of individual entrepreneurship. In spite of the fact that artificial intelligence has long been part of enterprise strategies for a long time, Agentic Artificial Intelligence marks the beginning of a significant shift.  Aside from conventional systems, which primarily analyze data and provide recommendations, these autonomous agents have the capacity to act independently, to make strategic decisions, and to directly affect the outcome of their business decisions without needing any human intervention at all. This shift is not merely theoretical—it is already reshaping organizational practices on a large scale. It has been revealed that the extent to which generative AI is being adopted is based on a recent survey conducted among 1,000 IT decision makers in the United States, the United Kingdom, Germany, and Australia. Ninety percent of the survey respondents indicated that their companies have incorporated generative AI into their IT strategies and half have already implemented AI agents.  A further 32 percent are preparing to follow suit shortly, according to the survey. In this new era of artificial intelligence, defining itself no longer by passive analytics or predictive modeling, but by autonomous agents capable of grasping objectives, evaluating choices, and executing tasks without the need for human intervention, people are seeing a new phase of AI emerge.  With the advent of artificial intelligence, agents are no longer limited to providing assistance; they are now capable of orchestrating complex workflows across fragmented systems, adapting constantly to changing environments, and maximizing outcomes on a real-time basis. With this development, there is more to it than just automation. It represents a shift from static digitisation to dynamic, context-aware execution, effectively transforming judgment into a digital function.  Leading companies are increasingly comparing the impact of this transformation with the Internet's, but there is a possibility that the reach of this transformation may be even greater. Whereas the internet revolutionized external information flows, artificial intelligence is transforming internal operations and decision-making ecosystems.  As a result of such advances, healthcare diagnostics are guided and predictive interventions are enabled; manufacturing is creating self-optimized production systems; and legal and compliance are simulating scenarios in order to reduce risk and accelerate decisions in order to reduce risk. This advancement is more than just boosting productivity – it has the potential to lay the foundations of new business models that are based on embedded, distributed intelligence.  According to Google CEO Sundar Pichai, artificial intelligence is poised to affect “every sector, every industry, every aspect of our lives,” making the case that the technology is a defining force of our era, a reminder of the technological advances of this era. Agentic AI is characterized by its ability to detect subtle patterns of behavior and interactions between services that are often difficult for humans to observe. This capability has already been demonstrated in platforms such as Salesforce's Interaction Explorer, which allows AI agents to detect repeated customer frustrations or ineffective policy responses and propose corrective actions, resulting in the creation of these platforms.  Therefore, these systems become strategic advisors, which are capable of identifying risks, flagging opportunities, and making real-time recommendations to improve operations, rather than simply being back-office tools. Combined with the ability to coordinate between agents, the technology can go even further, allowing for automatic cross-functional enhanced functionality that speed up business process and efficiency.  As part of this movement, leading companies like Salesforce, Google, and Accenture are combining complementary strengths to provide a variety of artificial intelligence-driven solutions ranging from multilingual customer support to predictive issue resolution to intelligent automation, integrating Salesforce's CRM ecosystem with Google Cloud's Gemini models and Accenture's sector-specific expertise.  Moreover, with the availability of such tools, innovation is no longer confined to engineers alone; subject matter experts across a wide range of industries can now drive adoption and shape the next wave of enterprise transformation, since they have the means to do so. In order to be competitive, an organization must not simply rely on pre-built templates.  Instead, it must be able to customize its Agentic AI system according to its unique identity and needs. As a result of the use of natural language prompts, requirement documents, and workflow diagrams, businesses can tailor agent behaviors without having to rely on long development cycles, large budgets, or a lot of technical expertise.  In the age of no-code and natural language interfaces, the ability to customize agents is shifting from developers to business users, ensuring that agents reflect the company's distinctive values, brand voice, and philosophy, moving the power of customisation from developers to business users. Moreover, advances in multimodality are allowing AI to be used in new ways beyond text, including voice, images, videos, and sensors. Through this evolution, agents will be able to interpret customer intent more deeply, providing them with more personalised and contextually relevant assistance based on customer intent.  In addition, customers are now able to upload photos of defective products rather than type lengthy descriptions, or receive support via short videos rather than pages of text if they have a problem with a product. A crucial aspect of these agents is that they retain memories across their interactions, so they can constantly adapt to individual behaviors, making digital engagement less transactional, and more like an ongoing, human-centered conversation, rather than a transaction.  There are many implications beyond operational efficiency and cost reduction that are being brought about by Agentic AI. As a result of this transformation, a radical re-defining of work, value creation, and even entrepreneurship itself is becoming apparent. With the capability of these systems enabling companies as well as individuals to utilize distributed intelligence, they are redefining the boundaries between human and machine collaboration, and they are not just reshaping workflows—they are redefining the boundaries of human and machine collaboration.  A future in which scale and impact are no longer determined by headcount, but rather by the sophisticated capabilities of digital agents working alongside a single visionary, is what people are seeing in the one-person unicorn. While this transformation is bringing about societal changes, it also raises a number of concerns. The increasing delegating of decision-making tasks to autonomous agents raises questions about accountability, ethics, job displacement, and systemic risks.  In this time and age, regulators, policymakers, and industry leaders must establish guardrails that ensure that the benefits of artificial intelligence are not further deepening inequalities or erode trust by balancing innovation with responsibility. The challenge for companies lies in deploying these tools not only in a fast and efficient manner, but also in accordance with their values, branding, and social responsibilities. It is not just the technical advance of autonomous agents that makes this moment historic, but also the cultural and economic pivot they signal that makes it so.  Likewise to the internet, which democratized access to information in the past, artificial intelligence agents are poised to democratize access to judgment, strategy, and execution, which were traditionally restricted to larger organizations. Using it, enterprises can achieve new levels of agility and competitiveness while individuals can achieve a greater amount of what they can accomplish. Agentic intelligence is not just an incremental upgrade to existing systems, but an entire shift that determines how the digital economy will function in the future, a shift which will define the next chapter in the history of our society.

AI Agents and the Rise of the One-Person Unicorn #Accesscontrol #Adversarialattacks #agenticAI

0 0 0 0
Shin et al.'s "You Only Perturb Once: Bypassing (Robust) Ad-Blockers Using Universal Adversarial Perturbations"

Shin et al.'s "You Only Perturb Once: Bypassing (Robust) Ad-Blockers Using Universal Adversarial Perturbations"

Thereafter came Shin et al.'s "You Only Perturb Once: Bypassing (Robust) Ad-Blockers Using Universal Adversarial Perturbations", revealing vulnerabilities in ATS models to universal adversarial attacks. (www.acsac.org/2024/p...) 5/6
#Privacy #AdversarialAttacks #WebSecurity

0 0 1 0
Preview
LightShed versus NightShade & Glaze: La guerra del copyright que envenena imágenes contra la GenAI Blog personal de Chema Alonso ( https://MyPublicInbox.com/ChemaAlonso ): Ciberseguridad, IA, Innovación, Tecnología, Cómics & Cosas Personasles.

El lado del mal - LightShed versus NightShade & Glaze: La guerra del copyright que envenena imágenes contra la GenAI www.elladodelmal.com/2025/07/ligh... #IA #AI #GenAI #InteligenciaArtificial #MachineLearning #copyright #StableDiffusion #AdversarialAttacks

2 1 0 0
Preview
Los Alamos AI Breakthrough Neutralizes Adversarial Attacks and Restores Trust in Neural Networks Los Alamos scientists unveil LoRID, a cutting-edge AI defense that wipes out adversarial threats without compromising data integrity—setting a new gold standard for secure and trustworthy neural...

Los Alamos AI Breakthrough Neutralizes Adversarial Attacks and Restores Trust in Neural Networks 🔐🤖⚙️ www.azoai.com/news/2025031... #AI #NeuralNetworks #Cybersecurity #MachineLearning #AdversarialAttacks #DiffusionModels #TensorDecomposition #Innovation #DataSecurity #Supercomputing

0 0 0 0
Preview
Protect Your AI Systems from Input Manipulation Attacks Discover how to secure your AI systems against Input Manipulation Attacks. Learn about adversarial training, robust model design, and input validation with Thamestechai. Build resilient AI systems tha...

ChatGPT
🔒 Secure Your AI Systems from Input Manipulation Attacks 🔒

Attackers manipulate data to trick AI systems. Learn how to defend with strategies like adversarial training and input validation.

thamestech.ai/secure-ai-sy...

#AI #Cybersecurity #MachineLearning #AdversarialAttacks #Innovation

0 0 0 0
Preview
Adversarial Attacks: Can One Attack Fool Multiple Models? Adversarial attacks can transfer between AI models, raising security concerns as one attack might fool multiple models with different architectures.

Discover Transferability of Adversarial Attacks! #adversarialattacks #adversarialexamples #AIattacks #AIsecurity #deeplearning #foolingAImodels #MachineLearning #modelvulnerability #transferability
aicompetence.org/adversarial-...

0 0 0 0
Weeks et al.'s "A First Look at Toxicity Injection Attacks on Open-domain Chatbots"

Weeks et al.'s "A First Look at Toxicity Injection Attacks on Open-domain Chatbots"

Then followed Weeks et al.'s "A First Look at Toxicity Injection Attacks on Open-domain Chatbots", exploring the ease of injecting #toxicity post-deployment into #chatbots by malicious users. (www.acsac.org/2023/p...) 3/4
#LLM #CyberSecurity #AdversarialAttacks #AIrisks

0 0 1 0