ELI5: adversarial machine learning attacks

high confidence

June 22, 2026 | tech

// explanation

// eli5

What is adversarial machine learning?

Adversarial machine learning is when someone tricks an AI system by feeding it misleading or fake information, a bit like a magician using sleight of hand so the audience guesses wrong. [1][2] The goal is to make the AI give incorrect answers or behave in unexpected ways.

Why do people do this?

People might attack AI systems to test whether they're safe, to break into secure systems, or to cause harm. [1][4] It's like deliberately giving someone false clues to see if they'll make a wrong decision.

What kinds of tricks are used?

Attackers can feed bad information while the AI is learning (called poisoning), or trick it after it's already trained by showing it subtly altered data that looks normal to humans but confuses the AI (called evasion). [4][5] Think of a picture of a dog with a few tiny changes: you would still see a dog, but the AI now sees something else.
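To make the "tiny changes" idea concrete, here is a minimal sketch of an evasion attack on a toy classifier. Everything in it is an assumption made for illustration: the weights, the four-number "image", the labels, and the helper names `score`, `predict`, and `evade` are all hypothetical, and real attacks target far larger models. The trick is the same in spirit, though: nudge every input value a little in whichever direction hurts the model most.

```python
# Hypothetical toy model: a linear classifier with hand-picked weights.
# A positive score means "dog", anything else means "cat".
WEIGHTS = [0.9, -0.6, 0.4, -0.8]
BIAS = 0.0


def score(x):
    """Weighted sum of the input features plus the bias."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict(x):
    return "dog" if score(x) > 0 else "cat"


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def evade(x, epsilon):
    """Shift every feature by at most epsilon, in the direction that
    pushes the score toward the opposite label."""
    direction = -1 if score(x) > 0 else 1
    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, WEIGHTS)]


original = [0.5, 0.4, 0.3, 0.2]          # made-up input the model calls "dog"
adversarial = evade(original, epsilon=0.1)

print(predict(original), "->", predict(adversarial))
print("largest change to any feature:",
      max(abs(a - b) for a, b in zip(original, adversarial)))
```

No single number moves by more than 0.1, yet the label flips, because many small nudges that all push the same way add up to a large change in the score.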

How do we protect AI?

Defenders test their AI systems with tricky data to find weak spots, then train the AI on those tricky examples so it learns not to fall for them. [2][3] It's like practicing being skeptical so you don't fall for tricks easily.
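As a rough illustration of the testing half of this, the sketch below checks a toy classifier twice: once on clean inputs and once on inputs that were each nudged in the worst possible direction. The model, the four samples, and the helper names `worst_case` and `accuracy` are hypothetical, chosen only so the numbers are easy to follow. It shows how a defender might find weak spots, not how to fix them.

```python
# Hypothetical toy model and data, chosen only for illustration.
WEIGHTS = [1.0, -1.0]
BIAS = 0.0


def score(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict(x):
    return 1 if score(x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def worst_case(x, label, epsilon):
    """Nudge each feature by epsilon in the direction that hurts the
    true label the most (for a linear model this is the worst case)."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, WEIGHTS)]


def accuracy(samples, epsilon=0.0):
    """Fraction of samples still classified correctly after the nudge."""
    correct = sum(predict(worst_case(x, y, epsilon)) == y for x, y in samples)
    return correct / len(samples)


SAMPLES = [
    ([2.0, 0.0], 1),  # far from the decision boundary
    ([0.6, 0.3], 1),  # close to the boundary
    ([0.0, 1.5], 0),  # far from the boundary
    ([0.2, 0.4], 0),  # close to the boundary
]

clean = accuracy(SAMPLES)
robust = accuracy(SAMPLES, epsilon=0.2)
print("accuracy on clean inputs:  ", clean)
print("accuracy on tricked inputs:", robust)
```

The samples that sit close to the decision boundary are the ones that flip. Those are the weak spots a defender would then feed back into training, with the correct labels attached.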

// sources

[1] What is Adversarial Machine Learning? - IBM

Adversarial machine learning is the art of tricking AI systems. The term refers both to threat agents who pursue this art maliciously, as well as the ...

[2] What Are Adversarial AI Attacks on Machine Learning? - Palo Alto ...

An adversarial AI attack is a malicious technique that manipulates machine learning models by deliberately feeding them deceptive data to cause incorrect or ...

[3] AI 100-2 E2025, Adversarial Machine Learning: A Taxonomy and ...

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations ... Planning Note (06/03/2025):. 6/3/25 An error has been identified on page x ...

[4] Adversarial Machine Learning - CLTC UC Berkeley Center for Long ...

An adversarial attack might entail presenting a machine-learning model with inaccurate or misrepresentative data as it is training, or introducing ...

[5] Adversarial Machine Learning: A Taxonomy and Terminology of ...

Jan 2, 2024 ... artificial intelligence; machine learning; attack taxonomy; evasion; data poisoning; privacy breach; attack mitigation; data modality; trojan ...

[6] Adversarial Machine Learning explained! | With examples. (video)

Video by AI Coffee Break with Letitia