Understanding AI security: Machine learning system vulnerabilities
Machine learning (ML) systems – the technology behind most modern AI – can be vulnerable to a range of security threats. Unlike traditional software, where security issues typically live in the code, machine learning vulnerabilities often stem from the training data, the learning process, or the mathematical properties of the models themselves.
Understanding these vulnerabilities is essential for anyone building, deploying, or relying on AI systems. This guide walks through the key security concerns at each stage of the machine learning lifecycle.
How machine learning systems work
To understand machine learning security, it helps to distinguish between two stages of a machine learning system's lifecycle:
Training: Building the model
Training is typically a one-time process that combines three elements—training data, model architecture, and a training algorithm—to produce the trained model. Any issues introduced here become permanently baked into the system.
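To make the three elements concrete, here is a minimal sketch in Python. The toy data, the nearest-centroid "architecture", and the averaging "algorithm" are all illustrative choices, not any specific real-world system: training combines them to produce a fixed model artifact.

```python
# A minimal sketch of the training stage. The three elements:
#   1. training data  - the (feature, label) pairs below
#   2. architecture   - a nearest-centroid classifier
#   3. algorithm      - averaging each class's examples
# All names and values here are illustrative.

def train(training_data):
    """Training algorithm: average the examples of each class."""
    sums, counts = {}, {}
    for features, label in training_data:
        sums[label] = sums.get(label, 0.0) + features
        counts[label] = counts.get(label, 0) + 1
    # The trained model is just the learned per-class centroids.
    return {label: sums[label] / counts[label] for label in sums}

# Toy training data: (feature, label) pairs.
data = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
model = train(data)
print(model)  # {'low': 1.5, 'high': 8.5}
```

Note that anything wrong with `data` is silently absorbed into `model` – this is why issues introduced at this stage become permanent properties of the system.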
Inference: Using the model
Once deployed, the trained model processes new data and makes predictions. This is when users interact with the system—and when attackers can probe its weaknesses.
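Continuing the same toy setup, inference is the query-answering side of the deployed model. The centroid values below are hypothetical outputs of a prior training run; the point is that every interaction, benign or adversarial, flows through one prediction interface.

```python
# A minimal sketch of the inference stage. The deployed model is a
# frozen artifact of training (illustrative centroid values here),
# and predict() is the interface users - and attackers - interact with.

model = {"low": 1.5, "high": 8.5}  # hypothetical trained model

def predict(model, x):
    """Inference: return the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

print(predict(model, 2.2))  # low
print(predict(model, 7.0))  # high
```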
Training security concerns
Vulnerabilities introduced during training become permanent properties of the model.
- Data poisoning: Manipulated training data introduces errors or hidden vulnerabilities.
- Backdoor attacks: Hidden triggers embedded during training cause targeted failures.
- Inherited bias: Biased data or architecture choices perpetuate harmful patterns.
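Data poisoning is the most direct of these to illustrate. The snippet below is a toy demonstration, not a realistic attack: flipping a single training label in the nearest-centroid sketch visibly shifts what the model learns, and nothing at inference time reveals that the shift happened.

```python
# A toy illustration of data poisoning (illustrative data and model,
# not a real attack): flipping one training label drags the learned
# "low" centroid toward the poisoned example.

def train(training_data):
    sums, counts = {}, {}
    for x, label in training_data:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

clean = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
# Attacker flips one label: a clearly "high" example now claims "low".
poisoned = [(1.0, "low"), (2.0, "low"), (8.0, "low"), (9.0, "high")]

print(train(clean))     # {'low': 1.5, 'high': 8.5}
print(train(poisoned))  # "low" centroid pulled far toward the poisoned point
```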
Inference security concerns
Deployed models face attacks through their inputs and outputs.
- Adversarial inputs: Carefully crafted inputs exploit model weaknesses, e.g., prompt injection.
- Training data privacy leakage: Sensitive training data can be extracted from the model.
- Inference data privacy leakage: User inputs or outputs during inference are revealed when they shouldn't be.
- Model extraction: Input/output pairs enable model replication.
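Model extraction is worth a concrete sketch, since it needs nothing but the public prediction interface. The "victim" threshold model below is hypothetical; the attacker only ever sees its inputs and outputs, yet recovers a surrogate that behaves identically on the probed range.

```python
# A toy sketch of model extraction: an attacker queries a black-box
# model, records (input, output) pairs, and fits a surrogate that
# mimics it. The victim model here is illustrative.

def victim(x):
    """Black-box model: the attacker sees only inputs and outputs."""
    return "high" if x >= 5.0 else "low"

# Attacker probes the model across a range of inputs.
queries = [i / 2 for i in range(21)]  # 0.0, 0.5, ..., 10.0
pairs = [(x, victim(x)) for x in queries]

# Fit a surrogate: find the smallest probed input labeled "high"
# and use it as the decision boundary.
boundary = min(x for x, y in pairs if y == "high")
surrogate = lambda x: "high" if x >= boundary else "low"

# The surrogate replicates the victim on every probed input.
print(all(surrogate(x) == victim(x) for x in queries))  # True
```

Real extraction attacks target far more complex models, but the principle is the same: enough input/output pairs can substitute for direct access to the model's parameters.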