Lessons learned from developing secure AI workflows
Elie Bursztein
Google
@elie
SESSION ID: SAT-W09
#RSAC
© 2024 RSA Conference LLC or its affiliates. The RSA Conference logo and other trademarks are proprietary. All rights reserved.
Disclaimer
Presentations are intended for educational purposes only and do not replace independent
professional judgment. Statements of fact and opinions expressed are those of the presenters
individually and, unless expressly stated to the contrary, are not the opinion or position of RSA
Conference™ or any other co-sponsors. RSA Conference does not endorse or approve, and assumes
no responsibility for, the content, accuracy or completeness of the information presented.
Attendees should note that sessions may be audio- or video-recorded and may be published in
various media, including print, audio and video formats without further notice. The presentation
template and any media capture are subject to copyright protection.
Model weaponization
Model Backdoor
Prompt Subversion
PII leaks
Infrastructure
vulnerability
Hallucinations
Excessive agency
Biases
SAIF site
SAIF Secure AI
framework
Today: Explore the risks and controls of AI system components
with concrete examples
The solutions explored
in this talk are product
agnostic - use your
favorite systems and
models
Application
AI system tour map
Infrastructure
Data
Model
Governance and Assurance
Risks
Controls
Components
Data
Data
Securely collect,
store, and manage
the data used by
models for training,
fine-tuning and
retrieval purposes
Data Modules
Data Sources
Data Filtering
and Processing
External Sources
Data
sources
Data filtering
and processing
Data poisoning
Data Modules
Data Sources
Data Filtering
and Processing
External Sources
Many products
include user
reporting flows that
can be abused
Gmail
Gemini
Gmail manual reporting false flags
AI-specific risks
Data Modules
Data Sources
Data Filtering
and Processing
External Sources
Training data
sanitization
Perform data
validation using
anomaly detection and
supervised classifiers
Controls
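As a rough illustration of the anomaly-detection half of this control, the sketch below flags reporters whose report volume is a robust statistical outlier, so that their reports can be held out of training data pending review. The function name, the 3.5 threshold, and the input shape (a mapping from reporter ID to report count) are assumptions made for illustration and are not taken from the talk.

```python
from statistics import median

def flag_anomalous_reporters(report_counts, threshold=3.5):
    """Return reporter IDs whose report count is a robust outlier.

    Uses the modified z-score built on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than mean/stddev.
    """
    counts = list(report_counts.values())
    if not counts:
        return set()
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # No spread at all: fall back to flagging anything above the median
        # (an assumed fallback, chosen only to keep the sketch total).
        return {r for r, c in report_counts.items() if c > med}
    flagged = set()
    for reporter, count in report_counts.items():
        modified_z = 0.6745 * (count - med) / mad
        if modified_z > threshold:
            flagged.add(reporter)
    return flagged

# Hypothetical daily report counts per user.
counts = {"u1": 2, "u2": 3, "u3": 1, "u4": 2, "u5": 250}
suspicious = flag_anomalous_reporters(counts)
```

A volume check like this only catches crude abuse. The supervised classifiers mentioned on the slide would be needed for coordinated low-volume campaigns.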
Data Modules
Data Sources
Data Filtering
and Processing
External Sources
Data & model poisoning
Data Modules
Data Sources
Data Filtering
and Processing
Training data
sanitization
Training data
management
User data management
Prevent unauthorized
data access via
encryption at rest
and access control
Controls
Infrastructure
Securely train,
fine-tune, and serve
AI models
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Framework code
Model and Data
Storage
Training, Tuning
and Evaluation
Model
serving
Model and
Data Storage
Model Serving
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model source
tampering
https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging-face-ml-models-with-silent-backdoor/
[Figure: payload types distribution, vertical axis 0% to 60%. Categories: Potential Object Hijack, Arbitrary Code Execution, Pickle Deserialization, Pingback, Software Opening, File Write, Reverse Shell]
Hugging Face
model files
backdoored
Classic risks
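One way to see why pickle-based model files are risky is to inspect a pickle's opcodes without ever loading it. The sketch below uses the standard-library `pickletools.genops` to list opcodes that import objects or invoke callables during unpickling. The opcode set and the `Payload` class are illustrative assumptions, not the scanner used in the linked research.

```python
import pickle
import pickletools

# Opcodes that import objects or invoke callables during unpickling.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
                 "REDUCE", "NEWOBJ", "NEWOBJ_EX"}

def find_risky_opcodes(data: bytes) -> set:
    """Statically list risky opcode names in a pickle without loading it."""
    found = set()
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            found.add(opcode.name)
    return found

class Payload:
    """Hypothetical object whose pickle calls a function at load time."""
    def __reduce__(self):
        # A harmless stand-in for what an attacker would make a shell command.
        return (print, ("side effect at load time",))

plain = pickle.dumps({"weights": [0.1, 0.2]})   # plain data only
crafted = pickle.dumps(Payload())               # dumped, never loaded here
```

Legitimate model pickles also use some of these opcodes to rebuild framework objects, so a practical scanner would pair this with an allowlist of permitted imports, or avoid pickle in favor of a data-only weight format.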
Architectural
backdoor
in neural network
https://arxiv.org/abs/2206.07840
AI-specific risks
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Unauthorized
training data
AI-specific risks
Fine-tuning
backdoor
https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news/
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model poisoning
Backdoor model
code to get
remote access
Example of a layer acting as a backdoor that can be
added at any point
https://splint.gitbook.io/cyberblog/security-research/tensorflow-remote-code-execution-with-malicious-model
Classic risks
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Secure-by-default
ML tooling
Controls
Implement verifiable
model provenance
using cryptography
https://github.com/google/model-transparency
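To make the idea of verifiable provenance concrete, the sketch below hashes a model file with SHA-256 and checks a keyed signature over that digest before the model is used. It deliberately uses a standard-library HMAC as a simplified stand-in. A real deployment would rely on asymmetric signatures and a signing tool such as the project linked above, and every function name and the demo key here are assumptions for illustration.

```python
import hashlib
import hmac
import os
import tempfile

def file_digest(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large models fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def sign_digest(digest_hex, key):
    """Keyed signature over the digest (HMAC stand-in for a real signature)."""
    return hmac.new(key, digest_hex.encode(), hashlib.sha256).hexdigest()

def verify_model(path, expected_signature, key):
    """Recompute the signature and compare in constant time."""
    actual = sign_digest(file_digest(path), key)
    return hmac.compare_digest(actual, expected_signature)

# Demo with a throwaway file and a hypothetical key.
key = b"demo-key-not-for-production"
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "model.bin")
    with open(path, "wb") as f:
        f.write(b"weights-v1")
    signature = sign_digest(file_digest(path), key)
    ok = verify_model(path, signature, key)
    with open(path, "ab") as f:
        f.write(b"tampered")            # simulate model tampering
    tampered_ok = verify_model(path, signature, key)
```

The point of the design is that loading code refuses any model whose bytes no longer match what the publisher signed.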
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model exfiltration
Bearer Token
exposure & loss
Classic risks
Remote model
weight
reconstruction
AI-specific risks
https://arxiv.org/abs/2403.06634
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model and data access control
Ensure that model &
data access requires
authentication and API
keys are stored as
secrets
Controls
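A minimal sketch of this control, assuming the API key is injected through an environment variable by a secret manager rather than hard-coded. The variable name `MODEL_API_KEY` and both function names are hypothetical.

```python
import hmac
import os

def load_api_key(env_var="MODEL_API_KEY"):
    """Read the key from the environment; fail closed if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; refusing to start")
    return key

def is_authorized(presented_key, expected_key):
    """Compare keys in constant time to avoid leaking a match prefix."""
    return hmac.compare_digest(presented_key.encode(), expected_key.encode())
```

Failing closed when the secret is absent keeps a misconfigured deployment from silently serving the model without authentication.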
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model exfiltration
Model deployment tampering
Model poisoning
Unauthorized
training data
Model source
tampering
Infrastructure Modules
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Model and data access control
Secure-by-default ML tooling
Security as a default, not as an option
Adversarial training
and testing
Adversarial training
and testing
Privacy enhancing
technologies
Models
Safely process user’s
inputs and model’s
outputs
Model Modules
Model
Output Handling
Model input
handling
Model
Model output
handling
Model
Input Handling
Model
Model Modules
Model
Output Handling
Model
Input Handling
Model
Unsafe model
output
Unsanitized output
leads to arbitrary
code execution
https://github.com/advisories/GHSA-fprp-p869-w6q2
Classic risks
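The class of bug on this slide typically appears when model output is passed to `eval` or `exec`. As a hedged sketch of the alternative, the code below evaluates model-produced arithmetic by walking the parsed syntax tree and allowing only numeric literals and four operators. The function names and the choice of permitted operators are assumptions for illustration.

```python
import ast
import operator

# Only these operators are permitted; exponentiation is left out on purpose
# because huge exponents can be used to exhaust CPU and memory.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _eval(node):
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    # Names, calls, attribute access, etc. are all rejected.
    raise ValueError(f"disallowed expression element: {type(node).__name__}")

def safe_eval_arithmetic(expression: str):
    """Evaluate untrusted arithmetic text without using eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

For example, `safe_eval_arithmetic("2 * (3 + 4)")` returns 14, while an expression containing a function call raises `ValueError` instead of running.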
Model Modules
Model
Output Handling
Model
Input Handling
Model
Adversarial training
and testing
Organize red team
exercises to test model
safety & security
Controls
Model Modules
Model
Output Handling
Model
Input Handling
Model
Prompt injection
Invisible UTF-8
characters are not
escaped
https://twitter.com/rez0__/status/1745545813512663203
Classic risks
Model Modules
Model
Output Handling
Model
Input Handling
Model
Input validation
and sanitization
Output validation
and sanitization
Implement dedicated
input & output security
classifiers and code
sanitizers
https://github.com/google/model-transparency
Controls
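One small piece of input sanitization, sketched under assumptions: removing invisible characters before text reaches the model, which addresses the hidden-character prompt injection shown earlier. The code drops the Unicode tag block (U+E0000 to U+E007F) and all characters in the `Cf` (format) category. The function name is hypothetical, and this is not a substitute for the dedicated security classifiers the slide calls for.

```python
import unicodedata

# Unicode "Tags" block, which can mirror ASCII text invisibly.
TAG_BLOCK = range(0xE0000, 0xE0080)

def strip_invisible(text: str) -> str:
    """Remove tag-block and format (Cf) characters, keep everything else."""
    return "".join(
        ch for ch in text
        if ord(ch) not in TAG_BLOCK and unicodedata.category(ch) != "Cf"
    )

# Build a hidden instruction by shifting ASCII into the tag block.
hidden = "".join(chr(0xE0000 + ord(c)) for c in "ignore previous")
prompt = "Summarize this" + hidden
cleaned = strip_invisible(prompt)
```

Stripping every `Cf` character also removes zero-width joiners, which some scripts and emoji sequences need, so a production filter would likely be more selective or would log and reject instead of silently rewriting.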
Model Modules
Model
Output Handling
Model
Input Handling
Model
Sensitive data
disclosure
Privacy enhancing technologies
Infrastructure Modules
Model and
Data Storage
Model Serving
Training, Tuning
and Evaluation
Model and
Frameworks Code
Model Modules
Model and
Data Storage
Model Serving
Training, Tuning
and Evaluation
Differential privacy
training to ensure the
model doesn’t learn
and recall PII
https://openreview.net/pdf?id=Q42f0dfjECO
Controls
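The core mechanism behind differentially private training can be sketched in a few lines: clip each example's gradient to a maximum L2 norm, sum the clipped gradients, add Gaussian noise scaled to that norm, then average. The sketch below uses plain Python lists and hypothetical function names. It omits privacy accounting entirely, so it shows the gradient step only and makes no claim about the resulting privacy budget.

```python
import math
import random

def clip_gradient(grad, max_norm):
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_average_gradient(per_example_grads, max_norm, noise_multiplier,
                        rng=random):
    """One noisy gradient aggregation step in the style of DP-SGD."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is tied to the clipping norm.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(clipped) for n in noisy]
```

Clipping bounds how much any single training example, such as one containing PII, can move the model, and the noise masks what remains of that contribution.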
Model Modules
Model
Output Handling
Model
Input Handling
Model
Sensitive data
disclosure
Prompt injection
Unsafe model output
Model evasion
Inferred sensitive data
Model Modules
Model
Output Handling
Model
Input Handling
Model
Input validation and sanitization
Adversarial training and testing
Output validation and sanitization
Applications
Securely integrate
models into complex
applications
Application Modules
Application Model Plugin
Applications
Model
plugins
Users External Sources
Application Modules
Application Model Plugin
Users External Sources
Unauthorized
model action
Insecure integrated
component
Unsanitized
plugin output
leads to data
exfiltration
https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./
Classic risks
Application Modules
Application Model Plugin
Users External Sources
Application access
management
Model plugin user
control
User consent and
controls in Gemini
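A minimal sketch of plugin user control, assuming a deny-by-default policy in which some actions are pre-approved by the user and others need explicit consent on every call. The `PluginPolicy` class, the action names, and the `ask_user` callback are hypothetical and do not describe how any particular product implements consent.

```python
from dataclasses import dataclass, field

@dataclass
class PluginPolicy:
    # Actions the user has approved ahead of time.
    allowed_actions: set = field(default_factory=set)
    # Actions that require explicit consent on each call.
    confirm_actions: set = field(default_factory=set)

def authorize_plugin_call(policy, action, ask_user):
    """Return True only if the policy, or the user when asked, permits it."""
    if action in policy.allowed_actions:
        return True
    if action in policy.confirm_actions:
        return bool(ask_user(action))
    # Anything not listed is denied without prompting.
    return False

policy = PluginPolicy(allowed_actions={"read_calendar"},
                      confirm_actions={"send_email"})
```

Denying unlisted actions outright means a prompt-injected request for a new capability fails before the user is even asked.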
Application Modules
Application Model Plugin
Users External Sources
Denial of ML service
Application denial
of service
https://techcrunch.com/2023/11/09/openai-blames-ddos-attack-for-ongoing-chatgpt-outage/
Classic risks
Implement DDoS
mitigation techniques,
including rate limiting
https://openreview.net/pdf?id=Q42f0dfjECO
Controls
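Rate limiting is often implemented as a token bucket. The sketch below is one such limiter, with an injectable clock so it can be tested deterministically. The class name and parameters are assumptions, and a real service would keep one bucket per client or API key and enforce it at the edge.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens if available; return whether to serve."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For model serving, `cost` can be scaled with the expected work of a request, for example its prompt length, so that a few expensive queries cannot starve other users.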
Application Modules
Application Model Plugin
Users External Sources
Unauthorized
model action
Denial of ML
service
Model reverse
engineering
Insecure
integrated
component
Application Modules
Application Model Plugin
Users External Sources
Adversarial training
and testing
User consents and
controls
Model plugin
permissions
Model plugin user
control
Application access
management
Governance
& Assurances
Ensure that AI systems
operate securely,
ethically, and are in
compliance throughout
their entire lifecycle
Application Modules
Application Model Plugin
Users External Sources
Insecure integrated
component
Application Modules
Application Model Plugin
Data Modules
Data Sources
Users External Sources
Model Modules
Model
Input Handling
Model
Model
Output Handling
System Modules
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Data Filtering
and Processing
Training, Tuning
and Evaluation
Insecure code
Application code
vulnerability
https://www.csoonline.com/article/1272538/mlflow-vulnerability-enables-remote-machine-learning-model-theft-and-poisoning.html
Classic risks
Code review
Application Modules
Application Model Plugin
Data Modules
Data Sources
Users External Sources
Model Modules
Model
Input Handling
Model
Model
Output Handling
System Modules
Model and
Frameworks Code
Model and
Data Storage
Model Serving
Data Filtering
and Processing
Training, Tuning
and Evaluation
Require code review to
reduce the introduction
of security bugs and
mitigate insider-risk
code tampering
Controls
https://security.googleblog.com/2023/10/googles-reward-criteria-for-reporting.html
Establish a bug bounty
to help test your AI
systems
https://www.landh.tech/blog/20240304-google-hack-50000/
Controls
Securing AI requires implementation
of controls across the stack
Implementing both classical controls
and novel, AI-specific defenses is
critical to secure AI workflows
AI risks are a combination of classical
issues and novel AI-specific threats
Takeaways
Apply
Today: Review your AI workflows'
risks and controls to understand
your posture
In the next 6 months: Improve security
by adding additional controls
Today
AI workflows
security
Thanks for attending
Previously at RSA
AI cybersecurity
capabilities
Get the slides online
