.lu software verification & validation (SVV)
Automated Vulnerability Testing
Using Machine Learning and
Metaheuristic Search
PI: Lionel Briand
Researchers: Annibale Panichella, Cu Nguyen, Nadia Alshahwan
PhD Students: Dennis Appelt, Sadeeq Jan
Attack categories (X-Force Threat Intelligence Index 2017)
https://www.ibm.com/security/xforce/
Code Injection                      42 %
Manipulated data structures         32 %
Collect and analyze information      9 %
Indicator                            4 %
Employ probabilistic techniques      3 %
Manipulate system resources          3 %
Subvert access control               3 %
Abuse existing functionality         2 %
Engage in deceptive interactions     2 %
More than 40% of all attacks were injection attacks (e.g., SQLi)
Web Applications
Client → Server → SQL Database
Web Applications
Web form: Username = str1, Password = str2, OK
SQL query:
SELECT *
FROM Users WHERE
(usr = 'str1' AND psw = 'str2')
Result:
Name  Surname  …
John  Smith    …
Injection Attacks
Web form: Username = (empty), Password = ') OR 1=1 --, OK
SQL query:
SELECT *
FROM Users
WHERE (usr = '' AND
psw = '') OR 1=1 --
Query result:
Name  Surname  …
Aria  Stark    …
John  Snow     …
…     …        …
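The injection above can be reproduced in a few lines. This is a minimal sketch, assuming an in-memory SQLite database; the table contents, the function names `login_concatenated` and `login_parameterized`, and the stored passwords are illustrative assumptions and do not come from the slides. It contrasts a query built by string concatenation with a parameterized one.

```python
import sqlite3

# Hypothetical in-memory table that mirrors the Users table on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (usr TEXT, psw TEXT, name TEXT, surname TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?, ?)",
    [("aria", "needle", "Aria", "Stark"), ("john", "ghost", "John", "Snow")],
)


def login_concatenated(usr, psw):
    # Vulnerable: user input is pasted directly into the SQL text.
    query = (
        "SELECT name, surname FROM Users "
        f"WHERE (usr = '{usr}' AND psw = '{psw}')"
    )
    return conn.execute(query).fetchall()


def login_parameterized(usr, psw):
    # Safer: the driver passes the inputs as data, never as SQL syntax.
    query = "SELECT name, surname FROM Users WHERE (usr = ? AND psw = ?)"
    return conn.execute(query, (usr, psw)).fetchall()


payload = "') OR 1=1 --"          # the string typed into the web form
leaked = login_concatenated("", payload)
safe = login_parameterized("", payload)
print(leaked)   # every row of Users is returned
print(safe)     # no row matches the literal payload
```

With the concatenated query, the payload closes the parenthesis, appends an always-true condition, and comments out the rest of the statement, which is exactly the query shown on the slide.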
Protection Layers
Client → Server → SQL Database, protected by:
• Web Application Firewall
• Data input Validation and Sanitization
• Database Firewall
Protection Layers: Trade-Offs
Figure: detection accuracy versus overhead for three protection layers: front-end level defenses, web application firewall, and database level defenses.
Testing Challenges
• All protection layers need to be tested
• No single layer can possibly block all attacks
• They need to be effective together
• Testing is extensive: Large input space
• Different test techniques for different layers
• Many types of vulnerabilities
Testing Front-end
Web Applications for XMLi
Testing the Front-end (XMLi)
Input Strings (I1, I2, …, In) → Front-end System → Generated XML Messages → Back-end Systems (System 1, System 2, …, System n)
Security Mechanisms in Front-end
Web Applications
• Input Validation: rejects inputs
containing malicious characters (e.g., <)
• Input Sanitization: converts malicious
inputs to valid ones (e.g., deleting XML
tags)
• Other transformations: domain-specific
transformations (e.g., JSON to XML,
calculating age)
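The first two mechanisms above (rejecting an input versus converting it) can be sketched as follows. This is a hedged illustration only: the deny-list of characters, the function names `reject_if_malicious` and `strip_xml_tags`, and the tag-stripping regular expression are assumptions, not the checks used by any system discussed in the slides.

```python
import re

# Assumed deny-list of characters that are meaningful in XML.
MALICIOUS_CHARS = set("<>&'\"")


def reject_if_malicious(value):
    """Rejecting mechanism: refuse the whole input if it contains a listed character."""
    if any(ch in MALICIOUS_CHARS for ch in value):
        raise ValueError(f"rejected input: {value!r}")
    return value


def strip_xml_tags(value):
    """Converting mechanism: delete anything that looks like an XML tag."""
    return re.sub(r"<[^>]*>", "", value)


attempt = "tom@uni.lu</mail><role>admin</role><mail>"
print(strip_xml_tags(attempt))       # tags removed, remaining text kept
try:
    reject_if_malicious(attempt)
except ValueError as err:
    print(err)                       # the input is refused outright
```

The design difference matters for testing: a rejecting check leaves no generated message to inspect, while a converting check produces a transformed message that may still be exploitable.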
Testing of the Front-end WAs
Does the front-end system (SUT) allow the
generation of XML injection attacks?
• YES: the front-end is vulnerable
• NO: the front-end is secure
Testing of the Front-end WAs
Step 1: Create malicious XML messages
Step 2: Verify whether the SUT can generate them, by searching for input strings that produce them
Malicious XML message:
<user>
<username>Tom</username>
<password>m1U9q10</password>
<role>user</role>
<mail>role=Adm+ tom@uni.lu</mail>
</user>
Step 1: Generating Malicious Messages
Grammar-based generation: automatically generates malicious
messages for different types of XML injection attacks
Our tool: SOLMI (ISSTA'16)
Figure: example of a message generated by SOLMI
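To make grammar-based generation concrete, here is a minimal sketch with a toy grammar. The grammar, its symbol names, and the payload shape are hypothetical and are not taken from SOLMI; the sketch only illustrates how expanding production rules yields injection payloads for a field of the message.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical toy grammar: upper-case keys are non-terminals, everything else is literal text.
GRAMMAR = {
    "ATTACK": [["VALUE", "CLOSE", "INJECTED", "REOPEN"]],
    "VALUE": [["tom@uni.lu"], ["x"]],
    "CLOSE": [["</mail>"]],
    "INJECTED": [["<role>", "ROLE", "</role>"]],
    "ROLE": [["admin"], ["Adm"]],
    "REOPEN": [["<mail>"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol by picking one production at random."""
    if symbol not in GRAMMAR:
        return symbol                      # terminal: emit as-is
    production = rng.choice(GRAMMAR[symbol])
    return "".join(expand(part, rng) for part in production)


rng = random.Random(0)
payload = expand("ATTACK", rng)
message = f"<user><mail>{payload}</mail></user>"
print(message)

# The message stays well-formed but now carries an injected <role> element.
injected_role = ET.fromstring(message).find("role")
```

Each generated payload closes the current element, injects a new one, and reopens the element so that the overall message remains well-formed XML.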
Step 2: Searching for Input Strings
Goal: find candidate input strings (I1, …, In) for which the front-end generates the malicious XML message
• The front-end web application (SUT) is a black box
• The search space is huge: all possible input strings (I1, …, In)
Step 2: Searching for Input Strings
Search algorithm: Initial Solutions (random strings) → Evaluation → Selection → Crossover → Mutation → New Input Strings
Candidate input strings (I1, I2, …, In), e.g. Usr: Tom, Psw: m1U9q10, Email: "role=Adm"+tom@uni.lu, are submitted to the front-end system, which generates an XML message.
Evaluation: edit distance between the generated XML message and the target (malicious) XML message.
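The search loop can be sketched as a small genetic algorithm whose fitness is the edit distance between the generated message and the target. Everything concrete here is an assumption for illustration: `toy_front_end` is a stand-in for the black-box SUT, and the alphabet, population size, and operators are arbitrary choices, not the configuration evaluated in the slides.

```python
import random
import string


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def toy_front_end(email):
    # Hypothetical SUT: wraps the input in a <mail> element without any checks.
    return f"<mail>{email}</mail>"


ALPHABET = string.ascii_letters + string.digits + "<>/=@.+"


def mutate(s, rng):
    op = rng.choice(["insert", "delete", "replace"])
    pos = rng.randrange(len(s) + 1)
    if op == "insert" or not s:
        return s[:pos] + rng.choice(ALPHABET) + s[pos:]
    pos = min(pos, len(s) - 1)
    if op == "delete":
        return s[:pos] + s[pos + 1:]
    return s[:pos] + rng.choice(ALPHABET) + s[pos + 1:]


def crossover(a, b, rng):
    return a[:rng.randrange(len(a) + 1)] + b[rng.randrange(len(b) + 1):]


def search(target, front_end, rng, pop_size=20, generations=60):
    def fitness(s):
        return edit_distance(front_end(s), target)

    pop = ["".join(rng.choice(ALPHABET) for _ in range(rng.randrange(1, 10)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                       # evaluation
        if fitness(pop[0]) == 0:
            break                                   # the SUT generated the target
        parents = pop[:pop_size // 2]               # selection (elitist)
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=fitness)


target = toy_front_end("x</mail><role>Adm</role><mail>y")
best = search(target, toy_front_end, random.Random(0))
print(best, edit_distance(toy_front_end(best), target))
```

A distance of zero means the front-end can be driven to emit the malicious message; a small budget such as the one above may stop before reaching zero.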
Some Results
Figure: % of covered XMLi messages (0 to 100) per subject. Subjects: SBANK, SSBANK, XMLMAO, R, M; subject annotations: with validation, without validation, open source, industrial. Algorithms compared: Real-coded GA, Standard GA, Hill Climbing, Random Search.
Testing Web Application
Firewalls (WAFs)
Web Application Firewalls (WAFs)
Diagram: incoming requests (malicious and legitimate) → WAF → Server
WAF Rule Set
Figure: rule set of Apache ModSecurity
https://github.com/SpiderLabs/ModSecurity
Misconfigured WAFs
• False positive: a legitimate request is BLOCKED
• False negative: a malicious request is ALLOWED
Anatomy of SQLi attacks
Bypassing attack: ' OR"a"="a"#
Decomposition tree:
<START>
  <sQuoteContext>
    <sq>                '
    <wsp>               _
    <sqliAttack>
      <boolAttack>
        <opOR>          OR
        <boolTrueExpr>
          <binaryTrue>
            <dq> <ch> <dq> <opEq> <dq> <ch> <dq>    " a " = " a "
    <cmt>               #
Attack slices: S = { ', _, OR"a"="a", # }
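Once attacks are decomposed into slices, each attack can be encoded as a binary vector over the slice vocabulary. A minimal sketch follows; the second attack and the helper names `slice_vocabulary` and `encode` are illustrative assumptions, and the slices are given by hand rather than derived from a grammar.

```python
# Slices per attack, written out by hand (a real pipeline would derive them
# from the decomposition tree). A2 is a hypothetical second attack.
ATTACKS = {
    "A1": ["'", " ", 'OR"a"="a"', "#"],
    "A2": ["'", " ", "OR 1=1", "--"],
}


def slice_vocabulary(attacks):
    """All distinct slices, in a fixed (sorted) order: S1 ... Sn."""
    return sorted({s for slices in attacks.values() for s in slices})


def encode(slices, vocabulary):
    """1 if the attack contains the slice, 0 otherwise."""
    return [1 if s in slices else 0 for s in vocabulary]


vocabulary = slice_vocabulary(ATTACKS)
vectors = {name: encode(slices, vocabulary) for name, slices in ATTACKS.items()}
print(vocabulary)
print(vectors)
```

Pairing each vector with the observed outcome (passed or blocked by the firewall) yields a labelled data set for a classifier.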
Learning Attack Patterns
Training set (one row per attack, one column per slice):
      S1  S2  S3  S4  …  Sn  Outcome
A1     1   1   0   0  …   0  Passed
A2     0   1   0   0  …   0  Blocked
…      …   …   …   …  …   …  …
Am     1   1   1   1  …   1  Blocked
Decision tree: internal nodes test whether a slice (S1, S2, S3, S4, …, Sn) is present (Yes/No); leaves are Passed or Blocked.
Attack pattern (conditions along a path of the tree): S2 ∧ ¬ Sn ∧ S1
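How a pattern such as the one above falls out of a decision tree can be shown with a tiny, self-contained learner. This is a sketch under stated assumptions: it uses a plain ID3-style tree with entropy-based splits (the slides do not say which learner was used), and the five training rows are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def build_tree(rows, labels, features):
    """ID3-style tree over binary features; a leaf is just the outcome string."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    def gain(feature):
        remainder = 0.0
        for value in (0, 1):
            subset = [l for r, l in zip(rows, labels) if r[feature] == value]
            if subset:
                remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    branches = {}
    for value in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        if idx:
            branches[value] = build_tree([rows[i] for i in idx], [labels[i] for i in idx], rest)
        else:
            branches[value] = Counter(labels).most_common(1)[0][0]
    return (best, branches)


def passed_patterns(tree, path=()):
    """Collect the conjunction of conditions along every path to a Passed leaf."""
    if isinstance(tree, str):
        return [" ∧ ".join(path)] if tree == "Passed" else []
    feature, branches = tree
    return (passed_patterns(branches[0], path + ("¬" + feature,))
            + passed_patterns(branches[1], path + (feature,)))


# Made-up training set: 1 = slice present in the attack.
rows = [
    {"S1": 1, "S2": 1, "S3": 0},
    {"S1": 0, "S2": 1, "S3": 0},
    {"S1": 1, "S2": 1, "S3": 1},
    {"S1": 1, "S2": 0, "S3": 0},
    {"S1": 0, "S2": 0, "S3": 1},
]
labels = ["Passed", "Blocked", "Blocked", "Blocked", "Blocked"]

tree = build_tree(rows, labels, ["S1", "S2", "S3"])
patterns = passed_patterns(tree)
print(patterns)
```

Each returned conjunction describes a combination of present and absent slices that the learner associates with bypassing the firewall.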
Generating Attacks via ML and EAs
(μ+λ) Evolutionary Algorithm:
Initial Attacks → Slice attacks → Prepare Training Data → Build Classifier (Machine Learning: decision tree) → Mutate best attacks → Execute new attacks → back to Slice attacks
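The (μ+λ) selection scheme can be sketched in isolation. In this hedged example the classifier is replaced by a fixed set of slices assumed to be associated with blocking, and the groups of interchangeable slices are invented; only the (μ+λ) loop itself (λ offspring per generation, best μ of parents plus offspring survive) reflects the named technique.

```python
import random

# Hypothetical groups of interchangeable slices used by the mutation operator.
GROUPS = [
    ["'"],
    [" ", "/**/"],
    ["OR 1=1", 'OR"a"="a"', "OR 2>1"],
    ["--", "#"],
]
# Stand-in for the learned classifier: slices assumed to lead to "Blocked".
BLOCKED_SLICES = {"OR 1=1", "--"}


def fitness(attack):
    """Higher is better: fewer slices that the stand-in model links to blocking."""
    return -sum(s in BLOCKED_SLICES for s in attack)


def mutate(attack, rng):
    """Swap one slice for another member of its group."""
    pos = rng.randrange(len(attack))
    group = next(g for g in GROUPS if attack[pos] in g)
    mutated = list(attack)
    mutated[pos] = rng.choice(group)
    return tuple(mutated)


def mu_plus_lambda(initial, mu, lam, generations, rng):
    parents = sorted(initial, key=fitness, reverse=True)[:mu]
    for _ in range(generations):
        offspring = [mutate(rng.choice(parents), rng) for _ in range(lam)]
        # "Plus" selection: parents compete with their offspring.
        parents = sorted(parents + offspring, key=fitness, reverse=True)[:mu]
    return parents


initial = [("'", " ", "OR 1=1", "--")]
survivors = mu_plus_lambda(initial, mu=3, lam=6, generations=20, rng=random.Random(0))
print("".join(survivors[0]), fitness(survivors[0]))
```

Because parents survive alongside offspring, the best fitness never decreases from one generation to the next.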
Some Results
Figure: number of distinct attacks found, for Apache ModSecurity and for industrial WAFs (Industrial Case).
• ML techniques outperform
the random technique
• ML-Driven E superior to
other ML techniques
Machine Learning-driven attack generation led to more
distinct, successful attacks being discovered
Automated Repairing of
Vulnerable WAFs
Rule Set Customization
Customization is error-prone:
• Complex filter rules
• Limited time and resources
• Lack of automated tools
Rule customization is necessary:
• To protect from new threats
• To avoid false positives
Fixing Vulnerable WAFs
SQLi Attacks → Attack Decomposition → Machine Learning (DT) → Attack Patterns (as in the attack generation process) → New Regular Expressions
Existing Rule Set + New Regular Expressions → Fixed Rule Set
Measured by: # Blocked Attacks and # Blocked Legitimate Requests
Multi-Objective Optimization
Problem: select a subset of the regular expressions produced
by the decision tree so as to (1) maximize recall (blocked
attacks) and (2) minimize the false positive rate.
Figure: Pareto front of recall versus false positive rate.
Multi-Objective Genetic Algorithms
NSGA-II: Initial Solutions → Evaluation → Selection → Crossover → Mutation → (repeat)
Initial solutions (one bit per regular expression R1 … Rk; 1 = selected):
R1  R2  R3  R4  …  Rk
 1   1   0   0  …   0
 0   1   1   1  …   1
Solutions are evaluated and selected according to Pareto optimality
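The binary encoding and the two objectives can be illustrated on a toy instance. This sketch enumerates all subsets exhaustively instead of running NSGA-II, which is only feasible because there are three candidate expressions; the regular expressions, attacks, and benign requests are invented for the example.

```python
import re
from itertools import product

# Hypothetical candidate regular expressions (R1, R2, R3).
REGEXES = [re.compile(p) for p in (r"(?i)\bor\b\s*1\s*=\s*1", r"(?i)\bor\b", r"--|#")]
ATTACKS = ["' OR 1=1 --", "' OR\"a\"=\"a\"#", "') or 2>1 --"]
BENIGN = ["order #42", "Tom", "a or b", "tom@uni.lu"]


def evaluate(mask):
    """Return (recall, false positive rate) for the subset selected by a bit mask."""
    subset = [r for r, bit in zip(REGEXES, mask) if bit]

    def blocked(request):
        return any(r.search(request) for r in subset)

    recall = sum(blocked(a) for a in ATTACKS) / len(ATTACKS)
    fpr = sum(blocked(b) for b in BENIGN) / len(BENIGN)
    return recall, fpr


def dominates(p, q):
    """p dominates q: no worse on both objectives and different from q."""
    return p[0] >= q[0] and p[1] <= q[1] and p != q


def pareto_front(masks):
    scored = {m: evaluate(m) for m in masks}
    return {m: s for m, s in scored.items()
            if not any(dominates(t, s) for t in scored.values())}


front = pareto_front(list(product((0, 1), repeat=len(REGEXES))))
for mask, (recall, fpr) in sorted(front.items()):
    print(mask, round(recall, 2), fpr)
```

With k candidate expressions there are 2^k subsets, which is why a search algorithm such as NSGA-II replaces the exhaustive loop at realistic sizes.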
Some Results
Target WAF: ModSecurity, OWASP Core Rule Set
Target Operation: doPayment()
# Attacks = 1234
# Benign Req = 1567
Hypervolume(NSGA-II) > Hypervolume(RS)
Hypervolume Results
Figure: hypervolume (0.00 to 1.00) of NSGA-II versus Random, in two panels (ModSecurity and Industrial WAF); operations on the x-axes: Op1, Op2, Op3, Op3 and doPayment, expireTicket, simulatePayment.
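For two objectives, the hypervolume indicator is the area dominated by a front relative to a reference point. The sketch below assumes both objectives are written as minimization, (1 − recall, false positive rate), with reference point (1, 1); this normalization is an assumption for illustration and may differ from the setup behind the reported results.

```python
def hypervolume_2d(points, ref=(1.0, 1.0)):
    """Area dominated by a set of 2-D points (both objectives minimized)."""
    area = 0.0
    best_y = ref[1]
    for x, y in sorted(set(points)):          # sweep from the smallest first objective
        if x < ref[0] and y < best_y:         # skip dominated or out-of-range points
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


# Fronts given as (recall, false positive rate) pairs; values are made up.
front_a = [(1.0, 0.5), (0.5, 0.0)]
front_b = [(0.5, 0.5)]


def to_minimization(front):
    return [(1.0 - recall, fpr) for recall, fpr in front]


hv_a = hypervolume_2d(to_minimization(front_a))
hv_b = hypervolume_2d(to_minimization(front_b))
print(hv_a, hv_b)
```

A larger value means the front covers more of the objective space, so one number summarizes both how close the solutions are to the ideal point and how well they spread.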
Detecting Malicious SQL
Statements at Database Level
Using ML to Detect SQLi Statements
Phase 1: Training
SQL Legitimate Execution Logs → Parsing → Pruning → Edit Distance → Clustering
Phase 2: Testing (Detection)
SQL Security Testing Logs → Parsing → Pruning → Classification
40
Clustering
Incoming
SQL Statement 1
Incoming
SQL Statement 2
Detection Phase
41
Incoming
SQL Statement 1
Clustering
Incoming
SQL Statement 2
APPROVE
REJECT
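The two phases can be sketched end to end on toy data. Several simplifications are assumptions made for brevity: pruning is done with regular expressions rather than a SQL parser, each distinct pruned statement acts as its own cluster instead of running a clustering algorithm, and the threshold of 0 is arbitrary.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def prune(sql):
    """Replace literals with placeholders so only the statement structure remains."""
    sql = re.sub(r"'[^']*'", "?", sql)        # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)        # numeric literals
    return re.sub(r"\s+", " ", sql).strip().upper()


def train(legitimate_logs):
    # Stand-in for clustering: every distinct pruned statement is its own cluster.
    return sorted({prune(s) for s in legitimate_logs})


def classify(sql, templates, threshold=0):
    nearest = min(edit_distance(prune(sql), t) for t in templates)
    return "APPROVE" if nearest <= threshold else "REJECT"


templates = train([
    "SELECT * FROM Users WHERE (usr = 'tom' AND psw = 'm1U9q10')",
    "SELECT * FROM Users WHERE (usr = 'aria' AND psw = 'needle')",
])
legit = "SELECT * FROM Users WHERE (usr = 'john' AND psw = 'ghost')"
injected = "SELECT * FROM Users WHERE (usr = '' AND psw = '') OR 1=1 --')"
print(classify(legit, templates), classify(injected, templates))
```

The injected statement differs structurally from every legitimate template even after its literals are pruned, which is what the distance check picks up.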
Some Results
SUT                    Test Gen.  Recall  False Positive
HotelRS                Xavier     100%    0%
SugarCRM               Xavier     100%    0%
TaskFreak              Burpsuite  100%    0%
                       SqlMap     100%    0.1%
TheOrganizer           Burpsuite  100%    0.6%
                       SqlMap     100%    0.3%
Wordpress-newstat      Burpsuite  100%    0.2%
                       SqlMap     100%    0.2%
Wordpress-landingpage  SqlMap     100%    0.1%
Publications
Automatic Generation of Tests to Exploit XML Injection Vulnerabilities in Web Applications.
Jan, Sadeeq; Panichella, Annibale; Arcuri, Andrea; Briand, Lionel. To appear in IEEE Transactions on Software
Engineering (TSE), 2017

A Machine Learning-Driven Evolutionary Approach for Testing Web Application Firewalls.
Appelt, Dennis; Nguyen, Duy Cu; Panichella, Annibale; Briand, Lionel. To appear in IEEE Transactions on
Reliability (TR)

Automatically Repairing Web Application Firewalls Based on Successful SQL Injection Attacks.
Appelt, Dennis; Panichella, Annibale; Briand, Lionel. In IEEE 28th International Symposium on Software
Reliability Engineering (ISSRE 2017), Toulouse, France

Search-based Testing Approach for XML Injection Vulnerabilities in Web Applications.
Jan, Sadeeq; Nguyen, Duy Cu; Arcuri, Andrea; Briand, Lionel. In Proc. of the 10th IEEE International
Conference on Software Testing, Verification and Validation (ICST 2017), Tokyo, Japan

Automated and Effective Testing of Web Services for XML Injection Attacks.
Jan, Sadeeq; Nguyen, Duy Cu; Briand, Lionel. In Proc. of the International Symposium on Software Testing
and Analysis (ISSTA 2016), Saarbrücken, Germany

SOFIA: An Automated Security Oracle for Black-Box Testing of SQL-Injection Vulnerabilities.
Ceccato, Mariano; Nguyen, Duy Cu; Appelt, Dennis; Briand, Lionel. In Proc. of the 31st IEEE/ACM
International Conference on Automated Software Engineering (ASE 2016)

Known XML Vulnerabilities Are Still a Threat to Popular Parsers and Open Source Systems.
Jan, Sadeeq; Nguyen, Duy Cu; Briand, Lionel. In the 2015 IEEE International Conference on
Software Quality, Reliability & Security (QRS 2015), Vancouver, Canada

Behind an Application Firewall, Are We Safe from SQL Injection Attacks?
Appelt, Dennis; Nguyen, Duy Cu; Briand, Lionel. In Proc. of the 8th International Conference on
Software Testing, Verification, and Validation (ICST 2015)

Automated Testing for SQL Injection Vulnerabilities: An Input Mutation Approach.
Appelt, Dennis; Nguyen, Duy Cu; Briand, Lionel; Alshahwan, Nadia. In Proc. of the International
Symposium on Software Testing and Analysis (ISSTA 2014)