Software Verification & Validation
Julian Thomé, Lwin Khin Shar,
Domenico Bianculli and Lionel Briand
Search-driven String Constraint Solving for
Vulnerability Detection
Injection vulnerabilities and XSS
are serious threats
A vulnerable example program
protected void authenticate() throws Exception {
    String user = req.getParameter("user");   // SOURCE
    String pin = req.getParameter("pin");     // SOURCE
    String token = req.getParameter("token"); // SOURCE
    Document doc = db.parse(new File("users.xml"));
    if (user.isEmpty() || pin.isEmpty() ||
            !token.matches("[0-9]{8}")) {
        // …
    } else {
        // pin is concatenated without surrounding quotes
        String q = "/users/user[@id='" +
            ESAPI.encoder().encodeForXPath(user) +
            "' and @pin=" +
            ESAPI.encoder().encodeForXPath(pin) +
            "]";
        // …
        NodeList nl = (NodeList) xpath.evaluate(q, doc, XPathConstants.NODESET); // SINK
        // …
    }
}
The program is vulnerable to XPath Injection attacks
With the inputs user = "eve" and pin = "0 or 1", the query becomes
"/users/user[@id='eve' and @pin=0 or 1]"
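The effect of the injected query can be reproduced with the JDK's own XPath engine. The following is a minimal sketch, not the authors' code: the class name, the in-memory user list and the helper methods are hypothetical, and the ESAPI encoder call is left out on the assumption (suggested by the resulting query on the slide) that the payload reaches the sink unchanged.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathInjectionDemo {
    // Hypothetical in-memory stand-in for users.xml.
    static final String USERS_XML =
        "<users><user id='alice' pin='1234'/><user id='bob' pin='5678'/></users>";

    // Builds the query by concatenation, as authenticate() does (encoder omitted: assumption).
    static String buildQuery(String user, String pin) {
        return "/users/user[@id='" + user + "' and @pin=" + pin + "]";
    }

    // Returns how many <user> nodes the query selects.
    static int countMatches(String query) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(USERS_XML)));
            NodeList nl = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            return nl.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String benign = buildQuery("alice", "1234");
        String attack = buildQuery("eve", "0 or 1");
        System.out.println(benign + " selects " + countMatches(benign) + " node(s)");
        System.out.println(attack + " selects " + countMatches(attack) + " node(s)");
    }
}
```

Because `or` binds more loosely than `and` in XPath 1.0, the predicate reads as `(@id='eve' and @pin=0) or 1`, and the constant `1` is true for every node, so the attack query selects all users regardless of the supplied id and pin.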
Vulnerability Analysis: State-of-the-Art
Program -> Symbolic Execution -> Path Conditions
Path Conditions + Threat Model -> Attack Conditions
Attack Conditions -> Constraint Solving -> SAT = vulnerable / UNSAT = not vulnerable
Limitation of Constraint Solvers
State-of-the-art constraint solvers provide only limited support for (complex) string operations:
- String replacement and/or sanitisation operations
- String libraries of programming languages provide hundreds of operations (e.g. Java String library, Apache Commons)
Workaround 1: Extending Solver
Constraint solvers could be extended to support new operations
Problems:
- Not trivial and requires expert knowledge
- Not scalable to the size of a complete string library of a modern programming language
Workaround 2: Re-expressing Constraints
Constraints could be re-expressed in terms of constraints that are natively supported by the constraint solver
Problem:
Increased complexity of the generated constraints, potentially leading to scalability issues
However, in practice …
Constraint solvers (CVC4, Z3-str2) fail or return an error
Our Approach:
Search-driven
String Constraint Solving
Attack Condition -> External Solver (CVC4, Z3-str2, …)
External Solver -> Hybrid Constraint Solving: constraint with unsupported operations, plus solutions of constraint with supported operations
Result: SAT / UNSAT / TIMEOUT
Hybrid Constraint Solving (ACO-Solver)
1. Automata-based solver solves all constraints it supports and returns a solution for every variable in terms of an FSM
2. Search-based solver searches for paths in the solution automata that make the constraint satisfiable
Automata-based solver reduces the search space
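To illustrate what "a solution in terms of an FSM" and "searching for paths in the solution automata" can mean, here is a minimal sketch. It is not ACO-Solver's implementation: the class, its methods and the hand-built automaton for `[0-9]{8}` are hypothetical, and a plain breadth-first search is swapped in for the ant-colony search to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SolutionAutomaton {
    // transitions.get(state).get(symbol) is the successor state.
    final Map<Integer, Map<Character, Integer>> transitions = new HashMap<>();
    final Set<Integer> accepting = new HashSet<>();
    final int start = 0;

    void addTransition(int from, char symbol, int to) {
        transitions.computeIfAbsent(from, k -> new TreeMap<>()).put(symbol, to);
    }

    // Hypothetical helper: automaton accepting exactly the strings matching [0-9]{8}.
    static SolutionAutomaton eightDigits() {
        SolutionAutomaton a = new SolutionAutomaton();
        for (int s = 0; s < 8; s++)
            for (char c = '0'; c <= '9'; c++)
                a.addTransition(s, c, s + 1);
        a.accepting.add(8);
        return a;
    }

    // Breadth-first search for a shortest path from the start state to an accepting
    // state; returns the string spelled by that path, or null if no string is accepted.
    String shortestAcceptedString() {
        Map<Integer, String> wordTo = new HashMap<>();
        Deque<Integer> queue = new ArrayDeque<>();
        wordTo.put(start, "");
        queue.add(start);
        while (!queue.isEmpty()) {
            int state = queue.remove();
            if (accepting.contains(state)) return wordTo.get(state);
            Map<Character, Integer> out =
                transitions.getOrDefault(state, Collections.<Character, Integer>emptyMap());
            for (Map.Entry<Character, Integer> e : out.entrySet()) {
                if (!wordTo.containsKey(e.getValue())) {
                    wordTo.put(e.getValue(), wordTo.get(state) + e.getKey());
                    queue.add(e.getValue());
                }
            }
        }
        return null;
    }

    // Runs the automaton on a word.
    boolean accepts(String word) {
        int state = start;
        for (char c : word.toCharArray()) {
            Integer next = transitions
                .getOrDefault(state, Collections.<Character, Integer>emptyMap()).get(c);
            if (next == null) return false;
            state = next;
        }
        return accepting.contains(state);
    }

    public static void main(String[] args) {
        System.out.println(eightDigits().shortestAcceptedString());
    }
}
```

Every path from the start state to an accepting state spells a string that satisfies the supported constraints, so a search restricted to such paths never has to consider strings outside the automaton's language.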
Search-driven String Constraint Solving
Attack Condition -> External Solver (CVC4, Z3-str2, …)
External Solver -> ACO-Solver (Automata-based Solver, then Search-based Solver): constraint with unsupported operations, plus solutions of constraint with supported operations
Result: SAT / UNSAT / TIMEOUT
Attack Condition Decomposition
Attack condition:
    len(v0_user) > 0
    len(v0_pin) > 0
    v1_user = encodeForXPath(v0_user)
    v0_q = concat("/users/user[@id='", v1_user)
    v1_q = concat(v0_q, "' and @pin=")
    v1_pin = encodeForXPath(v0_pin)
    v2_q = concat(v1_q, v1_pin)
    v3_q = concat(v2_q, "]")
    matches(v1_pin, "[0-9]+ [Oo][Rr] 1=1")
    matches(v0_token, "[0-9]{8}")
Resulting attack condition partitions:
Partition 1:
    matches(v0_token, "[0-9]{8}")
Partition 2:
    len(v0_user) > 0
    len(v0_pin) > 0
    v1_user = encodeForXPath(v0_user)
    v0_q = concat("/users/user[@id='", v1_user)
    v1_q = concat(v0_q, "' and @pin=")
    v1_pin = encodeForXPath(v0_pin)
    v2_q = concat(v1_q, v1_pin)
    v3_q = concat(v2_q, "]")
    matches(v1_pin, "[0-9]+ [Oo][Rr] 1=1")
Provide every attack condition partition as input to the
external solver
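One way to obtain such partitions is to group constraints that are connected through shared variables. The slides do not state the partitioning criterion, so this is an assumption; the sketch below uses a union-find structure, and the class, the `Constraint` type and the `example()` helper are hypothetical names introduced for illustration. Every constraint is assumed to mention at least one variable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AttackConditionPartitioner {
    // A constraint is reduced to its text and the variables it mentions.
    static final class Constraint {
        final String text;
        final List<String> vars;
        Constraint(String text, String... vars) {
            this.text = text;
            this.vars = Arrays.asList(vars);
        }
    }

    private final Map<String, String> parent = new HashMap<>();

    // Union-find lookup with path compression.
    private String find(String v) {
        parent.putIfAbsent(v, v);
        String p = parent.get(v);
        if (!p.equals(v)) {
            p = find(p);
            parent.put(v, p);
        }
        return p;
    }

    private void union(String a, String b) {
        parent.put(find(a), find(b));
    }

    // Groups constraints so that two constraints land in the same partition
    // exactly when they are linked by a chain of shared variables.
    List<List<Constraint>> partition(List<Constraint> constraints) {
        for (Constraint c : constraints)
            for (String v : c.vars) union(c.vars.get(0), v);
        Map<String, List<Constraint>> groups = new LinkedHashMap<>();
        for (Constraint c : constraints)
            groups.computeIfAbsent(find(c.vars.get(0)), k -> new ArrayList<>()).add(c);
        return new ArrayList<>(groups.values());
    }

    // The attack condition of the example program.
    static List<Constraint> example() {
        return Arrays.asList(
            new Constraint("len(v0_user) > 0", "v0_user"),
            new Constraint("len(v0_pin) > 0", "v0_pin"),
            new Constraint("v1_user = encodeForXPath(v0_user)", "v1_user", "v0_user"),
            new Constraint("v0_q = concat(\"/users/user[@id='\", v1_user)", "v0_q", "v1_user"),
            new Constraint("v1_q = concat(v0_q, \"' and @pin=\")", "v1_q", "v0_q"),
            new Constraint("v1_pin = encodeForXPath(v0_pin)", "v1_pin", "v0_pin"),
            new Constraint("v2_q = concat(v1_q, v1_pin)", "v2_q", "v1_q", "v1_pin"),
            new Constraint("v3_q = concat(v2_q, \"]\")", "v3_q", "v2_q"),
            new Constraint("matches(v1_pin, \"[0-9]+ [Oo][Rr] 1=1\")", "v1_pin"),
            new Constraint("matches(v0_token, \"[0-9]{8}\")", "v0_token"));
    }

    public static void main(String[] args) {
        for (List<Constraint> part : new AttackConditionPartitioner().partition(example())) {
            System.out.println("Partition:");
            for (Constraint c : part) System.out.println("    " + c.text);
        }
    }
}
```

On the example, the token constraint ends up alone because `v0_token` appears in no other constraint, while the user, pin and query variables are all linked through the `concat` chain.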
Invoke External Solver
Each attack condition partition is passed to the external solver (SAT/UNSAT/Crash)

Attack Condition Partition (Result: SAT)
    matches(v0_token, "[0-9]{8}")

Attack Condition Partition (Result: Crash)
    len(v0_user) > 0
    len(v0_pin) > 0
    v1_user = encodeForXPath(v0_user)
    v0_q = concat("/users/user[@id='", v1_user)
    v1_q = concat(v0_q, "' and @pin=")
    v1_pin = encodeForXPath(v0_pin)
    v2_q = concat(v1_q, v1_pin)
    v3_q = concat(v2_q, "]")
    matches(v1_pin, "[0-9]+ [Oo][Rr] 1=1")
All the Attack Condition partitions with unsupported
operations are solved by ACO-Solver
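The hand-over between the two solvers can be sketched as a small dispatch loop. This is an illustration under stated assumptions, not the tool's actual control flow: the `Result` enum, the `solve` method and the representation of solvers as functions over partition strings are hypothetical, a crash is taken as the only trigger for the fallback, and the sketch ignores that the external solver's partial solutions are also passed on.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class SolverDispatch {
    enum Result { SAT, UNSAT, TIMEOUT, CRASH }

    // Tries the external solver on every partition and hands a partition to the
    // fallback solver (ACO-Solver in the slides) when the external solver crashes.
    // Assumption: the partitions form a conjunction, so the overall answer is SAT
    // only if every partition is SAT; the first non-SAT result is returned.
    static Result solve(List<String> partitions,
                        Function<String, Result> externalSolver,
                        Function<String, Result> fallbackSolver) {
        for (String partition : partitions) {
            Result r = externalSolver.apply(partition);
            if (r == Result.CRASH) r = fallbackSolver.apply(partition);
            if (r != Result.SAT) return r;
        }
        return Result.SAT;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList(
            "matches(v0_token, \"[0-9]{8}\")",
            "v1_pin = encodeForXPath(v0_pin) ...");
        // Stand-in solvers: the external one crashes on encodeForXPath.
        Function<String, Result> external =
            p -> p.contains("encodeForXPath") ? Result.CRASH : Result.SAT;
        Function<String, Result> fallback = p -> Result.SAT;
        System.out.println(solve(partitions, external, fallback));
    }
}
```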
Search-based Solving
out = foo(i0 … in)
- An unsupported operation (foo) has to be invokable and its output (out) has to be observable
- Search for a set of inputs that generates an output (out) which satisfies all the constraints imposed on it
Ant Colony Optimisation (ACO)
- Suited for graph searching problems
- Stochastic in nature, which allows for escaping from local optima
- Inherent parallelism
- Inspired by the behaviour of ants (leaving pheromone traces on paths leading to food)
Fitness Function
- Assesses the quality of a potential solution
- A lower fitness implies a higher quality of the solution
- Different fitness functions for
1. Numeric constraints (Korel)
2. String constraints (Levenshtein)
3. Regular expressions (Myers and Miller)
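Two of the three fitness functions are short enough to sketch. The class and method names below are hypothetical, and the offset of 1 in the numeric distance is a common convention assumed here rather than taken from the slides; the regular-expression fitness (Myers and Miller) is not covered.

```java
class FitnessFunctions {
    // Korel-style distance for the numeric constraint a > b: 0 when the constraint
    // holds, otherwise the amount by which it is missed plus 1, so that a == b
    // still counts as unsatisfied.
    static int greaterThan(int a, int b) {
        return a > b ? 0 : (b - a) + 1;
    }

    // Levenshtein (edit) distance between s and t: the minimum number of
    // single-character insertions, deletions and substitutions turning s into t.
    static int levenshtein(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) prev[j] = j;
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int substitute = prev[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                curr[j] = Math.min(substitute, Math.min(prev[j] + 1, curr[j - 1] + 1));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        // Fitness of the candidate "" for the constraint len(v0_user) > 0.
        System.out.println(greaterThan("".length(), 0));
        // Fitness of a candidate string against a required value.
        System.out.println(levenshtein("0 or 1", "0 or 1=1"));
    }
}
```

In both cases a fitness of 0 means the constraint is satisfied, which matches the "lower is better" convention stated above.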
ACO Algorithm
1 Construction of solution
1.1 Build set of solution components
1.2 Determine fitness of solution components
1.3 Select the best solution components
2 Application of local search
3 Update of pheromone values
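The three steps above can be sketched for a constraint of the form foo(in) = target, where foo is a black-box operation. This is a simplified illustration, not ACO-Solver itself: all names and parameters are hypothetical, the search graph is a simple layered graph with one layer per string position instead of the solution automata, and the evaporation rate (0.1) and deposit rule are arbitrary choices.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.UnaryOperator;

class AcoSketch {
    // Levenshtein distance, used as the fitness of a string constraint (0 = satisfied).
    static int levenshtein(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) prev[j] = j;
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int sub = prev[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                curr[j] = Math.min(sub, Math.min(prev[j] + 1, curr[j - 1] + 1));
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[t.length()];
    }

    // Roulette-wheel choice: index i is picked with probability weights[i] / sum(weights).
    static int roulette(double[] weights, Random rnd) {
        double total = 0;
        for (double w : weights) total += w;
        double r = rnd.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) return i;
        }
        return weights.length - 1;
    }

    // Searches for an input of the given length over the alphabet whose image under
    // the black-box operation foo is as close as possible to target.
    static String search(UnaryOperator<String> foo, String target, String alphabet,
                         int length, int ants, int iterations, long seed) {
        Random rnd = new Random(seed);
        double[][] pheromone = new double[length][alphabet.length()];
        for (double[] row : pheromone) Arrays.fill(row, 1.0);
        char[] init = new char[length];
        Arrays.fill(init, alphabet.charAt(0));
        String best = new String(init);
        int bestFit = levenshtein(foo.apply(best), target);

        for (int it = 0; it < iterations && bestFit > 0; it++) {
            String iterBest = best;
            int iterFit = bestFit;
            // 1. Construction: each ant picks one symbol per position with probability
            //    proportional to the pheromone; the fittest candidate is kept.
            for (int a = 0; a < ants; a++) {
                StringBuilder sb = new StringBuilder(length);
                for (int pos = 0; pos < length; pos++)
                    sb.append(alphabet.charAt(roulette(pheromone[pos], rnd)));
                int fit = levenshtein(foo.apply(sb.toString()), target);
                if (fit < iterFit) { iterFit = fit; iterBest = sb.toString(); }
            }
            // 2. Local search: keep single-symbol changes that strictly improve fitness.
            char[] chars = iterBest.toCharArray();
            for (int pos = 0; pos < length; pos++) {
                for (int k = 0; k < alphabet.length(); k++) {
                    char old = chars[pos];
                    chars[pos] = alphabet.charAt(k);
                    int fit = levenshtein(foo.apply(new String(chars)), target);
                    if (fit < iterFit) iterFit = fit; else chars[pos] = old;
                }
            }
            best = new String(chars);
            bestFit = iterFit;
            // 3. Pheromone update: evaporate everywhere, reinforce the best solution.
            for (int pos = 0; pos < length; pos++) {
                for (int k = 0; k < alphabet.length(); k++) pheromone[pos][k] *= 0.9;
                pheromone[pos][alphabet.indexOf(best.charAt(pos))] += 1.0 / (1 + bestFit);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // foo is only invoked and observed, never inspected (here: upper-casing).
        String in = search(s -> s.toUpperCase(), "0 OR 1=1", "01 or=", 8, 10, 50, 7L);
        System.out.println(in + " -> " + in.toUpperCase());
    }
}
```

The search only calls `foo` and compares its output against the target, which is why the operation merely has to be invokable and observable; the best candidate is carried over between iterations, so its fitness never gets worse.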
Evaluation
Benchmark and Evaluation Settings
- 43 web programs from 9 Java Web applications/services (1 KLOC - 52 KLOC)
- Attack conditions for 64 vulnerable and 40 non-vulnerable paths with various vulnerability types (SQLi, XMLi, XPathi, LDAPi, XSS)
- The timeout for solving each attack condition was set to 30 s
RQ1: Benefit
How does the proposed approach improve the effectiveness of state-of-the-art solvers for solving constraints related to vulnerability detection?

Solver                   ✔    ∆    vuln. detected (of 64)
Z3-str2                  19   -    3 (4.7 %)
Z3-str2 + ACO-Solver     65   46   46 (71.9 %)
CVC4                     72   -    55 (85.9 %)
CVC4 + ACO-Solver        83   11   64 (100 %)

ACO-Solver significantly improves the recall (# detected vulnerabilities) of Z3-str2/CVC4
RQ2: Cost
Is the cost of using our technique
affordable in practice?
Solver                   time (s)
Z3-str2                  100.28
Z3-str2 + ACO-Solver     1,518.33
CVC4                     4.96
CVC4 + ACO-Solver        728.57

The cost of using our technique is affordable, because
- we can detect significantly more vulnerabilities
- vulnerability detection is an offline activity
RQ3: Role of the Automata-based solver
Does the automata-based solver
contribute to the effectiveness of the
search-based procedure?
Solver                      ✔    ∆    vuln. detected   t (s)
Z3-str2                     19   -    3 (4.7 %)        100.28
Z3-str2 + modACO-Solver     19   0    3 (4.7 %)        2,651.66
CVC4                        72   -    55 (85.9 %)      4.69
CVC4 + modACO-Solver        73   1    56 (87.5 %)      927.75

The automata-based solver plays a fundamental role in achieving a higher effectiveness
Conclusion
Making constraint solving for vulnerability detection practical
Additional Information:
https://github.com/julianthome/acosolver
Ad

Recommended

Applications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security Testing
Lionel Briand
 
System Testing of Timing Requirements based on Use Cases and Timed Automata
System Testing of Timing Requirements based on Use Cases and Timed Automata
Lionel Briand
 
A Search-based Testing Approach for XML Injection Vulnerabilities in Web Appl...
A Search-based Testing Approach for XML Injection Vulnerabilities in Web Appl...
Lionel Briand
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...
Lionel Briand
 
Extracting Domain Models from Natural-Language Requirements: Approach and Ind...
Extracting Domain Models from Natural-Language Requirements: Approach and Ind...
Lionel Briand
 
Testing of Cyber-Physical Systems: Diversity-driven Strategies
Testing of Cyber-Physical Systems: Diversity-driven Strategies
Lionel Briand
 
Incremental Reconfiguration of Product Specific Use Case Models for Evolving ...
Incremental Reconfiguration of Product Specific Use Case Models for Evolving ...
Lionel Briand
 
Automated Vulnerability Testing Using Machine Learning and Metaheuristic Search
Automated Vulnerability Testing Using Machine Learning and Metaheuristic Search
Lionel Briand
 
Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing Systems
Lionel Briand
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
Sung Kim
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Lionel Briand
 
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Lionel Briand
 
Automated Test Suite Generation for Time-Continuous Simulink Models
Automated Test Suite Generation for Time-Continuous Simulink Models
Lionel Briand
 
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Lionel Briand
 
Log-Based Slicing for System-Level Test Cases
Log-Based Slicing for System-Level Test Cases
Lionel Briand
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Lionel Briand
 
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Lionel Briand
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Sung Kim
 
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
Lionel Briand
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Lionel Briand
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
Sung Kim
 
Combining genetic algoriths and constraint programming to support stress test...
Combining genetic algoriths and constraint programming to support stress test...
Lionel Briand
 
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
Lionel Briand
 
Mining Assumptions for Software Components using Machine Learning
Mining Assumptions for Software Components using Machine Learning
Lionel Briand
 
SSBSE 2020 keynote
SSBSE 2020 keynote
Shiva Nejati
 
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Lionel Briand
 
A Natural Language Programming Approach for Requirements-based Security Testing
A Natural Language Programming Approach for Requirements-based Security Testing
Lionel Briand
 
Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
DefconRussia
 
Building High-Performance Language Implementations With Low Effort
Building High-Performance Language Implementations With Low Effort
Stefan Marr
 

More Related Content

What's hot (20)

Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing Systems
Lionel Briand
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
Sung Kim
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Lionel Briand
 
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Lionel Briand
 
Automated Test Suite Generation for Time-Continuous Simulink Models
Automated Test Suite Generation for Time-Continuous Simulink Models
Lionel Briand
 
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Lionel Briand
 
Log-Based Slicing for System-Level Test Cases
Log-Based Slicing for System-Level Test Cases
Lionel Briand
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Lionel Briand
 
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Lionel Briand
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Sung Kim
 
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
Lionel Briand
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Lionel Briand
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
Sung Kim
 
Combining genetic algoriths and constraint programming to support stress test...
Combining genetic algoriths and constraint programming to support stress test...
Lionel Briand
 
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
Lionel Briand
 
Mining Assumptions for Software Components using Machine Learning
Mining Assumptions for Software Components using Machine Learning
Lionel Briand
 
SSBSE 2020 keynote
SSBSE 2020 keynote
Shiva Nejati
 
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Lionel Briand
 
A Natural Language Programming Approach for Requirements-based Security Testing
A Natural Language Programming Approach for Requirements-based Security Testing
Lionel Briand
 
Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing Systems
Lionel Briand
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
Sung Kim
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Lionel Briand
 
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Lionel Briand
 
Automated Test Suite Generation for Time-Continuous Simulink Models
Automated Test Suite Generation for Time-Continuous Simulink Models
Lionel Briand
 
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Effective Test Suites for ! Mixed Discrete-Continuous Stateflow Controllers
Lionel Briand
 
Log-Based Slicing for System-Level Test Cases
Log-Based Slicing for System-Level Test Cases
Lionel Briand
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Lionel Briand
 
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Applying Product Line Use Case Modeling ! in an Industrial Automotive Embedde...
Lionel Briand
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Sung Kim
 
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
Lionel Briand
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Lionel Briand
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
Sung Kim
 
Combining genetic algoriths and constraint programming to support stress test...
Combining genetic algoriths and constraint programming to support stress test...
Lionel Briand
 
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
AN EMPIRICAL STUDY ON THE POTENTIAL USEFULNESS OF DOMAIN MODELS FOR COMPLETEN...
Lionel Briand
 
Mining Assumptions for Software Components using Machine Learning
Mining Assumptions for Software Components using Machine Learning
Lionel Briand
 
SSBSE 2020 keynote
SSBSE 2020 keynote
Shiva Nejati
 
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Lionel Briand
 
A Natural Language Programming Approach for Requirements-based Security Testing
A Natural Language Programming Approach for Requirements-based Security Testing
Lionel Briand
 

Similar to Search-driven String Constraint Solving for Vulnerability Detection (20)

Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
DefconRussia
 
Building High-Performance Language Implementations With Low Effort
Building High-Performance Language Implementations With Low Effort
Stefan Marr
 
StatsCraft 2015: Monitoring using riemann - Moshe Zada
StatsCraft 2015: Monitoring using riemann - Moshe Zada
StatsCraft
 
Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workl...
Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workl...
Databricks
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
OOO "Program Verification Systems"
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mind
Andreas Czakaj
 
Csw2016 gawlik bypassing_differentdefenseschemes
Csw2016 gawlik bypassing_differentdefenseschemes
CanSecWest
 
IBM IMPACT 2014 - AMC-1883 - Where's My Message - Analyze IBM WebSphere MQ Re...
IBM IMPACT 2014 - AMC-1883 - Where's My Message - Analyze IBM WebSphere MQ Re...
Peter Broadhurst
 
Designing Modern Streaming Data Applications
Designing Modern Streaming Data Applications
Arun Kejariwal
 
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Gavin Guo
 
CodeChecker summary 21062021
CodeChecker summary 21062021
Olivera Milenkovic
 
Ml5 svm and-kernels
Ml5 svm and-kernels
ankit_ppt
 
Drd secr final1_3
Drd secr final1_3
Devexperts
 
Certification
Certification
subhransu mishra
 
Cassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, Overview
Joshua McKenzie
 
Battista Biggio @ AISec 2013 - Is Data Clustering in Adversarial Settings Sec...
Battista Biggio @ AISec 2013 - Is Data Clustering in Adversarial Settings Sec...
Pluribus One
 
alexnet.pdf
alexnet.pdf
BhautikDaxini1
 
MongoDB World 2019: Life In Stitch-es
MongoDB World 2019: Life In Stitch-es
MongoDB
 
Android RenderScript on LLVM
Android RenderScript on LLVM
John Lee
 
Static analysis: looking for errors ... and vulnerabilities?
Static analysis: looking for errors ... and vulnerabilities?
Andrey Karpov
 
Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
DefconRussia
 
Building High-Performance Language Implementations With Low Effort
Building High-Performance Language Implementations With Low Effort
Stefan Marr
 
StatsCraft 2015: Monitoring using riemann - Moshe Zada
StatsCraft 2015: Monitoring using riemann - Moshe Zada
StatsCraft
 
Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workl...
Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workl...
Databricks
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
OOO "Program Verification Systems"
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mind
Andreas Czakaj
 
Csw2016 gawlik bypassing_differentdefenseschemes
Csw2016 gawlik bypassing_differentdefenseschemes
CanSecWest
 
IBM IMPACT 2014 - AMC-1883 - Where's My Message - Analyze IBM WebSphere MQ Re...
IBM IMPACT 2014 - AMC-1883 - Where's My Message - Analyze IBM WebSphere MQ Re...
Peter Broadhurst
 
Designing Modern Streaming Data Applications
Designing Modern Streaming Data Applications
Arun Kejariwal
 
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Gavin Guo
 
Ml5 svm and-kernels
Ml5 svm and-kernels
ankit_ppt
 
Drd secr final1_3
Drd secr final1_3
Devexperts
 
Cassandra 2.1 boot camp, Overview
Cassandra 2.1 boot camp, Overview
Joshua McKenzie
 
Battista Biggio @ AISec 2013 - Is Data Clustering in Adversarial Settings Sec...
Battista Biggio @ AISec 2013 - Is Data Clustering in Adversarial Settings Sec...
Pluribus One
 
MongoDB World 2019: Life In Stitch-es
MongoDB World 2019: Life In Stitch-es
MongoDB
 
Android RenderScript on LLVM
Android RenderScript on LLVM
John Lee
 
Static analysis: looking for errors ... and vulnerabilities?
Static analysis: looking for errors ... and vulnerabilities?
Andrey Karpov
 
Ad

More from Lionel Briand (20)

LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on...
LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on...
Lionel Briand
 
TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural N...
TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural N...
Lionel Briand
 
Automated Test Case Repair Using Language Models
Automated Test Case Repair Using Language Models
Lionel Briand
 
Automated Testing and Safety Analysis of Deep Neural Networks
Automated Testing and Safety Analysis of Deep Neural Networks
Lionel Briand
 
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
Lionel Briand
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
Lionel Briand
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
Lionel Briand
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
Lionel Briand
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
Lionel Briand
 
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Lionel Briand
 
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on...
LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on...
Lionel Briand
 
TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural N...
TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural N...
Lionel Briand
 
Automated Test Case Repair Using Language Models
Automated Test Case Repair Using Language Models
Lionel Briand
 
Automated Testing and Safety Analysis of Deep Neural Networks
Automated Testing and Safety Analysis of Deep Neural Networks
Lionel Briand
 
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
Lionel Briand
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
Lionel Briand
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
Lionel Briand
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
Lionel Briand
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
Lionel Briand
 
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Lionel Briand
 
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
Ad


Search-driven String Constraint Solving for Vulnerability Detection

  • 1. Search-driven String Constraint Solving for Vulnerability Detection. Julian Thomé, Lwin Khin Shar, Domenico Bianculli and Lionel Briand. Software Verification & Validation.
  • 2. Injection vulnerabilities and XSS are serious threats
  • 3. A vulnerable example program
      protected void authenticate() {
        String user = req.getParameter("user"); // SOURCE
        String pin = req.getParameter("pin"); // SOURCE
        String token = req.getParameter("token"); // SOURCE
        Document doc = db.parse(new File("users.xml"));
        if(user.isEmpty() || pin.isEmpty() ||
           !token.matches("[0-9]{8}")) {
          // …
        } else {
          String q = "/users/user[@id='" +
            ESAPI.encoder().encodeForXPath(user) +
            "' and @pin=" +
            ESAPI.encoder().encodeForXPath(pin) +
            "]";
          // …
          NodeList nl=(NodeList)xpath.evaluate(q); // SINK
          // …
        }
      }
  • 4. A vulnerable example program
      protected void authenticate() {
        String user = req.getParameter("user");
        String pin = req.getParameter("pin");
        String token = req.getParameter("token");
        Document doc = db.parse(new File("users.xml"));
        if(user.isEmpty() || pin.isEmpty() ||
           !token.matches("[0-9]{8}")) {
          // …
        } else {
          String q = "/users/user[@id='" +
            ESAPI.encoder().encodeForXPath(user) +
            "' and @pin=" +
            ESAPI.encoder().encodeForXPath(pin) +
            "]";
          // …
          NodeList nl=(NodeList)xpath.evaluate(q);
          // …
        }
      }
      Inputs: user = "eve", pin = "0 or 1"
      Resulting query: "/users/user[@id='eve' and @pin=0 or 1]"
      The program is vulnerable to XPath Injection attacks
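To make the attack on slide 4 concrete, here is a minimal, self-contained sketch that evaluates the injected query against a small in-memory document. The XML content, the class name and the `encode` method are assumptions made for illustration; in particular, `encode` is only a stand-in that escapes quotes and is not ESAPI's `encodeForXPath`. The point it illustrates is that the `pin` value is placed in an unquoted numeric context, so an input without any quotes can still change the query's logic.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XPathInjectionDemo {
    // Hypothetical stand-in for the sanitiser: escapes quotes only (NOT the ESAPI encoder).
    static String encode(String s) {
        return s.replace("'", "&apos;").replace("\"", "&quot;");
    }

    // Builds the query the same way the slide's program does.
    static String buildQuery(String user, String pin) {
        return "/users/user[@id='" + encode(user) + "' and @pin=" + encode(pin) + "]";
    }

    // Evaluates the query against an assumed two-user document and counts the matches.
    static int countMatches(String query) {
        String xml = "<users><user id='alice' pin='1234'/><user id='bob' pin='5678'/></users>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nl = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
            return nl.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Legitimate login: matches exactly the one user with that id and pin.
        System.out.println(countMatches(buildQuery("alice", "1234")));
        // Injected pin: the predicate becomes "(... and @pin=0) or 1", which is always true.
        System.out.println(countMatches(buildQuery("eve", "0 or 1")));
    }
}
```

Because `and` binds tighter than `or` in XPath, the injected predicate holds for every `user` element, so the unknown user "eve" still gets a non-empty result.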
  • 7. Limitation of Constraint Solvers
      State-of-the-art constraint solvers provide only limited support for (complex) string operations:
      - String replacement and/or sanitisation operations
      - String libraries of programming languages provide hundreds of operations (e.g. Java String library, Apache Commons)
  • 8. Workaround 1: Extending Solver
      Constraint solvers could be extended in order to support new operations
      Problems:
      - Not trivial and requires expert knowledge
      - Not scalable to the size of a complete string library of a modern programming language
  • 9. Workaround 2: Re-expressing Constraints
      Constraints could be re-expressed in terms of constraints which are natively supported by the constraint solver
      Problem: Increased complexity of the generated constraint, potentially leading to scalability issues
  • 11. Constraint solvers (CVC4, Z3-str2) fail or return an error
  • 13. Search-driven String Constraint Solving
      [Diagram; labels: Attack Condition; External Solver (CVC4, Z3-str2, …); constraint with unsupported operations; solutions of constraint with supported operations; Hybrid Constraint Solving; SAT/UNSAT/TIMEOUT]
  • 14. Hybrid Constraint Solving (ACO-Solver)
      1. Automata-based solver solves all constraints it supports and returns a solution for every variable in terms of an FSM
      2. Search-based solver searches for paths in the solution automata that make the constraint satisfiable
      Automata-based solver reduces the search space
  • 15. Search-driven String Constraint Solving
      [Diagram; labels: Attack Condition; External Solver (CVC4, Z3-str2, …); constraint with unsupported operations; solutions of constraint with supported operations; ACO-Solver (Automata-based Solver, Search-based Solver); SAT/UNSAT/TIMEOUT]
  • 16. Attack Condition Decomposition
      Attack condition:
        len(v0user) > 0
        len(v0pin) > 0
        matches(v0token, "[0-9]{8}")
        v1user = encodeForXPath(v0user)
        v0q = concat("/users/user[@id='", v1user)
        v1q = concat(v0q, "' and @pin=")
        v1pin = encodeForXPath(v0pin)
        v2q = concat(v1q, v1pin)
        v3q = concat(v2q, "]")
        matches(v1pin, "[0-9]+ [Oo][Rr] 1=1")
  • 17. Attack Condition Decomposition
      Partition 1:
        len(v0user) > 0
        len(v0pin) > 0
        v1user = encodeForXPath(v0user)
        v0q = concat("/users/user[@id='", v1user)
        v1q = concat(v0q, "' and @pin=")
        v1pin = encodeForXPath(v0pin)
        v2q = concat(v1q, v1pin)
        v3q = concat(v2q, "]")
        matches(v1pin, "[0-9]+ [Oo][Rr] 1=1")
      Partition 2:
        matches(v0token, "[0-9]{8}")
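The slides do not say how the decomposition is computed. One plausible reading, sketched below purely as an assumption, is that constraints sharing a variable (directly or transitively) end up in the same partition. The class name, the `partition` and `example` methods, and the hand-written variable sets are all hypothetical; a real tool would extract the variables from the parsed constraints.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AttackConditionPartitioner {
    // Groups constraints so that two constraints land in the same partition
    // whenever they are connected through shared variables.
    static List<List<String>> partition(Map<String, Set<String>> constraints) {
        List<Set<String>> groupVars = new ArrayList<>();
        List<List<String>> groups = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : constraints.entrySet()) {
            Set<String> vars = new HashSet<>(e.getValue());
            List<String> merged = new ArrayList<>();
            // Absorb every existing group that shares a variable with this constraint.
            for (int i = groups.size() - 1; i >= 0; i--) {
                if (!Collections.disjoint(groupVars.get(i), e.getValue())) {
                    vars.addAll(groupVars.remove(i));
                    merged.addAll(0, groups.remove(i));
                }
            }
            merged.add(e.getKey());
            groupVars.add(vars);
            groups.add(merged);
        }
        return groups;
    }

    // The attack condition from the slides, with variable sets written out by hand.
    static Map<String, Set<String>> example() {
        Map<String, Set<String>> c = new LinkedHashMap<>();
        c.put("len(v0user) > 0", Set.of("v0user"));
        c.put("len(v0pin) > 0", Set.of("v0pin"));
        c.put("matches(v0token, \"[0-9]{8}\")", Set.of("v0token"));
        c.put("v1user = encodeForXPath(v0user)", Set.of("v1user", "v0user"));
        c.put("v0q = concat(\"/users/user[@id='\", v1user)", Set.of("v0q", "v1user"));
        c.put("v1q = concat(v0q, \"' and @pin=\")", Set.of("v1q", "v0q"));
        c.put("v1pin = encodeForXPath(v0pin)", Set.of("v1pin", "v0pin"));
        c.put("v2q = concat(v1q, v1pin)", Set.of("v2q", "v1q", "v1pin"));
        c.put("v3q = concat(v2q, \"]\")", Set.of("v3q", "v2q"));
        c.put("matches(v1pin, \"[0-9]+ [Oo][Rr] 1=1\")", Set.of("v1pin"));
        return c;
    }

    public static void main(String[] args) {
        for (List<String> p : partition(example())) {
            System.out.println(p.size() + " constraint(s): " + p);
        }
    }
}
```

On this input the sketch yields the same two partitions as slide 17: the token check stands alone because `v0token` appears in no other constraint, while the remaining nine constraints are linked through the query variables.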
  • 18. Search-driven String Constraint Solving
      Provide every attack condition partition as input to the external solver
      [Diagram; labels: Attack Condition; External Solver (CVC4, Z3-str2, …); constraint with unsupported operations; solutions of constraint with supported operations; ACO-Solver (Automata-based Solver, Search-based Solver); SAT/UNSAT/TIMEOUT]
  • 19. Invoke External Solver
      Attack condition partition (result: SAT/UNSAT/Crash):
        len(v0user) > 0
        len(v0pin) > 0
        v1user = encodeForXPath(v0user)
        v0q = concat("/users/user[@id='", v1user)
        v1q = concat(v0q, "' and @pin=")
        v1pin = encodeForXPath(v0pin)
        v2q = concat(v1q, v1pin)
        v3q = concat(v2q, "]")
        matches(v1pin, "[0-9]+ [Oo][Rr] 1=1")
      Attack condition partition (result: SAT/UNSAT/Crash):
        matches(v0token, "[0-9]{8}")
  • 20. Invoke External Solver
      Attack condition partition (result: SAT):
        matches(v0token, "[0-9]{8}")
      Attack condition partition (result: Crash):
        len(v0user) > 0
        len(v0pin) > 0
        v1user = encodeForXPath(v0user)
        v0q = concat("/users/user[@id='", v1user)
        v1q = concat(v0q, "' and @pin=")
        v1pin = encodeForXPath(v0pin)
        v2q = concat(v1q, v1pin)
        v3q = concat(v2q, "]")
        matches(v1pin, "[0-9]+ [Oo][Rr] 1=1")
  • 21. Search-driven String Constraint Solving
      All the attack condition partitions with unsupported operations are solved by ACO-Solver
      [Diagram; labels: Attack Condition; External Solver (CVC4, Z3-str2, …); constraint with unsupported operations; solutions of constraint with supported operations; ACO-Solver (Automata-based Solver, Search-based Solver); SAT/UNSAT/TIMEOUT]
  • 22. Search-based Solving: out = foo(i0 … in)
      - An unsupported operation (foo) has to be invokable and its output (out) has to be observable
      - Search for a set of inputs that generates an output (out) satisfying all the constraints imposed on it
  • 23. Ant Colony Optimisation (ACO)
      - Suited for graph searching problems
      - Stochastic in nature, which allows for escaping from local optima
      - Inherent parallelism
      - Inspired by the behaviour of ants (leaving pheromone traces on paths leading to food)
  • 24. Fitness Function
      - Assesses the quality of a potential solution
      - A lower fitness implies a higher quality of the solution
      - Different fitness functions for
        1. Numeric constraints (Korel)
        2. String constraints (Levenshtein)
        3. Regular expressions (Myers and Miller)
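As a rough illustration of the first two fitness kinds named on slide 24, the sketch below shows a Korel-style branch distance for a numeric constraint of the form a > b and the Levenshtein edit distance for string equality. The class and method names are hypothetical, the offset of 1 in the branch distance is a common convention rather than something stated in the slides, and the regular-expression fitness (Myers and Miller) is left out because it needs considerably more machinery. In both functions a value of 0 means the constraint is satisfied.

```java
class FitnessSketch {
    // Korel-style branch distance for "a > b": 0 when satisfied,
    // otherwise the gap to satisfaction plus an assumed constant offset of 1.
    static double greaterThan(double a, double b) {
        return a > b ? 0.0 : (b - a) + 1.0;
    }

    // Levenshtein edit distance between s and t (insertions, deletions, substitutions),
    // computed row by row with dynamic programming.
    static int levenshtein(String s, String t) {
        int[] prev = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s.length(); i++) {
            int[] cur = new int[t.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        // Fitness of the empty string for the constraint len(v0user) > 0.
        System.out.println(greaterThan("".length(), 0));
        // Distance between a candidate and a required string value.
        System.out.println(levenshtein("eve", "evil"));
    }
}
```

Both functions give the search a gradient: a candidate that is closer to satisfying the constraint gets a strictly lower value than one that is further away, which is what lets the search prefer it.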
  • 25. ACO Algorithm
      1 Construction of solution
        1.1 Build set of solution components
        1.2 Determine fitness of solution components
        1.3 Select the best solution components
      2 Application of local search
      3 Update of pheromone values
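The three steps on slide 25 can be sketched on a toy problem. This is not the ACO-Solver implementation: the operation `foo` (upper-casing), the per-position pheromone table, the mismatch-count fitness and all parameter values are assumptions chosen to keep the example short. In the approach described in the slides the ants search paths in solution automata, whereas here they simply pick one character per position. The fitness is deliberately simple, so local search alone is enough to guarantee convergence here; the sketch is meant to show the structure of the loop, not the search power of ACO.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Random;

class AcoStringSearchSketch {
    static final char[] ALPHABET = "abcdefghijklmnopqrstuvwxyz".toCharArray();

    // Hypothetical "unsupported" operation: it can be invoked and its output observed.
    static String foo(String in) {
        return in.toUpperCase(Locale.ROOT);
    }

    // Toy fitness: number of positions where foo(candidate) differs from target (0 = solved).
    static int fitness(String candidate, String target) {
        String out = foo(candidate);
        int d = 0;
        for (int i = 0; i < target.length(); i++) {
            if (out.charAt(i) != target.charAt(i)) d++;
        }
        return d;
    }

    // Step 1: an ant builds a candidate, choosing each character with
    // probability proportional to its pheromone value.
    static String construct(double[][] tau, Random rnd) {
        StringBuilder sb = new StringBuilder();
        for (double[] row : tau) {
            double sum = 0;
            for (double t : row) sum += t;
            double r = rnd.nextDouble() * sum;
            int k = 0;
            while (k < row.length - 1 && (r -= row[k]) >= 0) k++;
            sb.append(ALPHABET[k]);
        }
        return sb.toString();
    }

    // Step 2: local search, returning the first single-character change that lowers the fitness.
    static String localSearch(String s, String target) {
        int base = fitness(s, target);
        for (int i = 0; i < s.length(); i++) {
            for (char c : ALPHABET) {
                String n = s.substring(0, i) + c + s.substring(i + 1);
                if (fitness(n, target) < base) return n;
            }
        }
        return s;
    }

    // Finds an input x with foo(x) == target; target is assumed to consist of A-Z only.
    static String solve(String target, int ants, int maxIter, long seed) {
        Random rnd = new Random(seed);
        double[][] tau = new double[target.length()][ALPHABET.length];
        for (double[] row : tau) Arrays.fill(row, 1.0);
        String best = null;
        for (int it = 0; it < maxIter; it++) {
            for (int a = 0; a < ants; a++) {               // 1. construction and selection
                String s = construct(tau, rnd);
                if (best == null || fitness(s, target) < fitness(best, target)) best = s;
            }
            best = localSearch(best, target);              // 2. local search
            if (fitness(best, target) == 0) break;
            for (int i = 0; i < tau.length; i++) {         // 3. pheromone update
                for (int k = 0; k < ALPHABET.length; k++) tau[i][k] *= 0.9; // evaporation
                tau[i][best.charAt(i) - 'a'] += 1.0;       // reinforce the best candidate
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(solve("EVE", 5, 10, 42L));
    }
}
```

The solver never inspects how `foo` works. It only invokes it and scores the observed output, which mirrors the requirement on slide 22 that an unsupported operation be invokable and its output observable.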
  • 27. Benchmark and Evaluation Settings
      - 43 web programs from 9 Java Web applications/services (1 KLOC - 52 KLOC)
      - Attack conditions for 64 vulnerable and 40 non-vulnerable paths with various vulnerability types (SQLi, XMLi, XPathi, LDAPi, XSS)
      - The timeout for solving each attack condition was set to 30s
  • 28. RQ1: Benefit
      How does the proposed approach improve the effectiveness of state-of-the-art solvers for solving constraints related to vulnerability detection?
  • 29. RQ1: Benefit
      Solver                  ✔    ∆    vuln. detected
      Z3-str2                 19   -    3 (4.7 %)
      Z3-str2 + ACO-Solver    65   46   46 (71.9 %)
      CVC4                    72   -    55 (85.9 %)
      CVC4 + ACO-Solver       83   11   64 (100 %)
      ACO-Solver significantly improves the recall (# detected vulnerabilities) of Z3-str2/CVC4
  • 30. RQ2: Cost
      Is the cost of using our technique affordable in practice?
  • 31. RQ2: Cost
      Solver                  time (s)
      Z3-str2                 100.28
      Z3-str2 + ACO-Solver    1518.33
      CVC4                    4.96
      CVC4 + ACO-Solver       728.57
      The cost of using our technique is affordable, because
      - we can detect significantly more vulnerabilities
      - vulnerability detection is an offline activity
  • 32. RQ3: Role of the Automata-based solver
      Does the automata-based solver contribute to the effectiveness of the search-based procedure?
  • 33. RQ3: Role of the Automata-based solver
      Solver                     ✔    ∆    vuln. detected   t(s)
      Z3-str2                    19   -    3 (4.7 %)        100.28
      Z3-str2 + modACO-Solver    19   0    3 (4.7 %)        2651.66
      CVC4                       72   -    55 (85.9 %)      4.69
      CVC4 + modACO-Solver       73   1    56 (87.5 %)      927.75
      The automata-based solver plays a fundamental role in achieving a higher effectiveness
  • 35. Making constraint solving for vulnerability detection practical
      Additional Information: https://github.com/julianthome/acosolver