Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems

Fitash Ul Haq (1), Donghwan Shin (3), Lionel Briand (1, 2)
1) University of Luxembourg
2) University of Ottawa
3) University of Sheffield
Date: 19th May 2023

2
Introduction
DNN-Enabled System (DADS):
Composed of multiple DNNs capable of various tasks such as object tracking, object
classification, traffic light detection, and traffic sign detection
Examples: Self-Driving Cars, Autonomous Drones

3
Introduction
Testing DNN-Enabled System (DADS)
• Offline Testing: cannot find safety violations
• Online Testing: can find safety violations

4
Introduction
Challenges for Online Testing:
• Static Environment
• Many Safety Requirements
• Large Input Space
• 3rd Party DNNs
To address the challenges, we propose MORLOT (Many-Objective Reinforcement
Learning for Online Testing) by leveraging many-objective search and
Reinforcement Learning (RL)

5
Key Idea
[Figure: Many-Objective Search coupled with Reinforcement Learning (Q-learning) and the Environment. Labels: Action (e.g., increase speed), Reward (e.g., distance), Fitness Values (rewards)]
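As a rough illustration of the Q-learning part of this idea, the sketch below applies the standard tabular Q-learning update after a single environment step. The state names, action names, reward value, and the `alpha`/`gamma` settings are illustrative assumptions and do not come from the slides.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place; return the new value."""
    # Best value reachable from the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the old estimate toward the bootstrapped target.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

ACTIONS = ["increase_speed", "decrease_speed"]  # hypothetical action names
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_value = q_update(q_table, "s0", "increase_speed",
                     reward=1.0, next_state="s1", actions=ACTIONS)
```

With an all-zero table, the updated entry is simply `alpha * reward`.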

6
MORLOT: Overview
Many-Objective Reinforcement Learning for Online Testing
Inputs: Objectives (i.e., reqs), Q-tables
• Reset Environment
• Observe State
• Choose Action (randomly or using Q-tables)

7
Choose Action using Many-Objective Search
Actions are chosen based on the objective that is closest to being
satisfied
• O1 (Q-table QO_1): F_v = 0.3
• O2 (Q-table QO_2): F_v = 0.7
• O3 (Q-table QO_3): F_v = 0.5
Select Action: action from the Q-table of O2
(e.g., decrease speed of VIF)
VIF: Vehicle-In-Front
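The selection rule on this slide can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes that a higher F_v means an objective is closer to being satisfied, that the slide's values 0.3, 0.7, and 0.5 belong to O1, O2, and O3 in that order, and that each objective has its own Q-table. The state name, action names, and Q-values are hypothetical.

```python
def choose_action(fitness, q_tables, state):
    """Pick the objective with the highest fitness, then the best action in its Q-table."""
    target = max(fitness, key=fitness.get)   # objective closest to being satisfied
    q_row = q_tables[target][state]          # that objective's Q-values for this state
    action = max(q_row, key=q_row.get)       # greedy action for the target objective
    return target, action

fitness = {"O1": 0.3, "O2": 0.7, "O3": 0.5}  # example values from the slide
q_tables = {  # hypothetical Q-values for a single state "s0"
    "O1": {"s0": {"increase_vif_speed": 0.4, "decrease_vif_speed": 0.1}},
    "O2": {"s0": {"increase_vif_speed": 0.2, "decrease_vif_speed": 0.6}},
    "O3": {"s0": {"increase_vif_speed": 0.5, "decrease_vif_speed": 0.3}},
}
target, action = choose_action(fitness, q_tables, "s0")
```

Here O2 is selected because it has the highest F_v, and the action comes from O2's Q-table only; the other Q-tables are not consulted for this step.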

8
MORLOT: Overview
Inputs: Objectives (i.e., reqs), Q-tables
• Reset Environment
• Observe State
• Choose Action (randomly or using Q-tables)
• Take Action / Receive Rewards
• Update Q-tables / Archive
Output: Minimal Test Suite
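The steps in this overview can be sketched on a toy problem. Everything below is an assumption made for illustration and is not the authors' implementation: the environment is a single integer gap between two vehicles, the two objectives and their fitness functions are hypothetical, only uncovered objectives drive the search, and an objective counts as violated when its fitness reaches 1.0.

```python
import random
from collections import defaultdict

MAX_GAP = 5
ACTIONS = (-1, +1)  # hypothetical actions: shrink or widen the gap
OBJECTIVES = {      # hypothetical objectives; fitness 1.0 means "violated"
    "too_close": lambda gap: 1 - gap / MAX_GAP,
    "too_far": lambda gap: gap / MAX_GAP,
}

def morlot_sketch(episodes=50, steps=10, eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {o: defaultdict(float) for o in OBJECTIVES}  # one Q-table per objective
    archive = {}  # objective -> action sequence that violated it
    for _ in range(episodes):
        uncovered = [o for o in OBJECTIVES if o not in archive]
        if not uncovered:
            break
        gap, trace = 2, []  # reset environment
        for _ in range(steps):
            state = gap  # observe state
            fitness = {o: OBJECTIVES[o](state) for o in uncovered}
            if rng.random() < eps:  # choose action randomly...
                action = rng.choice(ACTIONS)
            else:                   # ...or from the Q-table of the closest objective
                target = max(fitness, key=fitness.get)
                action = max(ACTIONS, key=lambda a: q[target][(state, a)])
            gap = min(MAX_GAP, max(0, gap + action))  # take action
            trace.append(action)
            for o in uncovered:  # receive rewards, update Q-tables, archive
                reward = OBJECTIVES[o](gap)
                best_next = max(q[o][(gap, a)] for a in ACTIONS)
                q[o][(state, action)] += alpha * (
                    reward + gamma * best_next - q[o][(state, action)])
                if reward >= 1.0 and o not in archive:
                    archive[o] = list(trace)
            uncovered = [o for o in uncovered if o not in archive]
            if not uncovered:
                break
    return archive

suite = morlot_sketch()
```

The archive keeps one action sequence per violated objective, which plays the role of the minimal test suite. With `eps=0.0` this toy run is purely greedy and only ever reaches the `too_close` objective, which hints at why the random choice is part of the loop.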

9
Research Questions
RQ1: How do alternative approaches fare in terms of test effectiveness?
RQ2: How do alternative approaches fare in terms of test efficiency?

10
Case Study Subject
• Transfuser (DNN-enabled ADS):
Highest-ranked publicly available DNN-enabled ADS in the CARLA
Autonomous Driving Leaderboard Sensors Track
• CARLA (Simulator):
Open-source simulator based on the Unreal Engine designed to
support training, development, and validation of ADS

11
Safety and Functional Requirements
We use the following six safety and
functional requirements:
• R1: EV should not go out of lane;
• R2: EV should not collide with other vehicles;
• R3: EV should not collide with pedestrians;
• R4: EV should not collide with static meshes (e.g., traffic lights, traffic signs);
• R5: EV should reach its destination within the defined time budget;
• R6: EV should not violate traffic lights.
EV: Ego Vehicle

14
State and Action Definition
The state contains information about:
• Ego Vehicle (EV)
• Vehicle in Front (VIF)
• Pedestrians
• Weather conditions
• Fog conditions
• Lighting conditions

15
State and Action Definition
Possible Actions
• Changing throttle of Vehicle in Front (VIF)
• Changing steering of Vehicle in Front (VIF)
• Changing pedestrian movement
• Changing weather
• Changing fog
• Changing lighting
EV: Ego Vehicle, VIF: Vehicle-In-Front
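One way to picture these definitions in code is sketched below. The six actions mirror the list on this slide; the `State` fields follow the state slide, but their names, types, and example values are hypothetical, since the slides do not say how each element is encoded.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    """The six kinds of action listed on the slide."""
    CHANGE_VIF_THROTTLE = "change throttle of VIF"
    CHANGE_VIF_STEERING = "change steering of VIF"
    CHANGE_PEDESTRIAN_MOVEMENT = "change pedestrian movement"
    CHANGE_WEATHER = "change weather"
    CHANGE_FOG = "change fog"
    CHANGE_LIGHTING = "change lighting"

@dataclass(frozen=True)  # frozen, so a State is hashable and can index a Q-table
class State:
    ev: tuple           # hypothetical encoding of the Ego Vehicle
    vif: tuple          # hypothetical encoding of the Vehicle-In-Front
    pedestrians: tuple  # hypothetical encoding of pedestrians
    weather: str
    fog: str
    lighting: str

# Hypothetical example values, used only to show Q-table indexing.
state = State(ev=(0, 0), vif=(10, 0), pedestrians=(), weather="clear",
              fog="none", lighting="day")
q_table = {(state, Action.CHANGE_FOG): 0.0}
```

Keeping the state immutable makes it straightforward to use `(state, action)` pairs as dictionary keys in a tabular Q-learning setup.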

16
RQ1: Test Effectiveness
Online Testing of DADS
Metric: Test Suite Effectiveness (TSE)
$\mathit{TSE} = \frac{\#\ \text{of safety violations}}{\#\ \text{of safety requirements}}$
Setup:
• 4 hours
• 10 repetitions
• Three environments: Straight, Left-Turn, Right-Turn
Algorithms: MORLOT, MOSA, FITEST, Random Search (RS)
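As a small worked example of the TSE ratio, the sketch below counts each requirement at most once, which is an assumption about how the number of safety violations is tallied. The function name and the list of violated requirements are hypothetical.

```python
def compute_tse(violated, requirements):
    """Fraction of requirements for which at least one violation was found."""
    requirements = set(requirements)
    # Repeated violations of the same requirement are counted once (assumption).
    return len(set(violated) & requirements) / len(requirements)

REQUIREMENTS = ["R1", "R2", "R3", "R4", "R5", "R6"]
# Hypothetical outcome of one run: R2 was violated twice.
tse = compute_tse(["R1", "R2", "R2", "R5"], REQUIREMENTS)
```

In this hypothetical run, three of the six requirements are violated, so the TSE is 0.5.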

17
RQ1: Results
MORLOT is significantly more effective than random search and
alternative many-objective search approaches tailored for test suite
generation.

18
RQ2: Test Efficiency
Online Testing of DADS
Metric: Test Efficiency
• Average TSE over 10 runs
• Measured at 20-minute intervals
Setup:
• 4 hours
• 10 repetitions
• Three environments: Straight, Left-Turn, Right-Turn
Algorithms: MORLOT, MOSA, FITEST, Random Search (RS)
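The efficiency measurement can be pictured as a TSE curve per run, sampled every 20 minutes over the 4-hour budget and then averaged across runs. The sketch below assumes each run is summarised by the minute at which each requirement was first violated; that representation, the function names, and the two example runs are hypothetical.

```python
def tse_over_time(first_violation_minutes, n_requirements, budget=240, interval=20):
    """TSE at each checkpoint (every `interval` minutes up to `budget`)."""
    checkpoints = range(interval, budget + 1, interval)
    return [
        sum(1 for t in first_violation_minutes.values() if t <= c) / n_requirements
        for c in checkpoints
    ]

def average_curves(curves):
    """Point-wise mean of several equally long TSE curves."""
    return [sum(vals) / len(vals) for vals in zip(*curves)]

# Hypothetical runs: requirement -> minute of its first violation.
runs = [
    {"R1": 15, "R2": 90},
    {"R1": 30, "R5": 200},
]
curves = [tse_over_time(r, n_requirements=6) for r in runs]
avg = average_curves(curves)
```

A 4-hour budget sampled every 20 minutes gives 12 checkpoints, so `avg` has 12 entries, one average TSE value per checkpoint.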

19
RQ2: Results
MORLOT is significantly more efficient than random search and alternative approaches.
For any given budget, MORLOT achieves a significantly higher average TSE, and this
difference keeps increasing over time.
