Application of Reinforcement Learning in Network Routing By Chaopin Zhu
Machine Learning: supervised learning, unsupervised learning, reinforcement learning
Supervised Learning. Feature: learning with a teacher. Phases: training phase, testing phase. Applications: pattern recognition, function approximation.
Unsupervised Learning. Feature: learning without a teacher. Applications: feature extraction, other preprocessing.
Reinforcement Learning. Feature: learning with a critic. Applications: optimization, function approximation.
Elements of Reinforcement Learning: agent, environment, policy, reward function, value function, model of environment (optional)
Reinforcement Learning Problem
Markov Decision Process (MDP). Definition: a reinforcement learning task that satisfies the Markov property. An MDP is characterized by its transition probabilities.
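The formulas on this slide did not survive extraction. As a reference point only, the Markov property and the transition probabilities are usually written as below, using this deck's state symbol x. The symbols $P^a_{xx'}$ and $R^a_{xx'}$ are assumed textbook notation, not recovered from the slide.

```latex
% Markov property: the next state and reward depend only on the current state and action
\Pr\{x_{t+1}=x',\, r_{t+1}=r \mid x_t, a_t, r_t, \dots, r_1, x_0, a_0\}
  = \Pr\{x_{t+1}=x',\, r_{t+1}=r \mid x_t, a_t\}

% Transition probabilities and expected rewards of a finite MDP
P^{a}_{xx'} = \Pr\{x_{t+1}=x' \mid x_t=x,\, a_t=a\}
R^{a}_{xx'} = E\{r_{t+1} \mid x_t=x,\, a_t=a,\, x_{t+1}=x'\}
```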
An Example of MDP
Markov Decision Process (cont.): parameters, value functions
Elementary Methods for the Reinforcement Learning Problem: dynamic programming, Monte Carlo methods, temporal-difference learning
Bellman’s Equations
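The equations on this slide were lost in extraction. For orientation, the standard textbook form of the Bellman equations is sketched below with this deck's state symbol x. The notation ($\pi(x,a)$ for the policy, $P^a_{xx'}$ for transition probabilities, $R^a_{xx'}$ for expected rewards, $\gamma$ for the discount factor) is an assumption and may differ from what the slide showed.

```latex
% Bellman equation for the value function of a policy \pi
V^{\pi}(x) = \sum_{a} \pi(x,a) \sum_{x'} P^{a}_{xx'} \left[ R^{a}_{xx'} + \gamma V^{\pi}(x') \right]

% Bellman optimality equations
V^{*}(x)   = \max_{a} \sum_{x'} P^{a}_{xx'} \left[ R^{a}_{xx'} + \gamma V^{*}(x') \right]
Q^{*}(x,a) = \sum_{x'} P^{a}_{xx'} \left[ R^{a}_{xx'} + \gamma \max_{a'} Q^{*}(x',a') \right]
```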
Dynamic Programming Methods: policy evaluation, policy improvement
Dynamic Programming (cont.). E: policy evaluation. I: policy improvement. Methods: policy iteration, value iteration.
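As an illustration of value iteration, which folds policy evaluation and policy improvement into a single backup, here is a minimal Python sketch. The dictionary layout of `P`, the function name `value_iteration`, and the three-state toy MDP are all hypothetical and chosen only for this example.

```python
# Minimal value-iteration sketch (assumed data layout, not from the slides).
# P[x][a] is a list of (probability, next_state, reward) triples;
# a state with no actions is treated as terminal.

def backup(P, V, x, a, gamma):
    """Expected one-step return of taking action a in state x."""
    return sum(p * (r + gamma * V[x2]) for p, x2, r in P[x][a])

def value_iteration(P, gamma=0.9, tol=1e-8):
    V = {x: 0.0 for x in P}
    while True:
        delta = 0.0
        for x in P:
            if not P[x]:          # terminal state keeps value 0
                continue
            best = max(backup(P, V, x, a, gamma) for a in P[x])
            delta = max(delta, abs(best - V[x]))
            V[x] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values (policy improvement).
    policy = {x: max(P[x], key=lambda a: backup(P, V, x, a, gamma))
              for x in P if P[x]}
    return V, policy

# Hypothetical toy MDP: state 2 is terminal.
P = {
    0: {"slow": [(1.0, 1, 0.0)], "fast": [(1.0, 2, 1.0)]},
    1: {"go": [(1.0, 2, 10.0)]},
    2: {},
}
V, policy = value_iteration(P)
```

In this toy problem the detour through state 1 is worth more than the immediate reward, so the greedy policy in state 0 picks `"slow"`.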
Monte Carlo Methods. Features: learn from experience; do not need complete transition probabilities. Idea: partition experience into episodes; average the sample returns; update on an episode-by-episode basis.
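The idea of averaging sample returns episode by episode can be sketched as first-visit Monte Carlo prediction. This is a minimal illustration under stated assumptions: the episode format (a list of `(state, reward)` pairs, with the reward received on leaving that state) and the function name `first_visit_mc` are hypothetical.

```python
from collections import defaultdict

def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(x) as the average return following the first visit to x."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        G = 0.0
        first_return = {}
        # Walk backwards so G accumulates the discounted return from each step;
        # the last overwrite for a state corresponds to its earliest visit.
        for state, reward in reversed(episode):
            G = reward + gamma * G
            first_return[state] = G
        # Update the averages once per episode.
        for state, g in first_return.items():
            returns_sum[state] += g
            returns_count[state] += 1
    return {x: returns_sum[x] / returns_count[x] for x in returns_sum}

# Two hypothetical episodes over states "A" and "B".
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 1.0), ("A", 1.0), ("B", 0.0)],
]
V = first_visit_mc(episodes)
```

No transition probabilities appear anywhere in the code; only observed returns are used.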
Temporal-Difference Learning. Features (a combination of Monte Carlo and DP ideas): learn from experience (Monte Carlo); update estimates based in part on other learned estimates (DP). The TD($\lambda$) algorithm seamlessly integrates TD and Monte Carlo methods.
TD(0) Learning
Initialize $V(x)$ arbitrarily, and $\pi$ to the policy to be evaluated
Repeat (for each episode):
    Initialize $x$
    Repeat (for each step of episode):
        $a \leftarrow$ action given by $\pi$ for $x$
        Take action $a$; observe reward $r$ and next state $x'$
        $V(x) \leftarrow V(x) + \alpha [r + \gamma V(x') - V(x)]$
        $x \leftarrow x'$
    until $x$ is terminal
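A minimal Python sketch of tabular TD(0) policy evaluation follows. The three-state chain environment, the fixed "always move right" policy, and the parameter values are hypothetical and serve only to make the example runnable.

```python
from collections import defaultdict

TERMINAL = 2  # hypothetical chain 0 -> 1 -> 2, where state 2 is terminal

def policy(x):
    """Policy to be evaluated: always move right (assumed for this sketch)."""
    return "right"

def step(x, a):
    """Deterministic toy environment: reward 1 on entering the terminal state."""
    x_next = x + 1
    reward = 1.0 if x_next == TERMINAL else 0.0
    return reward, x_next

def td0(episodes=500, alpha=0.1, gamma=1.0):
    V = defaultdict(float)            # V(x) initialized to 0
    for _ in range(episodes):
        x = 0                         # initialize x
        while x != TERMINAL:
            a = policy(x)
            r, x_next = step(x, a)
            # TD(0) update: move V(x) toward the bootstrapped target r + gamma * V(x')
            V[x] += alpha * (r + gamma * V[x_next] - V[x])
            x = x_next
    return V

V = td0()
```

With an undiscounted reward of 1 at the end of the chain, both non-terminal values approach 1, while the terminal state keeps value 0.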
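Q-Learning
Initialize $Q(x, a)$ arbitrarily
Repeat (for each episode):
    Initialize $x$
    Repeat (for each step of episode):
        Choose $a$ from $x$ using a policy derived from $Q$ (e.g., $\varepsilon$-greedy)
        Take action $a$; observe $r$, $x'$
        $Q(x, a) \leftarrow Q(x, a) + \alpha [r + \gamma \max_{a'} Q(x', a') - Q(x, a)]$
        $x \leftarrow x'$
    until $x$ is terminal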
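The Q-learning loop can be sketched in Python as follows. The four-state corridor, the epsilon-greedy exploration rule, and all parameter values are assumptions made for illustration; they are not taken from the slides.

```python
import random
from collections import defaultdict

ACTIONS = (+1, -1)   # move right or left along a hypothetical corridor
GOAL = 3             # states 0..3, state 3 is terminal

def step(x, a):
    """Toy environment: walls at both ends, reward 1 on reaching GOAL."""
    x_next = min(max(x + a, 0), GOAL)
    reward = 1.0 if x_next == GOAL else 0.0
    return reward, x_next

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)                      # Q(x, a) initialized to 0
    for _ in range(episodes):
        x = 0
        while x != GOAL:
            # Epsilon-greedy policy derived from Q.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: Q[(x, b)])
            r, x_next = step(x, a)
            # Off-policy target uses the best action in the next state.
            best_next = max(Q[(x_next, b)] for b in ACTIONS)
            Q[(x, a)] += alpha * (r + gamma * best_next - Q[(x, a)])
            x = x_next
    return Q

Q = q_learning()
```

Because the target takes the maximum over next actions, the learned values approximate the optimal ones even though the behavior policy keeps exploring.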
Q-Routing
$Q_x(y, d)$: estimated time that a packet would take to reach the destination node $d$ from the current node $x$ via $x$'s neighbor node $y$
$T_y(d)$: $y$'s estimate of the time remaining in the trip
$q_y$: queuing time in node $y$
$T_{xy}$: transmission time between $x$ and $y$
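Algorithm of Q-Routing
1. Set initial Q-values for each node.
2. Get the first packet from the packet queue of node $x$.
3. Choose the best neighbor node $\hat{y}$ and forward the packet to node $\hat{y}$ by $\hat{y} = \arg\min_y Q_x(y, d)$.
4. Get the estimated value $T_{\hat{y}}(d)$ from node $\hat{y}$.
5. Update $Q_x(\hat{y}, d)$.
6. Go to 2.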
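One routing step of this algorithm might look like the Python sketch below. The update formula on the slide was lost, so the rule used here, $Q_x(y,d) \leftarrow Q_x(y,d) + \eta\,[q_y + T_{xy} + T_y(d) - Q_x(y,d)]$, is the commonly published Q-routing update and should be read as an assumption. The learning rate `eta`, the nested-dictionary table layout, and the four-node example are hypothetical.

```python
# Q[x][y][d]: node x's estimated delivery time to destination d via neighbor y.

def best_neighbor(Q, x, d):
    """Neighbor of x with the smallest estimated delivery time to d."""
    return min(Q[x], key=lambda y: Q[x][y][d])

def remaining_time(Q, y, d):
    """T_y(d): y's own best estimate of the time remaining to reach d."""
    if y == d:
        return 0.0
    return min(Q[y][z][d] for z in Q[y])

def q_routing_update(Q, x, y, d, q_y, t_xy, eta=0.5):
    """Move Q_x(y, d) toward queuing time + transmission time + T_y(d)."""
    target = q_y + t_xy + remaining_time(Q, y, d)
    Q[x][y][d] += eta * (target - Q[x][y][d])
    return Q[x][y][d]

# Hypothetical network: A has neighbors B and C, both adjacent to D.
Q = {
    "A": {"B": {"D": 5.0}, "C": {"D": 3.0}},
    "B": {"D": {"D": 1.0}},
    "C": {"D": {"D": 1.0}},
}
y_hat = best_neighbor(Q, "A", "D")
new_estimate = q_routing_update(Q, "A", y_hat, "D", q_y=2.0, t_xy=1.0)
```

Only the entry for the neighbor that actually received the packet is updated, so estimates for unused routes stay stale until those routes are tried again.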
Dual Reinforcement Q-Routing
Network Model
Network Model (cont.)
Node Model
Routing Controller
Initialization / Termination Procedures. Initialization: initialize and/or register global variables; initialize the routing table. Termination: destroy the routing table; release memory.
Arrival Procedure. Data packet arrival: update the routing table; route the packet with control information, or destroy it if it has reached its destination. Control information packet arrival: update the routing table; destroy the packet.
Departure Procedure: set all fields of the packet; get a shortest route; send the packet according to the route.
