Reinforcement Learning with OpenAI Gym - Value Iteration Frozen Lake - Code Heroku

Please turn off your webcam
If you are joining from a mobile phone,
be sure to click on
Join via Device Audio
We are waiting for other participants to join
We will begin at 4:30 PM IST

Mihir Thakkar
Founder and Instructor
hello@codeheroku.com
Reinforcement Learning with
OpenAI Gym

SESSION OBJECTIVES
• Quick Recap
• Bellman’s Equations
• Value Iteration in OpenAI Gym

www.codeheroku.com Introduction to Machine Learning – Reinforcement Learning

RL Problem

Q Function

Quiz
Given the following Reward Table, estimate the value of Q(A3, East)

Quiz
Given the following Reward Table, estimate the value of Q(B3, North)

Given a Value Function
Extract Policy
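
A minimal sketch of policy extraction, assuming a known transition model: for each state, pick the action with the highest one-step lookahead value under the given value function. The three-state model `P`, the value table `V`, and the discount `GAMMA` are hypothetical illustration data, not taken from the slides; the table layout mirrors the (probability, next state, reward, done) tuples used by Gym's FrozenLake.

```python
# Hypothetical toy model: P[state][action] -> list of
# (probability, next_state, reward, done) tuples.
GAMMA = 0.9  # assumed discount factor

P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},  # terminal state
}


def q_value(P, V, state, action, gamma=GAMMA):
    """Expected return of taking `action` in `state`, then following V."""
    total = 0.0
    for prob, next_state, reward, done in P[state][action]:
        future = 0.0 if done else gamma * V[next_state]
        total += prob * (reward + future)
    return total


def extract_policy(P, V, gamma=GAMMA):
    """Greedy policy: the best action per state under the value function V."""
    return {
        s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
        for s in P
    }


# Assumed value function for the toy model above.
V = {0: 0.9, 1: 1.0, 2: 0.0}
policy = extract_policy(P, V)
print(policy)
```

Ties are broken in favour of the first action listed, which is an arbitrary choice made for this sketch.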

Value Iteration
OpenAI Gym
https://drive.google.com/file/d/16xMyG7bKrtT_6SId1kqLpR2vL1Km_us8/view?usp=sharing
https://github.com/codeheroku/Introduction-to-Machine-Learning/tree/master/Reinforcement%20Learning/RL2%20Value%20Iteration

Value Iteration Algorithm
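
The algorithm can be sketched as repeated Bellman optimality backups until the values stop changing. The code below is a hedged illustration rather than the notebook linked above: the three-state chain `P`, the discount `gamma`, and the stopping threshold `theta` are assumptions chosen for the example. The table uses the same (probability, next state, reward, done) layout as Gym's FrozenLake transition table, which is usually reachable as `env.unwrapped.P`.

```python
def value_iteration(P, gamma=0.9, theta=1e-8, max_iters=10_000):
    """Return a dict of state values computed by in-place value iteration.

    P[state][action] is a list of (probability, next_state, reward, done).
    """
    V = {s: 0.0 for s in P}
    for _ in range(max_iters):
        delta = 0.0
        for s in P:
            # Bellman optimality backup: best expected one-step return.
            best = max(
                sum(
                    prob * (reward + (0.0 if done else gamma * V[s2]))
                    for prob, s2, reward, done in P[s][a]
                )
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update, reused within the same sweep
        if delta < theta:
            break  # values changed by less than theta in a full sweep
    return V


# Hypothetical chain: state 2 is the goal; action 1 moves right, 0 moves left.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}

V = value_iteration(P)
print(V)
```

With a real Gym environment, the same function could be called on the environment's own transition table instead of the toy `P`, assuming that table is exposed in this format.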

Reinforcement Learning
Challenges
• Access to the Environment
• Delayed Reward (Temporal Credit Assignment)
• High Cost Actions
• The distribution of data changes with the choice of actions you take
• Efficient state representations?
• Good reward functions?

Markov Decision Process (MDP)

Multi Arm Bandit
• Unknown Reward Distribution
• Deterministic Actions
• Objective: Find the sequence of actions that will maximize total reward

Exploration Vs Exploitation
To approximate the values of actions, the agent must initially choose actions
that may be non-optimal.
Once an agent has approximated the values, it can greedily pick the
highest-value action.
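
One common way to balance the two is epsilon-greedy selection, shown here as an illustrative sketch on a multi-armed bandit; the slides do not name a specific strategy, so this choice is an assumption. The Bernoulli arm probabilities in `true_means`, the exploration rate `epsilon`, and the step count are all hypothetical values.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit; return pull counts and value estimates per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The reward distribution is unknown to the agent; it only sees samples.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Running average of the rewards observed for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


counts, estimates = run_epsilon_greedy([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(counts, estimates, best_arm)
```

A larger `epsilon` spends more pulls on exploration and gives better estimates for weak arms, at the cost of total reward.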

Iterative Averaging
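
Assuming this slide refers to the standard incremental mean update, an action's value estimate can be refreshed after each reward without storing past rewards: new mean = old mean + (new value - old mean) / n. The reward list below is hypothetical sample data.

```python
def update_mean(old_mean, n, new_value):
    """Mean of n values, given the mean of the first n - 1 and the n-th value."""
    return old_mean + (new_value - old_mean) / n


rewards = [1.0, 0.0, 0.0, 1.0, 1.0]  # hypothetical observed rewards

mean = 0.0
for n, r in enumerate(rewards, start=1):
    mean = update_mean(mean, n, r)

# The running result matches the ordinary batch average.
batch_mean = sum(rewards) / len(rewards)
print(mean, batch_mean)
```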
