Policy Gradient Algorithm

By Medium - 2021-03-20

Description

Reinforcement Learning - Policy Gradient REINFORCE algorithm with Baseline, concepts and Python implementation in Tensorflow2 to play CartPole

Summary

  • Policy Gradient REINFORCE Algorithm with Baseline Algorithm and Implementation in Tensorflow Policy gradient methods are very popular reinforcement learning(RL) algorithms.
  • To do so, we search for the maxima in V(θ) by ascending the gradient of the policy, w.r.t parameters θ.
  • In fact, the value function itself is a good candidate for baseline.
  • Let’s run the code and render a video once training is done.

 

Topics

  1. Machine_Learning (0.31)
  2. Stock (0.12)
  3. Mobile (0.06)

Similar Articles

Q-Learning Algorithm: From Explanation to Implementation

By Medium - 2020-12-13

In my today’s medium post, I will teach you how to implement the Q-Learning algorithm. But before that, I will first explain the idea behind Q-Learning and its limitation. Please be sure to have some…