Description
This article was written by Steeve Huang. Reinforcement Learning (RL) refers to a kind of Machine Learning method in which the agent receives a delayed rewar…
Summary
- Reinforcement Learning (RL) refers to a kind of Machine Learning method in which the agent receives a delayed reward in the next time step to evaluate its previous action.
- Recently, as the algorithm evolves with the combination of Neural Networks, it is capable of solving more complex tasks, such as the pendulum problem.
- Q-value is similar to Value, except that it takes an extra parameter, the current action a. Qπ(s, a) refers to the long-term return of the current state s, taking action a under policyπ.
- If the transition probability is successfully learned, the agent will know how likely to enter a specific state given current state and action.