Machine Learning & Training

What Is a Policy Gradient?

Policy gradient methods are a class of reinforcement learning algorithms that optimize the agent's policy directly rather than estimating value functions first. They adjust the policy's parameters in the direction that increases expected reward. These methods are well suited to problems with continuous or high-dimensional action spaces.

Further reading

Read more about policy gradient — articles and blogs from around the web: