Machine Learning & Training

What Is a Policy Gradient?

Policy gradient methods are a class of reinforcement learning algorithms that optimize the agent's policy directly rather than estimating value functions first. They adjust the policy's parameters in the direction that increases expected reward. These methods are well suited to problems with continuous or high-dimensional action spaces.

What Is a Policy Gradient?

Related topics

Further reading