pytorch/examples
Actor critic example not using discount rate properly
Open
#744 opened on Mar 27, 2020
good first issue, triaged
Description
The Actor Critic example (which is actually an implementation of REINFORCE-with-baseline, as pointed out in https://github.com/pytorch/examples/issues/573) does not use the discount rate properly.
The loss should include \gamma ^ t, as shown in the box on page 330 of Sutton & Barto:
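For reference, and assuming the box in question is the "REINFORCE with Baseline (episodic)" algorithm from the second edition, the policy-parameter update there has roughly this form (written from memory, so please check the notation against the book):

$$
\delta \leftarrow G - \hat{v}(S_t, \mathbf{w}), \qquad
\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} + \alpha^{\boldsymbol{\theta}} \, \gamma^{t} \, \delta \, \nabla \ln \pi(A_t \mid S_t, \boldsymbol{\theta})
$$

Here $G$ is the discounted return from step $t$ onward. Read as a loss to minimize, the per-step policy term would be $-\gamma^{t} \, \delta \, \ln \pi(A_t \mid S_t, \boldsymbol{\theta})$, so the $\gamma^{t}$ factor multiplies each step's contribution in addition to the discounting already inside $G$.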

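To make the proposed change concrete, here is a minimal sketch in plain Python of how the per-step policy loss could carry the extra `gamma ** t` weight. It is not the code from the example: the function names `discounted_returns` and `policy_loss_terms` are hypothetical, plain floats stand in for tensors, and `rewards[t]` is assumed to be the reward received after the action taken at step `t`.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = sum over k >= t of gamma**(k - t) * rewards[k], for every t."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def policy_loss_terms(log_probs, values, rewards, gamma):
    """Per-step policy loss terms: -gamma**t * (G_t - v_t) * log_prob_t.

    log_probs[t] is log pi(A_t | S_t) and values[t] is the baseline estimate
    for S_t. The advantage is treated as a constant here (no gradient flows
    through the baseline), which is an assumption of this sketch.
    """
    returns = discounted_returns(rewards, gamma)
    terms = []
    discount = 1.0  # holds gamma**t for the current step
    for log_prob, value, g in zip(log_probs, values, returns):
        advantage = g - value
        terms.append(-discount * advantage * log_prob)
        discount *= gamma
    return terms
```

With tensors, the same weighting would amount to multiplying each step's policy loss by `gamma ** t` before summing. Without that factor, every step is weighted equally regardless of how late in the episode it occurs.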