NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
This work is an interesting contribution to deep RL that considers using Anderson acceleration to improve off-policy TD-based algorithms. The approach is supported by some theoretical analysis as well as experiments on standard benchmark problems. Overall, the reviewers like the paper and agree that it should be accepted.
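For context on the technique the reviewers refer to, below is a minimal sketch of plain Anderson acceleration applied to a generic fixed-point iteration, illustrated here on a toy expected-TD (Bellman evaluation) operator. This is not the paper's algorithm; the map `g`, the memory size, and the regularization constant are illustrative assumptions only.

```python
import numpy as np

def anderson_step(G_hist, F_hist, reg=1e-8):
    """One Anderson extrapolation step (illustrative sketch).

    G_hist: recent g(x_i) vectors (most recent last)
    F_hist: matching residuals f_i = g(x_i) - x_i
    Returns sum_i alpha_i * g(x_i), where alpha minimizes
    ||sum_i alpha_i f_i||^2 subject to sum_i alpha_i = 1.
    """
    F = np.stack(F_hist, axis=1)                 # residual matrix, shape (n, m)
    A = F.T @ F + reg * np.eye(F.shape[1])       # regularized Gram matrix
    z = np.linalg.solve(A, np.ones(F.shape[1]))  # solve for unnormalized weights
    alpha = z / z.sum()                          # enforce the sum-to-one constraint
    return np.stack(G_hist, axis=1) @ alpha

def solve_fixed_point(g, x0, memory=5, iters=200, tol=1e-10):
    """Anderson-accelerated fixed-point iteration for x = g(x)."""
    x = x0
    G_hist, F_hist = [], []
    for _ in range(iters):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            break
        G_hist.append(gx)
        F_hist.append(f)
        if len(G_hist) > memory:                 # keep only the last `memory` iterates
            G_hist.pop(0)
            F_hist.pop(0)
        x = anderson_step(G_hist, F_hist)
    return x

# Toy example: expected-TD / Bellman evaluation fixed point V = r + gamma * P @ V
rng = np.random.default_rng(0)
P = rng.random((20, 20))
P /= P.sum(axis=1, keepdims=True)                # row-stochastic transition matrix
r = rng.random(20)
gamma = 0.99
V = solve_fixed_point(lambda v: r + gamma * P @ v, np.zeros(20))
print(np.max(np.abs(V - np.linalg.solve(np.eye(20) - gamma * P, r))))
```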