NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
This work proposes a neurally plausible approach to reinforcement learning in partially observed MDPs based on distributional successor features. The approach allows for efficient value function computation, as demonstrated empirically. The three expert reviewers were unanimous that this paper should be accepted, and I see no reason to contradict their opinions.