A novel multi-step Q-learning method to improve data efficiency for deep reinforcement learning
Authors:
Highlights:
• A novel multi-step Q-learning method is proposed to improve data efficiency for DRL.
• The proposed multi-step Q-learning method is derived by adopting a new return function.
• The new return function alters the discount of future rewards and loosens the impact of the immediate reward.
• Experimental results show that the proposed method can improve the data efficiency of DRL agents.
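For context, the highlights build on multi-step Q-learning. The sketch below computes the standard n-step Q-learning target, that is, the discounted sum of n rewards plus a discounted bootstrap value. It is the conventional baseline, not the new return function proposed in the paper, whose exact form is not given in this summary. The function name `n_step_target` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float], bootstrap_q: float, gamma: float) -> float:
    """Standard n-step Q-learning target (baseline, not the paper's new return).

    Computes r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * bootstrap_q,
    where bootstrap_q stands for max_a Q(s_n, a) evaluated at the n-th next state.
    """
    target = bootstrap_q
    # Fold the rewards in from last to first so each step applies one discount.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


if __name__ == "__main__":
    # Three unit rewards, a bootstrap value of 10, and a discount of 0.5.
    print(n_step_target([1.0, 1.0, 1.0], bootstrap_q=10.0, gamma=0.5))
```

With n = 1 this reduces to the usual one-step target r + gamma * max_a Q(s', a). According to the highlights, the proposed method changes how future rewards are discounted and how strongly the immediate reward counts, so it would replace the uniform gamma weighting used here.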
Keywords: Deep reinforcement learning, Robotics, Multi-step methods, Data efficiency
Review history: Received 5 November 2018, Revised 7 March 2019, Accepted 18 March 2019, Available online 21 March 2019, Version of Record 26 April 2019.
Official paper URL: https://doi.org/10.1016/j.knosys.2019.03.018