Sampling diversity driven exploration with state difference guidance
Authors:
Highlights:
• We design a novel intrinsic reward for exploration.
• The intrinsic reward has both local and global components.
• We propose a new framework to combine intrinsic and extrinsic rewards.
• The exploration method SDD can explore efficiently in a variety of environments.
Keywords: Reinforcement learning, Exploration, Intrinsic rewards, Off-policy, Actor–critic algorithm
Review history: Received 2 January 2022, Revised 2 April 2022, Accepted 25 April 2022, Available online 6 May 2022, Version of Record 13 May 2022.
Official paper URL: https://doi.org/10.1016/j.eswa.2022.117418