Sampling diversity driven exploration with state difference guidance

Authors:

Highlights:

• We design a novel intrinsic reward for exploration.

• The intrinsic reward has both local and global components.

• We propose a new framework to combine intrinsic and extrinsic rewards.

• The exploration method SDD can explore efficiently in a variety of environments.

Keywords: Reinforcement learning, Exploration, Intrinsic rewards, Off-policy, Actor–critic algorithm

Article history: Received 2 January 2022, Revised 2 April 2022, Accepted 25 April 2022, Available online 6 May 2022, Version of Record 13 May 2022.

Paper URL: https://doi.org/10.1016/j.eswa.2022.117418