Clustering experience replay for the effective exploitation in reinforcement learning

Authors:

Highlights:

• The limited exploitation efficiency of existing reinforcement learning methods is analyzed in detail.

• Clustering is integrated into experience replay through a divide-and-conquer framework to improve exploitation efficiency.

• Our experience replay sufficiently replays all kinds of transitions in the current training at a low time cost.

• A new reinforcement learning method is proposed to implement our experience replay.
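The highlights above describe the idea only at a high level, so the following is a minimal sketch of what a clustered replay buffer could look like, not the authors' method. It assumes transitions are grouped by nearest running-mean centroid of the state vector and that sampling cycles evenly over the non-empty clusters, so that rare kinds of transitions are still replayed. The class name `ClusteredReplayBuffer`, its parameters, and the clustering rule are all hypothetical; the paper's actual clustering algorithm and its time-division scheme are not specified in this summary.

```python
import random
from collections import deque


class ClusteredReplayBuffer:
    """Illustrative only: per-cluster FIFO buffers with balanced sampling.

    Assumption: states are fixed-length sequences of floats and are grouped
    by nearest centroid. This is NOT the algorithm from the paper.
    """

    def __init__(self, num_clusters, capacity_per_cluster, seed=0):
        self.rng = random.Random(seed)
        self.num_clusters = num_clusters
        self.capacity = capacity_per_cluster
        self.centroids = []  # running-mean state vector of each cluster
        self.counts = []     # number of states assigned to each cluster
        self.buffers = []    # one bounded FIFO buffer per cluster

    @staticmethod
    def _sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def _assign(self, state):
        # Until every cluster is seeded, each new state starts its own cluster.
        if len(self.centroids) < self.num_clusters:
            self.centroids.append(list(state))
            self.counts.append(0)
            self.buffers.append(deque(maxlen=self.capacity))
            return len(self.centroids) - 1
        # Afterwards, pick the cluster whose centroid is closest to the state.
        return min(
            range(self.num_clusters),
            key=lambda k: self._sq_dist(state, self.centroids[k]),
        )

    def add(self, state, action, reward, next_state, done):
        """Store a transition and return the index of the cluster it joined."""
        k = self._assign(state)
        self.counts[k] += 1
        step = 1.0 / self.counts[k]  # incremental mean update of the centroid
        self.centroids[k] = [
            c + step * (s - c) for c, s in zip(self.centroids[k], state)
        ]
        self.buffers[k].append((state, action, reward, next_state, done))
        return k

    def sample(self, batch_size):
        """Draw a batch by cycling round-robin over the non-empty clusters."""
        non_empty = [b for b in self.buffers if b]
        if not non_empty:
            raise ValueError("buffer is empty")
        return [
            self.rng.choice(non_empty[i % len(non_empty)])
            for i in range(batch_size)
        ]
```

With this design, a cluster holding a single rare transition contributes as many samples per batch as a cluster holding thousands of common ones, which is one simple way to read "replay all kinds of transitions". Uniform sampling over a single buffer would instead draw the rare transition with probability proportional to its share of the buffer.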


Keywords: Reinforcement learning, Clustering, Experience replay, Exploitation efficiency, Time division

Review history: Received 7 May 2021, Revised 18 May 2022, Accepted 26 June 2022, Available online 27 June 2022, Version of Record 3 July 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108875