Image captioning via proximal policy optimization

Authors:

Highlights:

• Proximal policy optimization is capable of enforcing trust-region constraints.

• Performance decreases when combining dropout with proximal policy optimization.

• Word-level, rather than sentence-level, baselines are preferred in CIDEr optimization.
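As a rough illustration of the first and third highlights, the sketch below computes the standard clipped surrogate loss of proximal policy optimization with one advantage per generated word. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the `eps=0.2` clipping range, and the way per-word baseline values are supplied are all hypothetical.

```python
import math


def word_level_advantages(reward, baselines):
    """Advantage per word: sentence reward minus that word's baseline.

    `reward` stands in for a sentence-level score such as CIDEr, and
    `baselines` holds one baseline value per word position. How those
    baselines are estimated is an assumption left open here.
    """
    return [reward - b for b in baselines]


def ppo_clipped_loss(new_logp, old_logp, advantages, eps=0.2):
    """Negative mean of the PPO clipped surrogate over word positions.

    The probability ratio between the new and old policy is clipped to
    [1 - eps, 1 + eps], which discourages updates that move the policy
    far from the one that sampled the caption.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# Hypothetical two-word caption with sentence reward 1.0.
advs = word_level_advantages(1.0, [0.5, 0.75])
loss_same_policy = ppo_clipped_loss([-1.0, -2.0], [-1.0, -2.0], advs)
loss_big_step = ppo_clipped_loss([math.log(2.0)], [0.0], [1.0])
```

A sentence-level baseline would correspond to passing the same baseline value for every word position; the word-level variant lets each position carry its own value.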

Keywords: Image captioning, Reinforcement learning, Proximal policy optimization

Review history: Received 14 November 2020, Revised 31 January 2021, Accepted 2 February 2021, Available online 8 February 2021, Version of Record 16 February 2021.

Official paper URL: https://doi.org/10.1016/j.imavis.2021.104126