Image captioning via proximal policy optimization
Authors:
Highlights:
• Proximal policy optimization is capable of enforcing trust-region constraints.
• Performance decreases when combining dropout with proximal policy optimization.
• Word-level, rather than sentence-level, baselines are preferred in CIDEr optimization.
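The highlights refer to two mechanisms: the clipping that proximal policy optimization uses to keep an updated policy close to the old one, and per-word baselines. A minimal sketch of both, assuming the standard clipped surrogate objective from the general PPO literature rather than the exact formulation of this paper, could look as follows. The function name, argument names, and the clipping value of 0.2 are illustrative assumptions, not taken from the source.

```python
import math


def ppo_clipped_loss(new_logps, old_logps, rewards, baselines, clip_eps=0.2):
    """Clipped PPO surrogate loss, averaged over the words of one caption.

    All arguments are hypothetical per-word lists:
      new_logps / old_logps: log-probabilities of each sampled word under
        the current and the old (sampling) policy.
      rewards: reward credited to each word (for example, the sentence-level
        CIDEr score repeated for every word).
      baselines: one baseline per word (word-level). A sentence-level
        baseline would instead repeat a single value for every word.
    """
    total = 0.0
    for new_lp, old_lp, reward, baseline in zip(
        new_logps, old_logps, rewards, baselines
    ):
        advantage = reward - baseline
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Clipping removes the incentive to move the ratio outside
        # [1 - clip_eps, 1 + clip_eps], which approximates a trust region.
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * advantage, clipped * advantage)
    # Negate because optimizers minimize, while the surrogate is maximized.
    return -total / len(new_logps)
```

When the current and old policies agree, the ratio is 1 for every word and the loss reduces to the negative mean advantage, so the only difference between a word-level and a sentence-level baseline in this sketch is whether `baselines` varies across positions.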
Keywords: Image captioning, Reinforcement learning, Proximal policy optimization
Review history: Received 14 November 2020, Revised 31 January 2021, Accepted 2 February 2021, Available online 8 February 2021, Version of Record 16 February 2021.
Paper URL: https://doi.org/10.1016/j.imavis.2021.104126