Continuous-time reinforcement learning approach for portfolio management with time penalization

Authors:

Highlights:

• Proposes a new continuous-time RL algorithm for solving the portfolio problem.

• Considers an actor/critic reinforcement learning architecture.

• Provides a new solution characterized by transaction costs and time penalization.

• Employs a novel proximal optimization approach involving time penalization.

• Estimates the transition rate matrices and rewards.

Keywords: Portfolio, Reinforcement learning, Transaction costs, Continuous-time, Markov chains

Review history: Received 8 September 2018, Revised 21 March 2019, Accepted 30 March 2019, Available online 1 April 2019, Version of Record 3 April 2019.

Official paper URL: https://doi.org/10.1016/j.eswa.2019.03.055