A reinforcement learning-Variable neighborhood search method for the capacitated Vehicle Routing Problem
Authors:
Highlights:
• An online-learning, multi-agent hyper-heuristic framework applied to the CVRP.
• Five UCB algorithms are analyzed to find the optimum learning scheme.
• Introduction of a hybrid UCB method by combining different UCB algorithmic elements.
• Integration of a concept drift detector to auto-correct the learning procedure.
Keywords: Reinforcement Learning, Multi-Armed Bandits, Intelligent Optimization, Bandit Learning, Metaheuristics, Variable Neighborhood Search, Vehicle Routing Problem
Review history: Received 24 October 2021, Revised 12 August 2022, Accepted 8 September 2022, Available online 17 September 2022, Version of Record 30 September 2022.
Official paper URL: https://doi.org/10.1016/j.eswa.2022.118812