Balancing exploration and exploitation in incomplete Min/Max-sum inference for distributed constraint optimization

摘要

Distributed Constraint Optimization Problems (DCOPs) are NP-hard and therefore the number of studies that consider incomplete algorithms for solving them is growing. Specifically, the Max-sum algorithm has drawn attention in recent years and has been applied to a number of realistic applications. Unfortunately, in many cases Max-sum does not produce high-quality solutions. More specifically, Max-sum does not converge and explores solutions of low quality when run on problems whose constraint graph representation contains multiple cycles of different sizes. In this paper we advance the state-of-the-art in incomplete algorithms for DCOPs by: (1) proposing a version of the Max-sum algorithm that operates on an alternating directed acyclic graph (Max-sum_AD), which guarantees convergence in linear time; (2) solving a major weakness of Max-sum and Max-sum_AD that causes inconsistent costs/utilities to be propagated and affect the assignment selection, by introducing value propagation to Max-sum_AD (Max-sum_ADVP); and (3) proposing exploration heuristic methods that evidently improve the algorithms performance further. We prove that Max-sum_ADVP converges to monotonically improving states after each change of direction, and that it is guaranteed to converge in pseudo-polynomial time to a stable solution that does not change with further changes of direction. Our empirical study reveals a large improvement in the quality of the solutions produced by Max-sum_ADVP on various benchmarks, compared to the solutions produced by the standard Max-sum algorithm, Bounded Max-sum and Max-sum_AD with no value propagation. It is found to be the best guaranteed convergence inference algorithm for DCOPs. The exploration methods we propose for Max-sum_ADVP improve its performance further. However, anytime results demonstrate that their exploration level is not as efficient as a version of Max-sum, which uses Damping.