Heuristics for planning with penalties and rewards formulated in logic and computed through circuits
Abstract
The automatic derivation of heuristic functions for guiding the search for plans is a fundamental technique in planning. The heuristics considered so far, however, deal only with simple planning models where costs are associated with actions but not with states. In this work we address this limitation by formulating a more expressive planning model and a corresponding heuristic where preferences in the form of penalties and rewards are associated with fluents as well. The heuristic, which generalizes the well-known delete-relaxation heuristic, is admissible and informative but intractable. Exploiting a correspondence between heuristics and preferred models, and a property of formulas compiled in d-DNNF, we show, however, that if a suitable relaxation of the domain, expressed as the strong completion of a logic program with no time indices or horizon, is compiled into d-DNNF, the heuristic can be computed for any search state in time that is linear in the size of the compiled representation. This representation defines an evaluation network or circuit that maps states into heuristic values in linear time. While this circuit may have exponential size in the worst case, as for OBDDs, this is not necessarily so. We report empirical results, discuss the application of the framework in settings where there are no goals but just preferences, and illustrate the versatility of the account by developing a new heuristic that overcomes limitations of delete-based relaxations through the use of valid but implicit plan constraints. In particular, for the Traveling Salesman Problem, the new heuristic captures the exact cost, while the delete-relaxation heuristic, which is also exponential in the worst case, captures only the Minimum Spanning Tree lower bound.
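The linear-time evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `Lit`, `And`, `Or`, the function `min_cost`, and the toy circuit are all hypothetical, and the sketch assumes the circuit is already given in a decomposable and smooth negation normal form, with literal costs that may be positive (penalties) or negative (rewards). Under those assumptions, the cost of the best model consistent with a state is obtained in one bottom-up pass: sum at AND nodes, minimum at OR nodes.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Lit:
    var: str
    positive: bool


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Lit, And, Or]


def min_cost(root: Node,
             cost: Dict[Tuple[str, bool], float],
             fixed: Dict[str, bool]) -> float:
    """Cost of the cheapest model of the circuit that agrees with `fixed`.

    Assumes (hypothetically) a decomposable, smooth NNF circuit.
    Literals missing from `cost` are treated as having cost 0.
    """
    cache: Dict[int, float] = {}

    def visit(node: Node) -> float:
        key = id(node)
        if key in cache:          # shared subcircuits are evaluated once
            return cache[key]
        if isinstance(node, Lit):
            if node.var in fixed and fixed[node.var] != node.positive:
                value = math.inf  # literal contradicts the given state
            else:
                value = cost.get((node.var, node.positive), 0.0)
        elif isinstance(node, And):
            value = sum(visit(c) for c in node.children)
        else:
            value = min(visit(c) for c in node.children)
        cache[key] = value
        return value

    return visit(root)


# Toy circuit for "exactly one of x, y": (x AND not y) OR (not x AND y).
circuit = Or((
    And((Lit("x", True), Lit("y", False))),
    And((Lit("x", False), Lit("y", True))),
))
# Hypothetical penalties: making x true costs 3, making y true costs 1.
cost = {("x", True): 3.0, ("y", True): 1.0}

best_free = min_cost(circuit, cost, {})
best_given_x = min_cost(circuit, cost, {"x": True})
```

Because each node is visited once and its value cached, the pass is linear in the size of the circuit, which mirrors the complexity claim in the abstract. Fixing more variables in `fixed` only changes the values at the leaves, so the same circuit can be reused for every search state.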
Keywords: Planning, Planning heuristics, Planning with rewards, Knowledge compilation
Article history: Received 17 August 2007, Revised 2 March 2008, Accepted 6 March 2008, Available online 29 March 2008.
Paper URL: https://doi.org/10.1016/j.artint.2008.03.004