Soft constraints for pattern mining
作者:Willy Ugarte, Patrice Boizumault, Samir Loudni, Bruno Crémilleux, Alban Lepailleur
摘要
Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc). In practice, many constraints require threshold values whose choice is often arbitrary. This difficulty is even harder when several thresholds are required and have to be combined. Moreover, patterns barely missing a threshold will not be extracted even if they may be relevant. The paper advocates the introduction of softness into the pattern discovery process. By using Constraint Programming, we propose efficient methods to relax threshold constraints as well as constraints involved in patterns such as the top-k patterns and the skypatterns. We show the relevance and the efficiency of our approach through a case study in chemoinformatics for discovering toxicophores.
论文关键词:Constraint-based pattern mining, Soft constraints, Soft skypatterns, Constraint Programming, Disjonctive relaxation, Chemoinformatics
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10844-013-0281-4