A new semi-supervised hierarchical active clustering based on ranking constraints for analysts groupization
Authors: Eya Ben Ahmed, Ahlem Nabli, Faiez Gargouri
Abstract
Groupization enriches an individual's preferences with data from similar individuals, and may adapt query results more efficiently to user expectations. In this paper, we aim to optimally identify groups of analysts in a data warehouse. To that end, we study the similarity between the queries selected in the analytical history. To enhance the quality of the derived groups of analysts, we introduce a new method of semi-supervised hierarchical clustering under ranked constraints, which handles cases where some constraints are more important than others and must be enforced first during the groupization process. Four axes for group identification are distinguished: (i) the function exerted, (ii) the responsibilities granted to accomplish goals, (iii) the source of group identification, and (iv) the dynamicity of the discovered groups. Experiments carried out on real log files used for decision-maker groupization in a data warehouse confirm the soundness of our approach. Our findings demonstrate that groupization improves upon personalization for several group types, mainly for function-based groupization and explicitly identified groups.
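To make the idea of ranked constraints concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each analyst's history is a set of query identifiers, uses Jaccard similarity with average linkage, and represents constraints as hypothetical `(rank, kind, a, b)` tuples where a lower rank means higher priority. The names `jaccard`, `group_analysts`, and the sample data are illustrative assumptions, and the active (interactive) part of the method is not modelled.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of query identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_analysts(histories, constraints, n_groups):
    """Agglomerative grouping under ranked pairwise constraints (sketch).

    histories:   dict analyst -> set of query ids (assumed input format)
    constraints: list of (rank, kind, a, b), kind is "must" or "cannot";
                 a lower rank is enforced earlier (assumed convention)
    n_groups:    stop merging once this many groups remain
    """
    groups = [{a} for a in histories]  # start from singleton groups
    cannot = set()                     # cannot-link pairs enforced so far

    def find(a):
        return next(g for g in groups if a in g)

    def violates(g1, g2):
        return any(frozenset((x, y)) in cannot for x in g1 for y in g2)

    # Step 1: enforce constraints in rank order. A constraint that
    # contradicts an already enforced, higher-priority one is dropped.
    for _, kind, a, b in sorted(constraints, key=lambda c: c[0]):
        ga, gb = find(a), find(b)
        if ga is gb:
            continue  # already together: nothing to merge or forbid
        if kind == "cannot":
            cannot.add(frozenset((a, b)))
        elif not violates(ga, gb):
            groups.remove(ga)
            groups.remove(gb)
            groups.append(ga | gb)

    def linkage(g1, g2):
        # Average-linkage similarity between two groups.
        sims = [jaccard(histories[x], histories[y]) for x in g1 for y in g2]
        return sum(sims) / len(sims)

    # Step 2: merge the most similar pair that breaks no cannot-link.
    while len(groups) > n_groups:
        candidates = [
            (linkage(g1, g2), i, j)
            for (i, g1), (j, g2) in combinations(enumerate(groups), 2)
            if not violates(g1, g2)
        ]
        if not candidates:
            break  # every remaining merge is forbidden
        _, i, j = max(candidates)
        merged = groups[i] | groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return groups


# Hypothetical query histories and constraints, for illustration only.
histories = {
    "ana": {"q1", "q2", "q3"},
    "bob": {"q1", "q2"},
    "cy": {"q7", "q8"},
    "dee": {"q7", "q9"},
    "eli": {"q1", "q3"},
}
constraints = [
    (1, "cannot", "ana", "bob"),  # highest priority
    (2, "must", "ana", "bob"),    # contradicts rank 1, so it is dropped
    (3, "must", "cy", "dee"),
]
result = group_analysts(histories, constraints, n_groups=3)
print(sorted(sorted(g) for g in result))
```

In this toy run the rank-1 cannot-link keeps "ana" and "bob" apart even though a lower-priority must-link asks for them to be grouped, which is the kind of conflict that ranking is meant to resolve.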
Keywords: Personalization, Groupization, Semi-supervised hierarchical clustering, Constraint, Ranked constraints, OLAP log files, Data warehouse
Official paper URL: https://doi.org/10.1007/s10489-012-0407-3