Type Extension Trees for feature construction and learning in relational domains

Authors:

Abstract

Type Extension Trees (TETs) are a powerful representation language for “count-of-count” features characterizing the combinatorial structure of neighborhoods of entities in relational domains. In this paper we present a learning algorithm for TETs that discovers informative count-of-count features in the supervised learning setting. Experiments on bibliographic data show that TET-learning is able to discover the count-of-count feature underlying the definition of the h-index, and the inverse document frequency feature commonly used in information retrieval. We also introduce a metric on TET feature values, defined as a recursive application of the Wasserstein–Kantorovich metric. Experiments with a k-NN classifier show that exploiting the recursive count-of-count statistics encoded in TET values improves classification accuracy over alternative methods based on simple count statistics.
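To make the notion of a count-of-count feature concrete, the following minimal sketch computes the h-index from toy bibliographic data. It is an illustration of the feature only, not the TET representation or the learning algorithm from the paper; the function name `h_index`, the variable `author_papers`, and the data are hypothetical. The inner count is the number of citations per paper, and the outer count asks how many papers reach a given citation threshold.

```python
from typing import Dict, List


def h_index(citation_counts: List[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical toy data for one author: paper id -> ids of citing papers.
author_papers: Dict[str, List[str]] = {
    "p1": ["c1", "c2", "c3", "c4"],
    "p2": ["c1", "c5", "c6"],
    "p3": ["c2", "c7", "c8"],
    "p4": ["c9"],
    "p5": [],
}

# Inner count: citations per paper. Outer count: papers meeting the threshold.
inner_counts = [len(citing) for citing in author_papers.values()]
print(h_index(inner_counts))
```

A flat statistic such as the total number of citations (11 for this toy author) cannot separate an author with one heavily cited paper from one with many moderately cited papers; the nested count can.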

Keywords: Statistical relational learning, Inductive logic programming, Feature discovery

Review history: Received 23 October 2012, Revised 31 July 2013, Accepted 8 August 2013, Available online 20 August 2013.

Official paper URL: https://doi.org/10.1016/j.artint.2013.08.002