Collective entity resolution in multi-relational familial networks
Authors: Pigi Kouki, Jay Pujara, Christopher Marcum, Laura Koehly, Lise Getoor
Abstract
Entity resolution in settings with rich relational structure often introduces complex dependencies between co-references. Exploiting these dependencies is challenging—it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided, from the perspective of different family members, and the challenge is to reconstruct a family tree from these multiple, noisy, partial views. This reconstruction is crucial for applications such as understanding genetic inheritance, tracking disease contagion, and performing census surveys. Here, we design a model that incorporates statistical signals (such as name similarity), relational information (such as sibling overlap), logical constraints (such as transitivity and bijective matching), and predictions from other algorithms (such as logistic regression and support vector machines), in a collective model. We show how to integrate these features using probabilistic soft logic, a scalable probabilistic programming framework. In experiments on real-world data, our model significantly outperforms state-of-the-art classifiers that use relational features but are incapable of collective reasoning.
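
As a rough illustration of how a logical constraint such as transitivity becomes a soft penalty in probabilistic soft logic, the sketch below evaluates one grounding of the rule Same(A,B) AND Same(B,C) -> Same(A,C) under the Lukasiewicz relaxation that PSL uses. The predicate name `same`, the person identifiers, and the truth values are hypothetical and are not taken from the paper. The paper's actual rules, weights, and inference procedure are not reproduced here.

```python
# Minimal sketch of one soft-logic rule grounding (not the authors' model).
# All identifiers and truth values below are hypothetical illustration data.

def l_and(*truths: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of truth values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))

def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the implication body -> head is from being satisfied.

    The rule is fully satisfied (distance 0) when head >= body.
    """
    return max(0.0, body - head)

# Hypothetical soft co-reference scores between mentions in three family trees.
same = {
    ("a1", "b1"): 0.9,
    ("b1", "c1"): 0.8,
    ("a1", "c1"): 0.3,
}

# Ground the transitivity rule: Same(a1,b1) AND Same(b1,c1) -> Same(a1,c1).
body = l_and(same[("a1", "b1")], same[("b1", "c1")])
penalty = distance_to_satisfaction(body, same[("a1", "c1")])

print(f"body truth = {body:.2f}, transitivity penalty = {penalty:.2f}")
```

In a collective model, penalties of this kind would be weighted and summed over all groundings, and inference would adjust the `same` values jointly to reduce the total, which is what lets a decision about one pair of mentions influence related pairs.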
Keywords: Entity resolution, Data integration, Familial networks, Multi-relational networks, Collective classification, Family reconstruction, Probabilistic soft logic
Paper URL: https://doi.org/10.1007/s10115-018-1246-2