Vector-based similarity measurements for historical figures

Authors:

Highlights:

• We use labeled features from Wikipedia to generate effective evaluation standards.

• The best approach, which uses Deepwalk, exploits the graph structure of words.

• We provide an interactive demo at http://peoplesimilarity.appspot.com/.

• We identify the best distance function for each individual model.

• We try combining models to balance graph structure and semantics.

• We identify the most salient categories associated with Wikipedia entities.

• We collect human responses from Crowdflower for verification.

• We also built an iOS game app (FameMatch, available on iTunes) for testing.

• Our ranking of Wikipedia categories agrees with 88.27% of human judgments.

Keywords: Vector representations, People similarity, Deepwalk

Review history: Received 1 December 2015, Revised 25 April 2016, Accepted 1 July 2016, Available online 13 July 2016, Version of Record 20 December 2016.

Official paper URL: https://doi.org/10.1016/j.is.2016.07.001