Three-dimensional Entity Resolution with JedAI
Authors:
Highlights:
• JedAI is an open-source system that composes state-of-the-art individual methods into millions of end-to-end workflows for Entity Resolution.
• JedAI supports both batch and progressive Entity Resolution.
• JedAI supports both blocking-based and join-based Entity Resolution.
• JedAI supports both serialized and massively parallel execution (on top of Apache Spark).
• JedAI achieves effectiveness comparable to state-of-the-art (supervised) ER tools at a significantly lower running time.
Keywords: Entity Resolution, Blocking, Matching, Clustering, Batch methods, Progressive methods, Massive parallelization
Review history: Received 6 May 2020, Revised 22 May 2020, Accepted 23 May 2020, Available online 27 May 2020, Version of Record 28 May 2020.
Official paper URL: https://doi.org/10.1016/j.is.2020.101565