APFA: Automated product feature alignment for duplicate detection

Authors:

Highlights:

• We propose an automated pre-processing algorithm for product duplicate detection.

• The pre-processing phase employs new key metrics for feature alignment.

• A novel brand analyzer and title analyzer are presented.

• The proposed methods are tested using data from four real-world Web shops.

• The pre-processing phase significantly improves both effectiveness and speed.


Keywords: Duplicate detection, Automated pre-processing, Product comparison, E-commerce

Review history: Received 20 July 2020, Revised 7 January 2021, Accepted 17 February 2021, Available online 26 February 2021, Version of Record 11 March 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.114759