APFA: Automated product feature alignment for duplicate detection
Authors:
Highlights:
• We propose an automated pre-processing algorithm for product duplicate detection.
• The pre-processing phase employs new key metrics for feature alignment.
• A novel brand analyzer and title analyzer are presented.
• The proposed methods are tested using data from four real-world Web shops.
• The pre-processing phase significantly improves both effectiveness and speed.
Keywords: Duplicate detection, Automated pre-processing, Product comparison, E-commerce
Review history: Received 20 July 2020, Revised 7 January 2021, Accepted 17 February 2021, Available online 26 February 2021, Version of Record 11 March 2021.
Paper URL: https://doi.org/10.1016/j.eswa.2021.114759