FLOPPIES: A Framework for Large-Scale Ontology Population of Product Information from Tabular Data in E-commerce Stores

Authors:

Highlights:

• FLOPPIES enables (semi-)automatic ontology population of online product information.

• We use a product ontology that is compatible with the GoodRelations ontology.

• The average Information Gain is used to determine the correct product class.

• For the evaluation, we used a training and test set consisting of 1718 products.

• Our approach outperforms the baseline approach at all stages of the population process.
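To make the Information Gain highlight above more concrete, the following is a minimal sketch of how an average information gain score could be used to choose a product class. It is an illustration under stated assumptions, not the FLOPPIES implementation: the toy data in `TRAINING`, the property names, the function names, and the scoring rule (average the information gain of the properties that a candidate class shares with the input table) are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: (product class, set of property names seen in its table).
TRAINING = [
    ("TV", {"Screen Size", "Brand", "HDMI Inputs"}),
    ("TV", {"Screen Size", "Brand", "Refresh Rate"}),
    ("MP3 Player", {"Storage Capacity", "Brand", "Battery Life"}),
    ("MP3 Player", {"Storage Capacity", "Brand"}),
]


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(products, prop):
    """Reduction in class entropy from knowing whether `prop` is present."""
    labels = [cls for cls, _ in products]
    with_prop = [cls for cls, props in products if prop in props]
    without_prop = [cls for cls, props in products if prop not in props]
    remainder = 0.0
    for subset in (with_prop, without_prop):
        if subset:  # skip empty splits to avoid dividing by zero
            remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def classify(products, observed_props):
    """Return the class whose matched properties have the highest average gain.

    Assumed scoring rule: for each class, take the properties it shares with
    the input table and average their information gain (0.0 if none match).
    """
    class_props = {}
    for cls, props in products:
        class_props.setdefault(cls, set()).update(props)

    def score(cls):
        matched = class_props[cls] & set(observed_props)
        if not matched:
            return 0.0
        return sum(information_gain(products, p) for p in matched) / len(matched)

    return max(class_props, key=score)


if __name__ == "__main__":
    print(classify(TRAINING, {"Screen Size", "Brand"}))
    print(classify(TRAINING, {"Storage Capacity", "Brand"}))
```

In this toy setup, a property that every product has (such as "Brand") carries no information about the class, while a property that appears in only one class separates the classes well, so averaging over matched properties favors classes that share discriminative properties with the input.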

Abstract

With the vast amount of information available on the Web, there is an urgent need to structure Web data to make it available to both users and machines. E-commerce is one of the areas in which growing data congestion on the Web impedes data accessibility. This paper proposes FLOPPIES, a framework capable of semi-automatic ontology population of tabular product information from Web stores. By formalizing product information in an ontology, better product comparison or parametric search applications can be built, using the semantics of product attributes and their corresponding values. The framework employs both lexical and pattern matching for classifying products, mapping properties, and instantiating values. The performance on instantiating TVs and MP3 players from Best Buy and Newegg.com is promising, with an F1-measure of approximately 77%.

Keywords: Product, Ontology, Population, Instantiation, e-commerce, Semantic Web

Article history: Received 19 December 2012, Revised 16 December 2013, Accepted 6 January 2014, Available online 14 January 2014.

Paper URL: https://doi.org/10.1016/j.dss.2014.01.001