Business information extraction from semi-structured webpages
Authors:
Abstract
To protect online consumers, the OECD Guidelines recommend that Internet shopping malls provide information about their business on their webpages. In Korea, the Consumer Protection Law in Electronic Commerce requires Internet shopping malls to provide their business information so that consumers can easily identify them. Since most Korean Internet shopping malls present this business information in a semi-structured format on their homepages, a software agent can readily identify it. To automatically investigate whether Internet shopping malls provide their business information, this article proposes methods for gathering the URLs of Internet shopping malls, monitoring alterations to their webpages, and extracting business information. Business information extraction in our research is based on synonyms and indicator words of the attributes. We used inductive learning to improve the efficiency of information extraction. Experiments demonstrate the potential of our agent system, whose average extraction accuracy was 89.3%.
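As a rough illustration of synonym-based attribute extraction from semi-structured text, the following minimal Python sketch looks for a known label followed by a colon and takes the text up to the next separator as the value. It is not the authors' implementation. The `SYNONYMS` table, the English labels, the `|` separator, and the function name `extract_business_info` are all assumptions made for this example, and the inductive learning step mentioned in the abstract is not modeled.

```python
import re

# Hypothetical synonym table: each attribute maps to label variants
# that a shopping mall page might use (assumed, not from the paper).
SYNONYMS = {
    "company_name": ["company name", "business name", "company"],
    "representative": ["representative", "ceo", "owner"],
    "registration_number": [
        "business registration number",
        "registration no",
        "business license no",
    ],
    "phone": ["telephone", "phone", "tel"],
}


def extract_business_info(text):
    """Return {attribute: value} for every attribute whose label is found."""
    result = {}
    for attribute, labels in SYNONYMS.items():
        # Try longer labels first so "company name" wins over "company".
        for label in sorted(labels, key=len, reverse=True):
            pattern = re.compile(
                r"\b" + re.escape(label) + r"\b\.?\s*[:：]\s*([^|\n]+)",
                re.IGNORECASE,
            )
            match = pattern.search(text)
            if match:
                # The value runs up to the next "|" or line break.
                result[attribute] = match.group(1).strip()
                break
    return result


footer = (
    "Company: Example Mall | Representative: Hong Gildong | "
    "Business Registration Number: 123-45-67890 | Tel: 02-000-0000"
)
info = extract_business_info(footer)
```

Trying longer labels first is a simple way to keep a short synonym from matching inside a longer one. A real page would need a larger synonym table and handling for layouts where labels and values sit in separate table cells.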
Keywords: Electronic commerce, Internet shopping mall, Business information, Information extraction, Agent
Article history: Received 9 October 2003, Accepted 1 December 2003, Available online 24 December 2003.
Official paper URL: https://doi.org/10.1016/j.eswa.2003.12.008