On learning web information extraction rules with TANGO
Authors:
Highlights:
• TANGO can be adapted to particular websites or to keep up with the evolution of HTML.
• It relies on an open catalogue of features and a highly configurable learning process.
• We provide a method to help re-configure our proposal to improve its effectiveness.
• It outperforms other state-of-the-art proposals in terms of effectiveness.
Keywords: Web information extraction, Semi-structured documents, Open catalogues of features, Learning rules, Variation points, Configuration method
Review history: Received 1 August 2015, Revised 29 March 2016, Accepted 23 May 2016, Available online 21 June 2016, Version of Record 16 July 2016.
Official paper URL: https://doi.org/10.1016/j.is.2016.05.003