Learning (k,l)-contextual tree languages for information extraction from web pages
Authors: Stefan Raeymaekers, Maurice Bruynooghe, Jan Van den Bussche
Abstract
This paper introduces a novel method for learning a wrapper that extracts information from web pages, based on (k,l)-contextual tree languages. It also introduces a method for learning good values of k and l from a few positive and negative examples. Finally, it describes how the algorithm can be integrated into a tool for information extraction.
Keywords: Information extraction, Wrapper induction, Tree languages
Review process:
Paper URL: https://doi.org/10.1007/s10994-008-5049-7