Information extraction from syllabi for academic e-Advising

Authors:

Abstract

Creating an academic e-Advisor that automates the transfer of course credits between institutions and recommends courses for further study requires an extensive database of course information. This paper presents an application that builds such a database by automatically extracting relevant information from HTML course outlines stored on an institution’s website and storing it as machine-readable XML. The application, called CODE (course outline data extractor), parses a course outline based on its HTML tags and content to build a document object model. It then applies a combination of web mining, natural language processing, and pattern recognition techniques to classify and extract the content useful for the semi-automatic e-Advisor and store it as XML. The current implementation is restricted to HTML course outlines, but the concepts can be extended to other formats of learning objects or to entirely different domains. As a proof of concept, the quality of extraction and classification is evaluated on a corpus of syllabi.
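
The parse, classify, and store steps described in the abstract can be illustrated with a minimal sketch. This is not the CODE implementation: it assumes that each outline section starts with an HTML heading, and it substitutes simple keyword matching for the web mining, natural language processing, and pattern recognition techniques the paper uses. The names `OutlineParser`, `classify`, `outline_to_xml`, the category labels, and the XML element names are all hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical categories and keywords; the abstract does not list the real ones.
CATEGORY_KEYWORDS = {
    "prerequisites": ("prerequisite", "pre-requisite"),
    "textbook": ("textbook", "required text"),
    "description": ("description", "overview"),
}

HEADING_TAGS = {"h1", "h2", "h3", "h4"}


class OutlineParser(HTMLParser):
    """Collect [heading, body text] pairs from an HTML course outline."""

    def __init__(self):
        super().__init__()
        self.sections = []        # one [heading, body] pair per heading found
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.sections.append(["", ""])

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # ignore text that appears before the first heading
        index = 0 if self._in_heading else 1
        self.sections[-1][index] += data


def classify(heading):
    """Assign a category by keyword match on the heading (stand-in classifier)."""
    lowered = heading.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "other"


def outline_to_xml(html):
    """Parse an outline, classify each section, and serialize the result as XML."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    root = ET.Element("course")
    for heading, body in parser.sections:
        node = ET.SubElement(root, "section", category=classify(heading))
        node.text = " ".join(body.split())  # collapse runs of whitespace
    return ET.tostring(root, encoding="unicode")
```

Calling `outline_to_xml` on a page with a "Course Description" heading and a "Prerequisites" heading would yield one `section` element per heading, each tagged with its matched category. A real extractor would need a far more robust classifier, since syllabi rarely follow a consistent heading scheme.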

Keywords: Information extraction, Classification, e-Advising, e-Learning

Article history: Available online 11 May 2008.

Official paper URL: https://doi.org/10.1016/j.eswa.2008.05.011