DiSeg 1.0: The first system for Spanish discourse segmentation

Authors:

Highlights:

Abstract

Discourse parsing is currently a prominent research topic, yet no discourse parser exists for Spanish texts. The first stage in developing such a tool is discourse segmentation. In this work, we present DiSeg, the first discourse segmenter for Spanish, which uses the framework of Rhetorical Structure Theory and is based on lexical and syntactic rules. We describe the system and evaluate its performance against a gold standard corpus, divided into a medical subcorpus and a terminological subcorpus. We obtain promising results, which suggest that discourse segmentation is feasible using shallow parsing.
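The abstract does not say which measures are used to compare the segmenter's output with the gold standard. A common choice for this kind of evaluation is precision, recall and F1 over segment-boundary positions. The following is a minimal sketch under that assumption; the function name `boundary_scores` and the boundary offsets are hypothetical and are not taken from DiSeg or its corpus.

```python
def boundary_scores(gold, predicted):
    """Precision, recall and F1 over sets of segment-boundary positions."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # boundaries found in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical token offsets at which a new discourse segment starts.
gold_boundaries = [5, 12, 20, 31]
predicted_boundaries = [5, 12, 22, 31, 40]

p, r, f = boundary_scores(gold_boundaries, predicted_boundaries)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Scoring exact boundary matches is the strictest option: a boundary placed one token away from the gold position counts as both a false positive and a false negative.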

Keywords: Discourse parsing, Discourse segmentation, Shallow parsing, Rhetorical Structure Theory

Publication history: Available online 13 July 2011.

Official paper URL: https://doi.org/10.1016/j.eswa.2011.06.058