Multi-documents Automatic Abstracting based on text clustering and semantic analysis

Authors:


Abstract

This paper proposes a method for multi-document automatic abstracting based on text clustering and semantic analysis, aimed at overcoming the shortcomings of some current multi-document methods. The method uses semantic analysis to produce abstracts of multiple documents. A two-pass ("twice") word segmentation algorithm based on the title and the first sentences of paragraphs is also proposed; its precision and recall are both above 95%. For the specific domain of plastics, an automatic abstracting system named TCAAS is implemented, and the precision and recall of its multi-document automatic abstracting are both above 75%. The experiments show that the method is feasible for developing a domain-specific automatic abstracting system and that it merits further, more in-depth study.
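The abstract reports results as precision and recall without stating how they were computed. As a minimal sketch, assuming the standard set-based definitions (precision is the share of system output that is correct, recall is the share of the reference that was recovered), the calculation might look like the following. The function name `precision_recall` and the sentence IDs are hypothetical and are not taken from the paper or from TCAAS.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted set against a reference set.

    Assumes the standard set-based definitions; the paper's exact
    evaluation protocol is not described in the abstract.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    # Guard against empty sets to avoid division by zero.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical sentence IDs chosen for illustration, not data from the paper.
system_summary = {1, 4, 7, 9}
reference_summary = {1, 4, 8, 9}

p, r = precision_recall(system_summary, reference_summary)
print(f"precision={p:.2f} recall={r:.2f}")
```

The same function applies to word segmentation if the sets hold segmented tokens (for example, as position and word pairs) instead of sentence IDs.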

Keywords: Semantic analysis, Automatic abstracting, Multi-documents, Text clustering, Natural language understanding

Review history: Received 15 March 2008, Accepted 4 June 2009, Available online 12 June 2009.

Official paper URL: https://doi.org/10.1016/j.knosys.2009.06.010