Compressive approaches for cross-language multi-document summarization

Authors:

Highlights:

• Cross-Language Text Summarization (CLTS) aims at analyzing a document in a source language and then generating a short summary in a target language.

• We combine sentence and multi-sentence compression methods for the CLTS problem in order to generate more informative cross-lingual summaries.

• A new Long Short-Term Memory (LSTM) model with an attention mechanism is proposed to compress sentences by removing non-relevant words.

• Our Multi-Sentence Compression (MSC) model compresses small clusters of similar sentences using a cohesion metric and a list of keywords.

• Automatic and manual evaluations were carried out on a French-to-English multi-document summarization task.
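The MSC highlight above describes fusing a cluster of similar sentences into one shorter sentence. A common basis for this is word-graph compression (Filippova-style): merge the sentences into a graph of shared words and take the cheapest start-to-end path. The sketch below is a minimal illustration of that idea only; it is not the authors' model, which additionally scores paths with a cohesion metric and keywords. All names here are illustrative.

```python
import heapq
from collections import defaultdict

def msc_word_graph(sentences):
    """Merge a cluster of sentences into a word graph:
    one node per lowercased token (plus <s>/</s> markers),
    edge weight = 1 / bigram frequency across the cluster."""
    freq = defaultdict(int)
    for sent in sentences:
        toks = ["<s>"] + sent.lower().split() + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            freq[(a, b)] += 1
    graph = defaultdict(list)
    for (a, b), f in freq.items():
        graph[a].append((b, 1.0 / f))   # frequent transitions are cheap
    return graph

def shortest_compression(graph):
    """Dijkstra from <s> to </s>: the cheapest path follows the
    transitions shared by most sentences, fusing the cluster."""
    dist, prev = {"<s>": 0.0}, {}
    pq, done = [(0.0, "<s>")], set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == "</s>":
            break
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    # walk back from </s> to <s>, then drop the end marker
    path, node = [], "</s>"
    while node != "<s>":
        path.append(node)
        node = prev[node]
    return " ".join(list(reversed(path))[:-1])
```

For two near-duplicate inputs such as "the president met reporters in paris on monday" and "the president met reporters in paris yesterday", the cheapest path keeps the shared spine and the shorter ending, yielding a single fused sentence. Real MSC systems add constraints (verbs, keywords, cohesion) to avoid ungrammatical shortcuts through merged nodes.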

Keywords: Cross-language text summarization, Sentence compression, Multi-sentence compression, Optimization

Article history: Received 19 January 2019; Revised 25 October 2019; Accepted 1 November 2019; Available online 5 November 2019; Version of Record 29 February 2020.

DOI: https://doi.org/10.1016/j.datak.2019.101763