Candidate sentence selection for extractive text summarization
Authors:
Highlights:
• A new benchmark dataset for automatic text summarization research, containing both human-generated abstracts and extracts, was proposed.
• The extractive summarization problem was revisited.
• The syntactic and semantic feature spaces used in summarization were comprehensively investigated.
• An ensemble feature space was introduced together with a new long short-term memory-based neural network model (LSTM-NN).
• Experimental results showed that the ensemble feature space performed markedly better than syntactic or semantic features used alone, and that the proposed LSTM-NN outperformed state-of-the-art models for extractive summarization.
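The idea of scoring candidate sentences on a combined (ensemble) feature space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the two toy syntactic features (position and length), the frequency-based stand-in for a semantic feature, and the fixed linear weights, which merely take the place of a trained model such as the LSTM-NN.

```python
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    stripped = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in stripped if w]


def syntactic_features(sentence: str, index: int, total: int) -> List[float]:
    """Hypothetical surface features: relative position and capped length."""
    rel_pos = 1.0 - index / max(total - 1, 1)  # earlier sentences score higher
    length = min(len(tokenize(sentence)) / 20.0, 1.0)
    return [rel_pos, length]


def semantic_features(sentence: str, doc_counts: Counter) -> List[float]:
    """Hypothetical content feature: mean relative frequency of the words.

    A real system might use sentence embeddings here instead.
    """
    words = tokenize(sentence)
    if not words:
        return [0.0]
    total = sum(doc_counts.values())
    return [sum(doc_counts[w] for w in words) / (len(words) * total)]


def ensemble_features(sentence: str, index: int, total: int,
                      doc_counts: Counter) -> List[float]:
    """Assumed ensemble: plain concatenation of both feature groups."""
    return (syntactic_features(sentence, index, total)
            + semantic_features(sentence, doc_counts))


def select_candidates(sentences: Sequence[str], k: int,
                      weights: Sequence[float] = (1.0, 1.0, 1.0)) -> List[int]:
    """Return indices of the k highest-scoring sentences, in document order.

    The weighted sum is a placeholder for a learned scoring model.
    """
    doc_counts = Counter(w for s in sentences for w in tokenize(s))
    scored = []
    for i, sentence in enumerate(sentences):
        feats = ensemble_features(sentence, i, len(sentences), doc_counts)
        scored.append((sum(w * f for w, f in zip(weights, feats)), i))
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    return sorted(i for _, i in top)
```

Calling `select_candidates(sentences, k)` on a list of sentence strings yields the indices of the selected sentences; dropping one feature group from `ensemble_features` gives a simple way to compare a single feature space against the combined one.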
Keywords: Extractive text summarization, Text summarization features, Summarization dataset, Long short-term memory
Review history: Received 4 April 2020, Revised 27 June 2020, Accepted 11 July 2020, Available online 12 August 2020, Version of Record 12 August 2020.
Official paper URL: https://doi.org/10.1016/j.ipm.2020.102359