Citation: Yang, S.; Zhang, S.; Fang, M.; Yang, F.; Liu, S. A Hierarchical Representation Model Based on Longformer and Transformer for Extractive Summarization. Electronics 2022, 11, 1706. https://doi.org/10.3390/electronics11111706

Academic Editors: Phivos Mylonas, Katia Lida Kermanidis and Manolis Maragoudakis

Received: 30 April 2022; Accepted: 24 May 2022; Published: 27 May 2022

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
A Hierarchical Representation Model Based on Longformer and
Transformer for Extractive Summarization
Shihao Yang, Shaoru Zhang, Ming Fang, Fengqin Yang and Shuhua Liu *
School of Information Science and Technology, Northeast Normal University, Changchun 130117, China;
yangsh861@nenu.edu.cn (S.Y.); zhangsr030@nenu.edu.cn (S.Z.); fangm000@nenu.edu.cn (M.F.);
yangfq147@nenu.edu.cn (F.Y.)
* Correspondence: liush129@nenu.edu.cn
Abstract:
Automatic text summarization compresses a document while preserving the main idea of the original text; it includes extractive summarization and abstractive summarization. Extractive text summarization extracts important sentences from the original document to serve as the summary. The document representation method is crucial to the quality of the generated summary. To represent the document effectively, we propose a hierarchical document representation model, Long-Trans-Extr, for extractive summarization, which uses Longformer as the sentence encoder and Transformer as the document encoder. The advantage of Longformer as the sentence encoder is that it can accept long documents of up to 4096 tokens while adding relatively little computation. The proposed Long-Trans-Extr model is evaluated on three benchmark datasets: CNN (Cable News Network), DailyMail, and the combined CNN/DailyMail. It achieves 43.78 (ROUGE-1) and 39.71 (ROUGE-L) on CNN/DailyMail, and 33.75 (ROUGE-1), 13.11 (ROUGE-2), and 30.44 (ROUGE-L) on the CNN dataset. These results are highly competitive and, moreover, show that our model performs particularly well on long documents, such as those in the CNN corpus.
Keywords: extractive summarization; transformer; longformer; deep learning
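To make the hierarchy concrete, the following PyTorch sketch shows one plausible way to wire a Longformer sentence encoder to a Transformer document encoder. It is a minimal illustration under our own assumptions (the allenai/longformer-base-4096 checkpoint, sentence vectors pooled at sentence-start positions, a two-layer document encoder, and a linear scoring head); it is not the authors' released implementation.

import torch
import torch.nn as nn
from transformers import LongformerModel

class LongTransExtr(nn.Module):
    """Hierarchical extractive summarizer: Longformer encodes the tokens of
    the whole document, sentence vectors are pooled from sentence-start
    positions, a Transformer encoder contextualizes them at document level,
    and a linear head scores each sentence for inclusion in the summary."""

    def __init__(self, longformer_name="allenai/longformer-base-4096",
                 num_doc_layers=2, num_heads=8):
        super().__init__()
        self.sent_encoder = LongformerModel.from_pretrained(longformer_name)
        hidden = self.sent_encoder.config.hidden_size
        doc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=num_heads,
                                               batch_first=True)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_doc_layers)
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, sent_positions):
        # Give sentence-start tokens global attention (an assumption; the
        # paper's exact attention configuration may differ).
        global_mask = torch.zeros_like(input_ids)
        global_mask.scatter_(1, sent_positions, 1)
        token_states = self.sent_encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            global_attention_mask=global_mask,
        ).last_hidden_state                                  # (batch, seq, hidden)
        # Pool one vector per sentence from its start position.
        batch_idx = torch.arange(input_ids.size(0)).unsqueeze(1)
        sent_vecs = token_states[batch_idx, sent_positions]  # (batch, n_sents, hidden)
        # Contextualize sentences against each other, then score each one.
        doc_states = self.doc_encoder(sent_vecs)
        return self.scorer(doc_states).squeeze(-1)           # (batch, n_sents)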
1. Introduction
Since Luhn [1] started automatic summarization research in 1958, great achievements have been made in this field. Text summarization can be divided into two categories, namely abstractive and extractive summarization. Abstractive summarization [2] refines the ideas and concepts of the original text on the basis of its semantic meaning and reconstructs them in new sentences. Although closer to the way human beings summarize, abstractive summarization still faces a great challenge in producing coherent, grammatical, and general summaries of the original text, due to the limitations of natural language generation technology. The extractive summarization method instead extracts key sentences from a document to generate a summary: the input document is first encoded, then a score is computed for each sentence in the document, the sentences are ranked by score, and those with the highest scores are selected to form the summary.
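A minimal sketch of this generic score-rank-select procedure follows. The scoring function is left as a placeholder, since any sentence scorer (including the model proposed in this paper) can be plugged in; the toy length-based scorer in the usage line is purely illustrative.

from typing import Callable, List

def extract_summary(sentences: List[str],
                    score_fn: Callable[[List[str]], List[float]],
                    k: int = 3) -> str:
    """Score every sentence, rank by score, keep the top k, and restore
    the original document order so the summary reads naturally."""
    scores = score_fn(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])          # back to document order
    return " ".join(sentences[i] for i in chosen)

# Toy usage with a trivial length-based scorer standing in for a real model.
doc = ["Rates rose again.", "The bank cited persistent inflation pressure.",
       "Markets fell.", "Analysts expect one more hike this year."]
print(extract_summary(doc, lambda s: [float(len(x)) for x in s], k=2))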
This study focuses on extractive summarization, since it not only generates semantically and grammatically correct sentences from news articles but is also computationally cheaper than abstractive summarization. At present, both abstractive and extractive summarization methods have difficulty processing long texts, which is caused by the computational complexity of the encoder network. Recent studies have shown that Transformer [3] outperforms LSTM [4] in natural language processing, both in terms of experimental results and computational complexity. However, even Transformer, despite being capable of parallel computation, cannot handle long texts, because the cost of its self-attention grows quadratically with sequence length; as a result, text summarization methods have been limited to short texts. For a long text, there are usually two processing methods (both sketched below): (1) Discard the part that exceeds the length limit. This method is simple to implement, but it has a great impact on the quality of the final summary. (2) Divide the long text into
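A rough sketch of both strategies follows, under our own assumptions: the 512-token budget is arbitrary, and the description of method (2) is cut off in this excerpt, so we read it as splitting the text into segments that are encoded separately.

from typing import List

def truncate(token_ids: List[int], budget: int = 512) -> List[int]:
    """Method (1): keep the first `budget` tokens and discard the rest.
    Simple, but any salient sentence in the tail is lost."""
    return token_ids[:budget]

def chunk(token_ids: List[int], budget: int = 512) -> List[List[int]]:
    """Method (2), as we read it: split the text into budget-sized segments
    that can be encoded separately and later recombined."""
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]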