Article
Effective Transfer Learning with Label-Based Discriminative
Feature Learning
Gyunyeop Kim and Sangwoo Kang *
School of Computing, Gachon University, Seongnam 13120, Korea; gyop0817@gachon.ac.kr
* Correspondence: swkang@gachon.ac.kr
Abstract:
The performance of natural language processing with a transfer learning methodology has improved by applying language models pre-trained on large amounts of general data to downstream tasks. However, because the data used in pre-training are unrelated to the downstream tasks, the model learns general features rather than features specific to those tasks. In this paper, a novel learning method is proposed that induces a pre-trained embedding model to learn the specific features of a downstream task. The proposed method learns the label features of the downstream task through contrastive learning using label embeddings and sampled data pairs. To demonstrate the performance of the proposed method, we conducted experiments on sentence-classification datasets and evaluated whether the features of the downstream tasks were learned, using PCA and clustering of the embeddings.
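As a concrete, hedged illustration of the idea summarized above, the following minimal PyTorch sketch shows one possible form of a label-embedding contrastive objective; the embedding dimension, temperature, and cross-entropy formulation are assumptions made for illustration and not the exact loss defined later in this paper.

import torch
import torch.nn.functional as F

class LabelContrastiveLoss(torch.nn.Module):
    # Illustrative label-embedding contrastive loss (assumed form, not the paper's exact loss).
    def __init__(self, num_labels: int, dim: int, temperature: float = 0.1):
        super().__init__()
        # One learnable embedding per downstream-task label (assumption).
        self.label_emb = torch.nn.Embedding(num_labels, dim)
        self.temperature = temperature

    def forward(self, sent_emb: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # sent_emb: (batch, dim) sentence embeddings from the pre-trained encoder.
        # labels:   (batch,) gold label indices of the sampled data pairs.
        sent = F.normalize(sent_emb, dim=-1)
        lab = F.normalize(self.label_emb.weight, dim=-1)   # (num_labels, dim)
        logits = sent @ lab.t() / self.temperature         # similarity to every label embedding
        # Pull each sentence toward its own label embedding and away from the others.
        return F.cross_entropy(logits, labels)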
Keywords: natural language processing; transfer learning; pre-training; word embedding
1. Introduction
Artificial intelligence has achieved good performance through deep learning on large amounts of data. According to [1], transfer learning conducted through pre-training on large amounts of data can improve the performance of downstream tasks. Transfer learning refers to pre-training on unsupervised data, which are easy to collect; the downstream task is then learned using the pre-trained model. This process benefits from the ease of collecting unsupervised datasets and can improve the performance of the downstream task. Therefore, many current artificial intelligence methods use transfer learning to achieve high performance.
In natural language processing (NLP), transfer learning has shown significant performance improvements when applied to language models. Transfer-learning-based language models such as BERT [2] and ELECTRA [3] are pre-trained on large amounts of crawled natural-language data, such as Wikipedia. Because data collected through crawling form an unsupervised dataset, learning proceeds through self-supervised objectives such as masked token prediction. The pre-trained language model is then used to generate word embeddings during fine-tuning. During the fine-tuning process, the downstream task is learned by constructing a model that includes the pre-trained model.
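As a concrete illustration of this setup, the following minimal sketch fine-tunes a pre-trained encoder on a toy sentence-classification task using the Hugging Face Transformers library; the model name, example sentences, and hyperparameters are illustrative assumptions rather than the configuration used in this paper.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Pre-trained language model reused as the embedding backbone of a downstream classifier.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentences = ["the movie was great", "the plot made no sense"]  # toy downstream examples
labels = torch.tensor([1, 0])
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # forward pass through the encoder and task head
outputs.loss.backward()                  # gradients also reach the pre-trained weights
optimizer.step()

Note that the gradient step updates both the task-specific head and the pre-trained encoder weights, which is the sense in which the pre-trained model is included in the downstream model.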
However, the pre-trained model used in transfer learning is trained on a dataset that is independent of the downstream task. Thus, during the pre-training process, the model learns general features rather than features specific to the downstream task. Word embeddings derived from the pre-trained model may therefore contain a higher proportion of common features than of the information required for the downstream task. As a result, such word embeddings can include features that are unnecessary for the downstream task. Furthermore, fine-tuning with word embeddings obtained from a pre-trained model can be compromised by the unnecessary features present in those embeddings.
In this study, further learning is applied to induce pre-trained models to derive
word embeddings optimized for downstream tasks. Using the proposed method, word