Citation: Lin, R.; Xiao, N. Dual Projection Fusion for Reference-Based Image Super-Resolution. Sensors 2022, 22, 4119. https://doi.org/10.3390/s22114119
Academic Editors: M. Jamal Deen, Subhas Mukhopadhyay, Yangquan Chen, Simone Morais, Nunzio Cennamo and Junseop Lee
Received: 22 April 2022
Accepted: 25 May 2022
Published: 28 May 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Dual Projection Fusion for Reference-Based Image Super-Resolution
Ruirong Lin * and Nanfeng Xiao
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China;
xiaonf@scut.edu.cn
* Correspondence: cslrr546786@mail.scut.edu.cn
Abstract: Reference-based image super-resolution (RefSR) methods have achieved performance
superior to that of single image super-resolution (SISR) methods by transferring texture details from an
additional high-resolution (HR) reference image to the low-resolution (LR) image. However, existing
RefSR methods simply add or concatenate the transferred texture features with the LR features, which
cannot effectively fuse the information of these two independently extracted features. Therefore, this
paper proposes dual projection fusion for reference-based image super-resolution (DPFSR), which
enables the network to focus on the information that differs between the feature sources through
inter-residual projection operations, ensuring that detailed information is effectively filled into the LR features.
Moreover, this paper proposes a novel backbone called the deep channel attention connection
network (DCACN), which is capable of extracting valuable high-frequency components from the LR
space to further improve image reconstruction. Experimental results show that
the proposed method achieves the best peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) performance
compared with state-of-the-art (SOTA) SISR and RefSR methods. Visual results demonstrate that
the proposed method recovers more natural and realistic texture details.
Keywords: reference-based super-resolution; attention mechanism; texture transformer; dual projection fusion
1. Introduction
Image super-resolution (SR) aims to reconstruct an HR image with clear texture details
from a blurred LR image [1]. In recent years, deep learning-based SISR algorithms [2–6]
have made significant progress and are widely used for various real-world tasks, such
as medical image processing [7,8], surveillance imaging [9], and object recognition [10].
However, when the upsampling factor reaches 4× or greater, the reconstruction results of
most existing methods show blurred visual effects or artifacts. Although methods based on
generative adversarial networks (GANs) [11] and perceptual loss [12] have been proposed to
improve the quality of the reconstructed images, they cannot guarantee the realism of the
generated textures, resulting in degraded PSNR performance.
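Since PSNR is the fidelity measure referred to here, a minimal sketch of its standard definition, PSNR = 10 log10(MAX² / MSE), may be useful. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for illustration; the evaluation protocol used in the paper (for example, color space or border cropping) is not specified in this section and is not reproduced.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape.

    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if reference.shape != estimate.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over all pixels (and channels, if present).
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical images: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Because PSNR depends only on the pixel-wise squared error, textures that look sharp but deviate from the ground truth lower the score, which is consistent with the drop described above for GAN-based and perceptual-loss-based methods.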
To address this problem, RefSR methods [13–18], which transfer fine details from
an additional reference (Ref) image to the LR image, have been proposed. Compared with traditional
SISR, RefSR exhibits better reconstruction performance. RefSR transforms the more complex
texture generation process into a relatively simple texture search and transfer operation,
thus producing more realistic and natural-looking textures. For example, Zhang et al. [16]
fed the Ref and LR images into a pre-trained VGG model for feature extraction, and then
performed feature matching and texture transfer in the neural feature space. Yang et al. [18]
first introduced the transformer architecture to SR tasks and proposed a novel texture
transformer to model the correspondence between the LR and Ref images, which helps to
perform feature matching more accurately.
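The texture search and transfer idea described above can be illustrated with a generic nearest-neighbor sketch. It is not the implementation of [16] or [18]: the function `match_and_transfer`, the flattened `(positions, channels)` layout, and the use of cosine similarity are all assumptions chosen for illustration. For each LR descriptor, the most similar Ref descriptor is found and the texture stored at that Ref position is copied over.

```python
import numpy as np


def match_and_transfer(lr_feat, ref_feat, ref_texture):
    """Hypothetical texture search-and-transfer step.

    lr_feat:     (N, C) descriptors of N LR positions.
    ref_feat:    (M, C) descriptors of M Ref positions.
    ref_texture: (M, D) HR texture features stored at the Ref positions.
    Returns the transferred textures (N, D), the matched Ref index per LR
    position (N,), and the similarity of each match (N,).
    """
    eps = 1e-8  # guards against division by zero for all-zero descriptors
    q = lr_feat / (np.linalg.norm(lr_feat, axis=1, keepdims=True) + eps)
    k = ref_feat / (np.linalg.norm(ref_feat, axis=1, keepdims=True) + eps)

    # Cosine similarity between every LR position and every Ref position.
    similarity = q @ k.T  # shape (N, M)

    # Search: the best-matching Ref position for each LR position.
    index = similarity.argmax(axis=1)
    confidence = similarity.max(axis=1)

    # Transfer: copy the texture found at the matched Ref position.
    transferred = ref_texture[index]
    return transferred, index, confidence
```

In such a scheme the similarity score can serve as a confidence value that down-weights poorly matched textures before they are combined with the LR features; how that combination is done is the fusion question this paper addresses.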
However, previous methods overlook the fact that the LR space still
contains valuable high-frequency components. In addition, they simply add or concatenate the