基于层次特征复用的视频超分辨率重建

周圆; 王明非; 杜晓婷; 陈艳芳

doi:10.16383/j.aas.c210095

基于层次特征复用的视频超分辨率重建

doi: 10.16383/j.aas.c210095

1.
天津大学电气自动化与信息工程学院天津 300072

基金项目: 国家自然科学基金联合基金项目(U2006211), 国家重点研发计划(2020YFC1523204)资助

详细信息

作者简介:
周圆：天津大学电气自动化与信息工程学院副教授. 主要研究方向为计算机视觉与图像/视频通信. 本文通信作者. E-mail: zhouyuan@tju.edu.cn

王明非：天津大学电气自动化与信息工程学院硕士研究生. 主要研究方向为计算机视觉与机器学习. E-mail: wmf997@126.com

杜晓婷：天津大学电气自动化与信息工程学院硕士研究生. 主要研究方向为计算机视觉与机器学习. E-mail: 18225511181@163.com

陈艳芳：天津大学电气自动化与信息工程学院博士研究生. 主要研究方向为计算机视觉与机器学习. E-mail: chyf@tju.edu.cn

计量
- 文章访问数: 1795
- HTML全文浏览量: 633
- PDF下载量: 170
- 被引次数: 0
出版历程
- 收稿日期: 2021-01-28
- 网络出版日期: 2021-06-30
- 刊出日期: 2024-09-19

Video Super-resolution via Hierarchical Feature Reuse

1.
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072

Funds: Supported by National Natural Science Foundation of China (U2006211) and National Key Research and Development Program of China (2020YFC1523204)

More Information

Author Bio:
ZHOU Yuan　Associate professor at the School of Electrical and Information Engineering, Tianjin University. Her research interest covers computer vision and image/video communication. Corresponding author of the paper

WANG Ming-Fei　Master student at the School of Electrical and Information Engineering, Tianjin University. His research interest covers computer vision and machine learning

DU Xiao-Ting　Master student at the School of Electrical and Information Engineering, Tianjin University. Her research interest covers computer vision and machine learning

CHEN Yan-Fang　Ph.D. candidate at the School of Electrical and Information Engineering, Tianjin University. Her research interest covers computer vision and machine learning

摘要

摘要: 当前的深度卷积神经网络方法, 在视频超分辨率任务上实现的性能提升相对于图像超分辨率任务略低, 部分原因是它们对层次结构特征中的某些关键帧间信息的利用不够充分. 为此, 提出一个称作层次特征复用网络(Hierarchical feature reuse network, HFRNet)的结构, 用以解决上述问题. 该网络保留运动补偿帧的低频内容, 并采用密集层次特征块(Dense hierarchical feature block, DHFB)自适应地融合其内部每个残差块的特征, 之后用长距离特征复用融合多个DHFB间的特征, 从而促进高频细节信息的恢复. 实验结果表明, 提出的方法在定量和定性指标上均优于当前的方法.
- 层次特征复用 /
- 卷积神经网络 /
- 特征融合 /
- 视频超分辨率重建
Abstract: The performance improvement of current deep convolution neural network methods in video super-resolution task is slightly lower than that in image super-resolution task, partly because they do not make full use of some key inter-frame information in hierarchical structure features. In this paper, we propose hierarchical feature reuse network (HFRNet) to solve the problem mentioned above. The network retains the low-frequency content of the motion compensation frame, and use dense hierarchical feature block (DHFB) to adaptively fuse the features of each residual block within it, then long-term feature reuse is proposed to fuse the features between multiple dense hierarchical feature block, so as to promote the recovery of high-frequency detail information. Experimental results show that the proposed method is superior to the current method in both quantitative and qualitative metrics.
- Hierarchical feature reuse /
- convolutional neural network (CNN) /
- feature fusion /
- video super-resolution

HTML全文

图 1 层次特征复用网络(HFRNet)的结构

Fig. 1 Architecture of hierarchical feature reuse network (HFRNet)

下载: 全尺寸图片幻灯片

图 2 DHFB的详细结构

Fig. 2 Detailed architecture of dense hierarchical feature block (DHFB)

下载: 全尺寸图片幻灯片

图 3 本文方法和其他方法在VIDEO4和Myanmar数据集下得到的平均PSNR和平均SSIM

Fig. 3 Average PSNRs and SSIMs obtained by our method and other methods on VIDEO4 and Myanmar datasets

下载: 全尺寸图片幻灯片

图 4 HFRNet 与其他模型在 VIDEO4 数据集图像上超分辨率的定性对比

Fig. 4 Qualitative super-resolution comparison of HFRNet with other models on an image from VIDEO4 dataset

下载: 全尺寸图片幻灯片

图 5 HFRNet 与其他模型在 Myanmar 数据集图像上超分辨率的定性对比

Fig. 5 Qualitative super-resolution comparison of HFRNet with other models on an image from Myanmar dataset

下载: 全尺寸图片幻灯片

图 6 HFRNet 重建细节与其他模型超分辨率的定性对比

Fig. 6 Qualitative super-resolution comparison of the reconstruction details by HFRNet and other models

下载: 全尺寸图片幻灯片

表 1 不同DHFB数目(D)和每个DHFB残差块数目(R)对2倍率超分辨率重建性能的影响(PSNR (dB))

Table 1 The impact (PSNR (dB)) of different numbers of DHFBs (D) and residual blocks (R) on the performance of 2× super-resolution reconstruction task

模块组合方式	CITY序列	WALK序列	FOLIAGE序列	CALENDAR序列	平均PSNR
R4D6	34.342	36.846	32.045	27.071	32.576
R6D4	34.339	37.101	32.117	27.067	32.656
R6D6	34.896	37.210	32.224	27.137	32.866
R6D8	34.901 (±0.035)	37.102 (±0.054)	32.187 (±0.069)	27.140 (±0.007)	32.833 (±0.041)
R8D6	34.633 (±0.039)	36.873 (±0.025)	32.144 (±0.050)	27.109 (±0.019)	32.690 (±0.034)

下载: 导出CSV

表 2 不同网络结构实验结果的平均PSNR及所需参数量

Table 2 The average PSNR and number of parameters for different network architectures

尺度	网络结构	参数量	CITY序列 (dB)	WALK序列 (dB)	FOLIAGE序列 (dB)	CALENDAR序列 (dB)	平均PSNR (dB)
×2	无层次特征复用	2.85 M	33.793	35.919	31.884	26.291	31.972
	HFRNet(a)	3.01 M	34.896	37.210	32.224	27.137	32.866
	HFRNet(b)	3.10 M	35.104	37.218	32.230	27.158	32.927

×3	无层次特征复用	2.85 M	27.220	30.113	27.019	23.344	26.924
	HFRNet(a)	3.01 M	28.235	31.513	27.539	24.190	27.869
	HFRNet(b)	3.10 M	28.240	31.613	27.587	24.217	27.914

下载: 导出CSV

表 3 不同光流估计方法对超分辨率重建性能的影响(PSNR (dB))

Table 3 The impact (PSNR (dB)) of different optical flow estimation methods on super-resolution reconstruction performance

尺度	光流估计算法	CITY序列	WALK序列	FOLIAGE序列	CALENDAR序列	平均PSNR
×2	CNN-based	35.226	37.106	32.244	27.817	33.098
×2	CLG-TV	35.104	37.218	32.230	27.158	32.927

×3	CNN-based	28.255	32.103	27.590	24.766	28.179
×3	CLG-TV	28.240	31.613	27.587	24.217	27.914

下载: 导出CSV

表 4 不同运动补偿算法对超分辨率重建性能的影响(平均PSNR (dB))

Table 4 Average PSNR (dB) in video super-resolution task, with different motion compensation algorithm

运动补偿算法与参数	尺度	MC (k = 0.050)	MC (k = 0.100)	MC (k = 0.125)	MC (k = 0.175)	AMC
平均PSNR (dB)	×2	32.493	32.510	32.714	32.615	32.927
平均PSNR (dB)	×3	27.505	27.684	27.822	27.694	27.914

下载: 导出CSV

参考文献(41)

[1]	Liu C, Sun D. On Bayesian adaptive video super resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(2): 346−360 doi: 10.1109/TPAMI.2013.127
[2]	Shahar O, Faktor A, Irani M. Space-time super-resolution from a single video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE, 2011. 3353−3360
[3]	Zhou Y, Wang Y, Zhang Y, Du X, Liu H, Li C. Manifold learning based super resolution for mixed-resolution multi-view video in visual internet of things. In: Proceedings of the International Conference on Artificial Intelligence for Communications and Networks. Harbin, China: Springer, 2019. 486−495
[4]	Caballero J, Ledig C, Aitken A, Acosta A, Totz J, Wang Z, et al. Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2848−2857
[5]	Tao X, Gao H, Liao R, Wang J, Jia J. Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 4482−4490
[6]	Kappeler A, Yoo S, Dai Q, Katsaggelos A K. Video super-resolution with convolutional neural networks. IEEE Transactions on Computational Imaging, 2016, 2(2): 109−122 doi: 10.1109/TCI.2016.2532323
[7]	Li D, Wang Z. Video super resolution via motion compensation and deep residual learning. IEEE Transactions on Computational Imaging, 2017, 3(4): 749−762 doi: 10.1109/TCI.2017.2671360
[8]	Zhou Y, Zhang Y, Xie X, Kung S-Y. Image super-resolution based on dense convolutional auto-encoder blocks. Neurocomputing, 2021, 423(1): 98−109
[9]	李金新, 黄志勇, 李文斌, 周登文. 基于多层次特征融合的图像超分辨率重建. 自动化学报, 2023, 49(1): 161−171 Li Jin-Xin, Huang Zhi-Yong, Li Wen-Bin, Zhou Deng-Wen. Image super-resolution based on multi hierarchical features fusion network. Acta Automatica Sinica, 2023, 49(1): 161−171
[10]	张毅锋, 刘袁, 蒋程, 程旭. 用于超分辨率重建的深度网络递进学习方法. 自动化学报, 2020, 46(2): 274−282 Zhang Yi-Feng, Liu Yuan, Jiang Cheng, Cheng Xu. A curriculum learning approach for single image super resolution. Acta Automatica Sinica, 2020, 46(2): 274−282
[11]	Zhou Y, Feng L, Hou C, Kung S-Y. Hyperspectral and multispectral image fusion based on local low rank and coupled spectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(10): 5997−6009 doi: 10.1109/TGRS.2017.2718728
[12]	周登文, 赵丽娟, 段然, 柴晓亮. 基于递归残差网络的图像超分辨率重建. 自动化学报, 2019, 45(6): 1157−1165 Zhou Deng-Wen, Zhao Li-Juan, Duan Ran, Chai Xiao-Liang. Image super-resolution based on recursive residual networks. Acta Automatica Sinica, 2019, 45(6): 1157−1165
[13]	孙旭, 李晓光, 李嘉锋, 卓力. 基于深度学习的图像超分辨率复原研究进展. 自动化学报, 2017, 43(5): 697−709 Sun Xu, Li Xiao-Guang, Li Jia-Feng, Zhuo Li. Review on deep learning based image super-resolution restoration algorithms. Acta Automatica Sinica, 2017, 43(5): 697−709
[14]	Xie X K, Zhou Y, Kung S-Y. Exploiting operation importance for differentiable neural architecture earch. arXiv preprint arXiv: 1911.10511, 2019.
[15]	Huo S, Zhou Y, Xiang W, Kung S-Y. Semi-supervised learning based on a novel iterative optimization model for saliency detection. IEEE Transactions on Neural Network and Learning System, 2019, 30(1): 225−241 doi: 10.1109/TNNLS.2018.2809702
[16]	Zhou Y, Mao A, Huo S, Lei J, Kung S-Y. Salient object detection via fuzzy theory and object-level enhancement. IEEE Transactions on Multimedia, 2019, 1(1): 74−85
[17]	Jo Y, Oh S W, Kang J, Kim S J. Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 3224−3232
[18]	潘志勇, 郁梅, 谢登梅, 宋洋, 蒋刚毅. 采用精简卷积神经网络的快速视频超分辨率重建. 光电子 · 激光, 2018, 29(12): 1332−1341 Pan Zhi-Yong, Yu Mei, Xie Deng-Mei, Song Yang, Jiang Gang-Yi. Fast video super-resolution reconstruction using a succinct convolutional neural network. Journal of Optoelectronics · Laser, 2018, 29(12): 1332−1341
[19]	Drulea M, Nedevschi S. Total variation regularization of local-global optical flow. In: Proceedings of the IEEE Conference on Intelligent Transportation Systems. Washington D C, USA: IEEE, 2011. 318−323
[20]	Lucas A, López-Tapia S, Molina R, Katsaggelos A K. Generative adversarial networks and perceptual losses for video super-resolution. IEEE Transactions on Image Processing, 2019, 28(7): 3312−3327 doi: 10.1109/TIP.2019.2895768
[21]	Zhou Y, Yang J X, Li H R, Cao T, Kung S-Y. Adversarial learning for multiscale crowd counting under complex scenes. IEEE Transactions on Cybernetics, 2021, 51(11): 5423−5432
[22]	Zhou Y, Huo S, Xiang W, Hou C, Kung S-Y. Semi-supervised salient object detection using a linear feedback control system model. IEEE Transactions on Cybernetics, 2019, 49(4): 1173−1185 doi: 10.1109/TCYB.2018.2793278
[23]	Huo S, Zhou Y, Lei J, Ling N, Hou C. Iterative feedback control-based salient object segmentation. IEEE Transactions on Multimedia, 2018, 20(6): 1350−1364 doi: 10.1109/TMM.2017.2769801
[24]	Zhou Y, Zhang T, Huo S, Hou C, Kung S-Y. Adaptive irregular graph construction based salient object detection. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(6): 1569−1582 doi: 10.1109/TCSVT.2019.2904463
[25]	Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S E, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 1−9
[26]	He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
[27]	Huang G, Liu Z, Van Der Maaten L, Weinberger K Q. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2261−2269
[28]	Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2472−2481
[29]	Zhou Y, Du X T, Wang M F, Huo S W, Zhang Y D, Kung S-Y. Cross-scale residual network: A general framework for image super-resolution, denoising, and deblocking. IEEE Transactions on Cybernetics, 2022, 52(7): 5855−5867
[30]	Yi P, Wang Z, Jiang K, Shao Z, Ma J. Multi-temporal ultra dense memory network for video super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(8): 2503−2516 doi: 10.1109/TCSVT.2019.2925844
[31]	Yi P, Wang Z, Jiang K, Jiang J, Ma J. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3106−3115
[32]	Yi P, Wang Z Y, Jiang K, Jiang J J, Lu T, Ma J. A progressive fusion generative adversarial network for realistic and consistent video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(5): 2264−2280
[33]	H Inc. Myanmar 60p [Online], available: http://www.harmoni-cinc.com/resources/videos/4k-video-clip-center, May 20, 2021
[34]	Wang L, Guo Y, Liu L, Lin Z, Deng X, An W. Deep video super-resolution using HR optical flow estimation. IEEE Transactions on Image Processing, 2020, 29(1): 4323−4336
[35]	Dong C, Loy C C, He K, Tang X. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2): 295−307 doi: 10.1109/TPAMI.2015.2439281
[36]	Li D, Liu Y, Wang Z. Video super-resolution using motion compensation and residual bidirectional recurrent convolutional network. In: Proceedings of the IEEE International Conference on Image Processing. Beijing, China: IEEE, 2017. 1642−1646
[37]	Kim S Y, Lim J, Na T, Kim M. Video super-resolution based on 3D-CNNs with consideration of scene change. In: Proceedings of the IEEE International Conference on Image Processing. Taipei, China: IEEE, 2019. 2831−2835
[38]	Wang Z, Yi P, Jiang K, Jiang J, Han Z, Lu T, et al. Multi-memory convolutional neural network for video super-resolution. IEEE Transactions on Image Processing, 2019, 28(5): 2530−2544 doi: 10.1109/TIP.2018.2887017
[39]	Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 1646−1654
[40]	Lai W S, Huang J B, Ahuja N, Yang M H. Deep Laplacian pyramid networks for fast and accurate super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5835−5843
[41]	Wang L, Guo Y, Lin Z, Deng X, An W. Learning for video super-resolution through HR optical flow estimation. In: Proceedings of the Asian Conference on Computer Vision. Perth, Australia: Springer, 2018. 514−529