
Review on Deep Learning Based Video Frame Interpolation

Wu Chen-Yang, Zhang Yong, Han Shu-Hao, Guo Chun-Le, Li Chong-Yi, Cheng Ming-Ming

Citation: Wu Chen-Yang, Zhang Yong, Han Shu-Hao, Guo Chun-Le, Li Chong-Yi, Cheng Ming-Ming. Review on deep learning based video frame interpolation. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240572

doi: 10.16383/j.aas.c240572 cstr: 32138.14.j.aas.c240572
Funds: Supported by National Natural Science Foundation of China (62306153, U23B2011, 62176130), Fundamental Research Funds for the Central Universities (070-63243143), Natural Science Foundation of Tianjin (24JCJQJC00020) and Shenzhen Science and Technology Program (JCYJ20240813114237048)
More Information
    Author Bio:

    WU Chen-Yang  Ph.D. candidate at the College of Computer Science, Nankai University. His research interest covers deep learning and video frame interpolation. E-mail: wucy0519@gmail.com

    ZHANG Yong  Engineer at Chongqing Chang'an Wangjiang Industry Co., Ltd., and Ph.D. candidate at the College of Computer Science, Nankai University. His research interest covers object detection and tracking, and multimodal perception data fusion. Corresponding author of this paper. E-mail: zhangyongtju@163.com

    HAN Shu-Hao  Master student at the College of Computer Science, Nankai University. His research interest covers deep learning and video frame interpolation. E-mail: hansh@mail.nankai.edu.cn

    GUO Chun-Le  Associate professor at the College of Computer Science, Nankai University, and Nankai International Advanced Research Institute (SHENZHEN-FUTIAN). His research interests cover computational imaging, image enhancement and restoration. E-mail: guochunle@nankai.edu.cn

    LI Chong-Yi  Professor at the College of Computer Science, Nankai University, and Nankai International Advanced Research Institute (SHENZHEN-FUTIAN). His main research interest is computational imaging. E-mail: lichongyi@nankai.edu.cn

    CHENG Ming-Ming  Professor at the College of Computer Science, Nankai University, and Nankai International Advanced Research Institute (SHENZHEN-FUTIAN). His research interests cover artificial intelligence, computer vision and computer graphics. E-mail: cmm@nankai.edu.cn

  • Abstract: Video frame interpolation is a hot research topic in the field of video processing. By synthesizing intermediate frames it raises the frame rate of a video and makes playback smoother, and it plays an important role in old-film restoration, film post-production, and slow-motion generation. With the rapid development of deep learning, deep-learning-based video frame interpolation has become the mainstream approach. This paper comprehensively surveys existing deep-learning-based video frame interpolation work and analyzes the strengths and weaknesses of these methods in depth. It then introduces the datasets commonly used in the field, which provide important support for video frame interpolation research and algorithm training. Finally, it reflects on the challenges that remain in current research and discusses future research directions from multiple perspectives, aiming to provide a reference for the subsequent development of the field.
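
    Most of the flow-based methods surveyed in this paper (Super SloMo[4], RIFE[24], IFRNet[25], among others) synthesize an intermediate frame by warping the two input frames toward the target time and blending the results. The following PyTorch sketch illustrates only that warp-and-blend step, assuming the bidirectional flows from the target time to each input frame have already been produced by some flow estimator and that a soft occlusion mask in [0, 1] is available; the flow channel order (horizontal displacement first) and the helper names are illustrative assumptions, not any specific paper's implementation.

    import torch
    import torch.nn.functional as F

    def backward_warp(frame, flow):
        # Sample `frame` (N, C, H, W) at positions displaced by `flow` (N, 2, H, W);
        # flow channel 0 is assumed to hold the horizontal (x) displacement in pixels.
        _, _, h, w = frame.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        base = torch.stack((xs, ys)).float().to(frame.device)   # (2, H, W) pixel grid
        coords = base.unsqueeze(0) + flow                       # (N, 2, H, W)
        # Normalize coordinates to [-1, 1], the range grid_sample expects.
        gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
        gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
        grid = torch.stack((gx, gy), dim=-1)                    # (N, H, W, 2)
        return F.grid_sample(frame, grid, align_corners=True)

    def synthesize_middle(frame0, frame1, flow_t0, flow_t1, mask):
        # Blend the two warped candidates with a soft occlusion mask in [0, 1].
        return mask * backward_warp(frame0, flow_t0) + (1 - mask) * backward_warp(frame1, flow_t1)
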
  • Fig.  1  The flowchart of deep-learning-based video frame interpolation development

    Fig.  2  Visualization results of video frame interpolation algorithms on datasets

    Table  1  Comparison of deep-learning-based video frame interpolation methods

    Year | Method | Category | Loss functions | Training set | Evaluation metrics | Framework
    2017 | AdaConv[1] | − | Color loss, gradient loss | Flickr | SSIM, IE | PyTorch
    2017 | SepConv[2] | − | Reconstruction loss | YouTube | PSNR, SSIM, MAE, RMSE | PyTorch
    2018 | PhaseNet[3] | Phase | Reconstruction loss, phase loss | DAVIS | SSIM | TensorFlow
    2018 | Super SloMo[4] | Flow / backward | Reconstruction loss, perceptual loss, warping loss, smoothness loss | Adobe240, YouTube240 | PSNR, SSIM, IE | PyTorch
    2018 | CS-VFI[5] | Flow / forward | Reconstruction loss, perceptual loss, color loss | YouTube | PSNR, SSIM, IE | PyTorch
    2019 | IM-Net[6] | − | Reconstruction loss, warping loss, similarity loss | YouTube | PSNR, SSIM, IE | Caffe
    2019 | MEMC-Net[7] | Flow / backward | Charbonnier loss | Vimeo-90K | PSNR, SSIM, IE | PyTorch
    2019 | TOFlow[8] | Flow / backward | Reconstruction loss | Vimeo-90K | PSNR, SSIM, SSD | PyTorch
    2019 | Q-VFI[9] | Flow / backward | Reconstruction loss, perceptual loss | Internet videos | PSNR, SSIM, IE | PyTorch
    2019 | DAIN[10] | Flow / backward | Charbonnier loss | Vimeo-90K | PSNR, SSIM, IE, NIE | PyTorch
    2020 | FeFlow[11] | − | MMG loss, reconstruction loss | Vimeo-90K | PSNR, SSIM, IE | PyTorch
    2020 | DSepConv[12] | − | Charbonnier loss, gradient loss | Vimeo-90K | PSNR, SSIM, IE | PyTorch
    2020 | AdaCoF[13] | − | Reconstruction loss, distortion loss, perceptual loss | Vimeo-90K | PSNR, SSIM, IE | PyTorch
    2020 | CAIN[14] | Generation | Reconstruction loss, perceptual loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2020 | FISR[15] | Generation | Temporal loss, reconstruction loss | YouTube | PSNR, SSIM | TensorFlow
    2020 | BMBC[16] | Flow / backward | Photometric loss, smoothness loss | Vimeo-90K | PSNR, SSIM, IE, NIE | PyTorch
    2020 | SoftSplat[17] | Flow / forward | Color loss, perceptual loss | Vimeo-90K | PSNR, SSIM, LPIPS | PyTorch
    2021 | EDSC[18] | − | Charbonnier loss, perceptual loss | Vimeo-90K | PSNR, SSIM, IE, LPIPS | PyTorch
    2021 | CDFI[19] | − | Charbonnier loss, perceptual loss, offset loss | Vimeo-90K | PSNR, SSIM, LPIPS | PyTorch
    2021 | XVFI[20] | Flow / backward | Reconstruction loss, smoothness loss | X-TRAIN | PSNR, SSIM, tOF, EPE | PyTorch
    2021 | ABME[21] | Flow / backward | Charbonnier loss, census loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2022 | VFIT[22] | Generation | Reconstruction loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2022 | M2M[23] | Flow / forward | Charbonnier loss, census loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2022 | RIFE[24] | Flow / backward | Reconstruction loss, distillation loss | Vimeo-90K | PSNR, SSIM, IE | PyTorch
    2022 | IFRNet[25] | Flow / backward | Charbonnier loss, census loss, distillation loss, geometry consistency loss | Vimeo-90K | PSNR, SSIM, IE, NIE | PyTorch
    2022 | FILM[26] | Flow / backward | Reconstruction loss, perceptual loss, Gram loss | Vimeo-90K | PSNR, SSIM | TensorFlow
    2022 | VFIFormer[27] | Flow / backward | Reconstruction loss, census loss, distillation loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2023 | FLAVR[28] | Generation | Reconstruction loss | GoPro, Vimeo-90K | PSNR, SSIM, TCC | PyTorch
    2023 | UPR-Net[29] | Flow / forward | Charbonnier loss, census loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2023 | AMT[30] | Flow / backward | Charbonnier loss, census loss, flow loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2023 | EMA-VFI[31] | Flow / backward | Reconstruction loss, distillation loss, color loss, perceptual loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2023 | BiFormer[32] | Flow / backward | Charbonnier loss, census loss | X-TRAIN | PSNR, SSIM | PyTorch
    2024 | MSEConv[33] | − | Reconstruction loss, perceptual loss, adversarial loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2024 | LDMVFI[34] | Generation | LDM loss | Vimeo-90K | LPIPS, FloLPIPS, FID | PyTorch
    2024 | SwinCS-VFIT[35] | Generation | Reconstruction loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2024 | VFIMamba[36] | Generation | Laplacian loss, warping loss | X-TRAIN, Vimeo-90K | PSNR, SSIM | PyTorch
    2024 | PerVFI[37] | Flow / forward | Negative log-likelihood loss, perceptual loss | Vimeo-90K | PSNR, SSIM, LPIPS, FloLPIPS, VFIPS | PyTorch
    2024 | IQ-VFI[38] | Flow / forward | Reconstruction loss, distillation loss | Vimeo-90K | PSNR, SSIM | PyTorch
    2024 | SGM[39] | Flow / backward | Reconstruction loss, warping loss | X-TRAIN, Vimeo-90K | PSNR, SSIM | PyTorch
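
    Several loss functions in Table 1 recur across many methods. As a concrete example, the Charbonnier loss adopted by MEMC-Net[7], DAIN[10], IFRNet[25], AMT[30], and others is a smooth, robust relaxation of the L1 loss; the sketch below states the standard definition, with the epsilon value an illustrative assumption (papers typically use a small constant in the range of roughly 1e-3 to 1e-6).

    import torch

    def charbonnier_loss(pred, target, eps=1e-3):
        # sqrt(err^2 + eps^2) behaves like |err| for large errors but remains
        # differentiable at zero, which stabilizes training compared with plain L1.
        return torch.mean(torch.sqrt((pred - target) ** 2 + eps ** 2))
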

    Table  2  Performance comparison of deep-learning-based video frame interpolation methods (evaluation metrics: PSNR↑ / SSIM↑ / LPIPS↓)

    Year Method Vimeo-90K UCF101 X-Test Xiph DAVIS SNU-FILM (Easy / Medium / Hard / Extreme)
    2017 AdaConv[1] 32.33/0.957/−
    2017 SepConv[2] 33.45/0.967/0.019 33.02/0.935/0.024 24.34/0.742/− 32.61/0.880/− 26.21/0.857/− 39.68/0.990/− 35.07/0.976/− 29.39/0.926/− 34.32/0.845/−
    2018 Super SloMo[4] 32.90/0.957/− 33.14/0.938/− 25.76/0.850/− 37.28/0.986/− 33.80/0.973/− 28.98/0.925/− 24.15/0.845/−
    2019 MEMC-Net[7] 34.02/0.970/0.027 34.95/0.968/0.030
    2019 IM-Net[6] 33.50/0.947/−
    2019 TOFlow[8] 33.53/0.967/0.027 34.58/0.967/0.027 39.08/0.989/− 34.39/0.974/− 28.44/0.918/− 23.39/0.831/−
    2019 Q-VFI[9] 35.15/0.971/− 32.54/0.948/− 27.73/0.894/−
    2019 DAIN[10] 34.71/0.976/0.022 34.99/0.968/0.028 26.78/0.807/− 26.12/0.870/− 39.73/0.990/− 35.46/0.978/− 30.17/0.934/− 25.09/0.858/−
    2020 FeFlow[11] 35.28/0.976/− 24.00/0.756/−
    2020 DSepConv[12] 34.73/0.974/0.028 35.08/0.969/0.030
    2020 AdaCoF[13] 35.40/0.971/0.031 35.06/0.974/0.033 24.13/0.734/− 32.72/0.881/− 27.07/0.874/− 39.80/0.990/0.019 35.05/0.975/0.036 29.46/0.924/0.075 24.30/0.844/0.148
    2020 CAIN[14] 34.65/0.973/− 34.91/0.969/− 24.50/0.752/− 24.50/0.752/− 26.46/0.856/− 39.78/0.990/− 35.49/0.977/− 29.86/0.929/− 24.69/0.850/−
    2020 FISR[15]
    2020 BMBC[16] 35.01/0.976/− 32.61/0.955/0.032 22.86/0.727/− 31.27/0.880/− 26.42/0.868/− 39.89/0.990/0.018 35.31/0.977/0.034 29.32/0.927/0.075 23.92/0.843/0.152
    2020 SoftSplat[17] 35.48/0.964/0.013 35.10/0.948/0.022 25.48/0.725/− 27.42/0.878/−
    2021 EDSC[18] 34.84/0.975/0.026 35.13/0.968/0.029 24.54/0.768/0.205
    2021 CDFI[19] 35.17/0.964/0.010 35.21/0.950/0.015 24.49/0.742/− 33.01/0.872/− 40.11/0.990/0.013 35.50/0.978/0.024 29.74/0.928/0.056 24.54/0.847/0.121
    2021 XVFI[20] 35.07/0.968/− 32.65/0.968/0.033 30.12/0.870/− 34.06/0.895/− 39.55/0.989/0.020 35.06/0.976/0.037 29.51/0.927/0.075 24.43/0.848/0.143
    2021 ABME[21] 36.18/0.981/− 32.05/0.967/0.058 30.16/0.879/− 33.81/0.903/− 39.69/0.990/0.022 35.28/0.977/0.042 29.64/0.929/0.092 24.54/0.853/0.182
    2022 RIFE[24] 35.61/0.978/0.020 35.28/0.969/− 24.67/0.797/− 25.89/0.803/0.134 40.06/0.991/− 35.75/0.979/− 30.10/0.933/− 24.84/0.853/−
    2022 VFIT[22] 36.96/0.978/− 33.44/0.971/− 28.09/0.888/−
    2022 IFRNet[25] 36.20/0.981/− 35.42/0.970/0.031 30.46/−/− 40.10/0.991/0.017 36.12/0.980/0.029 30.63/0.937/0.058 25.27/0.861/0.128
    2022 FILM[26] 35.87/0.968/− 35.16/0.949/−
    2022 M2M[23] 35.40/0.978/− 35.17/0.970/− 30.81/0.912/− 34.46/0.925/− 39.66/0.991/− 35.74/0.980/− 30.32/0.936/− 25.07/0.860/−
    2022 VFIFormer[27] 36.50/0.982/0.021 35.43/0.970/0.034 24.58/0.805/− 33.69/0.925/− 40.13/0.991/0.018 36.09/0.980/0.033 30.67/0.938/0.069 25.43/0.864/0.146
    2023 FLAVR[28] 36.25/0.975/− 33.31/0.971/− 27.43/0.874/−
    2023 AMT[30] 36.53/0.982/0.021 35.45/0.970/− 39.88/0.991/− 36.12/0.981/− 30.78/0.939/− 25.43/0.865/−
    2023 EMA-VFI[31] 36.64/0.982/0.026 35.48/0.970/− 31.46/−/− 37.61/0.846/0.203 39.98/0.991/− 36.09/0.980/− 30.94/0.939/− 25.69/0.866/−
    2023 BiFormer[32] 31.32/0.921/− 34.48/0.927/−
    2023 UPR-Net[29] 36.42/0.982/− 35.47/0.970/− 30.50/0.905/− 40.44/0.991/− 36.29/0.980/− 30.86/0.938/− 25.63/0.864/−
    2024 LDMVFI[34] 32.16/0.964/0.026 38.89/0.988/0.013 33.97/0.971/0.027 28.14/0.911/0.068 23.34/0.827/0.139
    2024 SGM[39] 29.91/0.897/− 29.25/0.818/− 40.15/0.991/− 36.05/0.980/− 28.88/0.922/− 23.62/0.838/−
    2024 PerVFI[37] 33.89/0.953/0.018 26.23/0.808/0.114
    2024 IQ-VFI[38] 36.60/0.982/− 35.48/0.970/− 40.24/0.991/− 36.24/0.980/− 30.83/0.938/− 25.45/0.863/−
    2024 SwinCS-VFIT[35] 37.13/0.978/− 33.36/0.971/− 28.28/0.891/−
    2024 VFIMamba[36] 36.64/0.982/− 35.45/0.970/− 32.15/0.925/− 34.62/0.906/− 40.51/0.991/− 36.40/0.981/− 30.99/0.940/− 25.79/0.868/−
    2024 MSEConv[33] 35.10/0.966/−
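
    For reference, the PSNR values reported in Table 2 follow the standard definition sketched below (assuming predictions and ground truth are float tensors scaled to [0, 1]); SSIM and LPIPS measure structural and learned perceptual similarity, respectively, and are normally computed with dedicated implementations (e.g. scikit-image, the lpips package) rather than a few lines of code.

    import torch

    def psnr(pred, target, peak=1.0):
        # PSNR = 10 * log10(peak^2 / MSE); higher is better, matching the
        # upward arrow in the caption of Table 2.
        mse = torch.mean((pred - target) ** 2)
        return 10.0 * torch.log10(peak ** 2 / mse)
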

    Table  3  Datasets used for deep-learning-based video frame interpolation

    Dataset | Release year | Number of videos | Resolution | Common evaluation metrics
    Xiph | 1994 | 8 | 4096 × 2160 | PSNR, SSIM, LPIPS
    Middlebury[61] | 2011 | 24 | 640 × 480 | IE
    UCF101[64] | 2012 | 13 320 | 256 × 256 | PSNR, SSIM, LPIPS
    DAVIS[59] | 2017 | 90 | 4096 × 2160 | PSNR, SSIM, LPIPS
    GoPro[62] | 2017 | 33 | 1280 × 720 | PSNR, SSIM, IE
    Adobe240[63] | 2017 | 71 | 1280 × 720 | PSNR, SSIM, IE
    Vimeo-90K[8] | 2019 | 4 278 | 448 × 256 | PSNR, SSIM, LPIPS
    HD[7] | 2019 | 7 | 1280 × 720 | PSNR
    SNU-FILM[14] | 2020 | 31 | 1280 × 720 | PSNR, SSIM, LPIPS
    X4K1000FPS (Test)[20] | 2021 | 15 | 4096 × 2160 | PSNR, SSIM, LPIPS
    SportsSloMo[65] | 2024 | 8 498 | 1280 × 720 | PSNR, SSIM, IE
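
    Most methods in Table 1 are trained on frame triplets such as those of Vimeo-90K[8]: the two outer frames serve as input and the middle frame as supervision. The sketch below illustrates that sampling convention; the im1/im2/im3 file names follow the public Vimeo-90K triplet layout, while the class name and the directory-list argument are placeholders for illustration.

    import os
    from PIL import Image
    from torch.utils.data import Dataset
    from torchvision.transforms.functional import to_tensor

    class TripletDataset(Dataset):
        # Each sequence directory holds three consecutive frames; a model
        # receives (im1, im3) and is trained to reconstruct im2.
        def __init__(self, root, sequence_dirs):
            self.paths = [os.path.join(root, d) for d in sequence_dirs]

        def __len__(self):
            return len(self.paths)

        def __getitem__(self, idx):
            def load(name):
                img = Image.open(os.path.join(self.paths[idx], name)).convert("RGB")
                return to_tensor(img)
            frame0, gt, frame1 = load("im1.png"), load("im2.png"), load("im3.png")
            return (frame0, frame1), gt
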
  • [1] Niklaus S, Mai L, Liu F. Video frame interpolation via adaptive convolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 2270−2279
    [2] Niklaus S, Mai L, Liu F. Video frame interpolation via adaptive separable convolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 261−270
    [3] Meyer S, Djelouah A, McWilliams B, Sorkine-Hornung A, Gross M, Schroers C. PhaseNet for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 498−507
    [4] Jiang H Z, Sun D Q, Jampani V, Yang M H, Learned-Miller E, Kautz J. Super SloMo: High quality estimation of multiple intermediate frames for video interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 9000−9008
    [5] Niklaus S, Liu F. Context-aware synthesis for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1701−1710
    [6] Peleg T, Szekely P, Sabo D, Sendik O. IM-Net for high resolution video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 2393−2402
    [7] Bao W B, Lai W S, Zhang X Y, Gao Z Y, Yang M H. MEMC-Net: Motion estimation and motion compensation driven neural network for video interpolation and enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(3): 933−948 doi: 10.1109/TPAMI.2019.2941941
    [8] Xue T F, Chen B A, Wu J J, Wei D L, Freeman W T. Video enhancement with task-oriented flow. International Journal of Computer Vision, 2019, 127(8): 1106−1125 doi: 10.1007/s11263-018-01144-2
    [9] Xu X Y, Li S Y, Sun W X, Yin Q, Yang M H. Quadratic video interpolation. In: Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS). Vancouver, Canada: MIT Press, 2019. 1645−1654
    [10] Bao W B, Lai W S, Ma C, Zhang X Y, Gao Z Y, Yang M H. Depth-aware video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 3698−3707
    [11] Gui S R, Wang C Y, Chen Q H, Tao D C. FeatureFlow: Robust video interpolation via structure-to-texture generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 14001−14010
    [12] Cheng X H, Chen Z Z. Video frame interpolation via deformable separable convolution. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 10607−10614
    [13] Lee H, Kim T, Chung T Y, Pak D, Ban Y, Lee S. AdaCoF: Adaptive collaboration of flows for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 5315−5324
    [14] Choi M, Kim H, Han B, Xu N, Lee K M. Channel attention is all you need for video frame interpolation. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 10663−10671
    [15] Kim S Y, Oh J, Kim M. FISR: Deep joint frame interpolation and super-resolution with a multi-scale temporal loss. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 11278−11286
    [16] Park J, Ko K, Lee C, Kim C S. BMBC: Bilateral motion estimation with bilateral cost volume for video interpolation. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 109−125
    [17] Niklaus S, Liu F. Softmax splatting for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 5436−5445
    [18] Cheng X H, Chen Z Z. Multiple video frame interpolation via enhanced deformable separable convolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(10): 7029−7045
    [19] Ding T Y, Liang L M, Zhu Z H, Zharkov I. CDFI: Compression-driven network design for frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 7997−8007
    [20] Sim H, Oh J, Kim M. XVFI: Extreme video frame interpolation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 14469−14478
    [21] Park J, Lee C, Kim C S. Asymmetric bilateral motion estimation for video frame interpolation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 14519−14528
    [22] Shi Z H, Xu X Y, Liu X H, Chen J, Yang M H. Video frame interpolation transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17461−17470
    [23] Hu P, Niklaus S, Sclaroff S, Saenko K. Many-to-many splatting for efficient video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 3543−3552
    [24] Huang Z W, Zhang T Y, Heng W, Shi B X, Zhou S C. Real-time intermediate flow estimation for video frame interpolation. In: Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer, 2022. 624−642
    [25] Kong L T, Jiang B Y, Luo D H, Chu W Q, Huang X M, Tai Y, et al. IFRNet: Intermediate feature refine network for efficient frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 1959−1968
    [26] Reda F, Kontkanen J, Tabellion E, Sun D Q, Pantofaru C, Curless B. FILM: Frame interpolation for large motion. In: Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer, 2022. 250−266
    [27] Lu L Y, Wu R Z, Lin H J, Lu J B, Jia J Y. Video frame interpolation with transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 3522−3532
    [28] Kalluri T, Pathak D, Chandraker M, Tran D. FLAVR: Flow-agnostic video representations for fast frame interpolation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2023. 2070−2081
    [29] Jin X, Wu L H, Chen J, Chen Y X, Koo J, Hahm C H. A unified pyramid recurrent network for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 1578−1587
    [30] Li Z, Zhu Z L, Han L H, Hou Q B, Guo C L, Cheng M M. AMT: All-pairs multi-field transforms for efficient frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 9801−9810
    [31] Zhang G Z, Zhu Y H, Wang H N, Chen Y X, Wu G S, Wang L M. Extracting motion and appearance via inter-frame attention for efficient video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 5682−5692
    [32] Park J, Kim J, Kim C S. BiFormer: Learning bilateral motion estimation via bilateral transformer for 4K video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 1568−1577
    [33] Ding X L, Huang P, Zhang D Y, Liang W, Li F, Yang G B, et al. MSEConv: A unified warping framework for video frame interpolation. ACM Transactions on Asian and Low-Resource Language Information Processing, to be published, DOI: 10.1145/3648364
    [34] Danier D, Zhang F, Bull D. LDMVFI: Video frame interpolation with latent diffusion models. In: Proceedings of the 38th AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI, 2024. 1472−1480
    [35] Shi Chang-Tong, Shan Hong-Tao, Zheng Guang-Yuan, Zhang Yu-Jin, Liu Huai-Yuan, Zong Zhi-Hao. Video frame interpolation method based on improved Visual Transformer. Application Research of Computers, 2024, 41(4): 1252−1257 (in Chinese)
    [36] Zhang G Z, Liu C X, Cui Y T, Zhao X T, Ma K, Wang L M. VFIMamba: Video frame interpolation with state space models. In: Proceedings of the 38th Conference on Neural Information Processing Systems. Vancouver, Canada: NeurIPS, 2024.
    [37] Wu G Y, Tao X, Li C L, Wang W Y, Liu X H, Zheng Q Q. Perception-oriented video frame interpolation via asymmetric blending. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 2753−2762
    [38] Hu M S, Jiang K, Zhong Z H, Wang Z, Zheng Y Q. IQ-VFI: Implicit quadratic motion estimation for video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 6410−6419
    [39] Liu C X, Zhang G Z, Zhao R, Wang L M. Sparse global matching for video frame interpolation with large motion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 19125−19134
    [40] Parihar A S, Varshney D, Pandya K, Aggarwal A. A comprehensive survey on video frame interpolation techniques. The Visual Computer, 2022, 38(1): 295−319 doi: 10.1007/s00371-020-02016-y
    [41] Dong J, Ota K, Dong M X. Video frame interpolation: A comprehensive survey. ACM Transactions on Multimedia Computing, Communications and Applications, 2023, 19(2s): Article No. 78
    [42] Meyer S, Wang O, Zimmer H, Grosse M, Sorkine-Hornung A. Phase-based frame interpolation for video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE, 2015. 1410−1418
    [43] Prashnani E, Noorkami M, Vaquero D, Sen P. A phase-based approach for animating images using video examples. Computer Graphics Forum, 2017, 36(6): 303−311 doi: 10.1111/cgf.12940
    [44] Wadhwa N, Rubinstein M, Durand F, Freeman W T. Phase-based video motion processing. ACM Transactions on Graphics (TOG), 2013, 32(4): Article No. 80
    [45] Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE, 2015. 4489−4497
    [46] Lin Chuan-Jian, Deng Wei, Tong Tong, Gao Qin-Quan. Blurred video frame interpolation method based on deep voxel flow. Journal of Computer Applications, 2020, 40(3): 819−824 doi: 10.11772/j.issn.1001-9081.2019081474 (in Chinese)
    [47] Cho H, Kim T, Jeong Y, Yoon K J. TTA-EVF: Test-time adaptation for event-based video frame interpolation via reliable pixel and sample estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 25701−25711
    [48] Dosovitskiy A, Fischer P, Ilg E, Häusser P, Hazirbas C, Golkov V, et al. FlowNet: Learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE, 2015. 2758−2766
    [49] Ilg E, Mayer N, Saikia T, Keuper M, Dosovitskiy A, Brox T. Flownet 2.0: Evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 2462−2470
    [50] Ranjan A, Black M J. Optical flow estimation using a spatial pyramid network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 2720−2729
    [51] Sun D Q, Yang X D, Liu M Y, Kautz J. PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 8934−8943
    [52] Hui T W, Tang X O, Loy C C. LiteFlowNet: A lightweight convolutional neural network for optical flow estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 8981−8989
    [53] Yang G S, Ramanan D. Volumetric correspondence networks for optical flow. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2019. Article No. 72
    [54] Teed Z, Deng J. RAFT: Recurrent all-pairs field transforms for optical flow. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 402−419
    [55] Zhao S Y, Zhao L, Zhang Z X, Zhou E Y, Metaxas D. Global matching with overlapping attention for optical flow estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17592−17601
    [56] Zhang Qian, Jiang Feng. Video interpolation based on deep learning. Intelligent Computer and Applications, 2019, 9(4): 252−257 doi: 10.3969/j.issn.2095-2163.2019.04.058 (in Chinese)
    [57] Ma Jing-Yuan, Wang Chuan-Ming. Real-time video frame interpolation based on multi-scale optical flow prediction and fusion. Journal of Chinese Computer Systems, 2021, 42(12): 2567−2571 (in Chinese)
    [58] Yang Hua, Wang Jiao, Zhang Wei-Jun, Wu Jie-Hong, Gao Li-Jun. Lightweight video frame interpolation algorithm based on optical flow estimation. Journal of Shenyang Aerospace University, 2022, 39(6): 57−64 doi: 10.3969/j.issn.2095-1248.2022.06.008 (in Chinese)
    [59] Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A. A benchmark dataset and evaluation methodology for video object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 724−732
    [60] Ding C, Lin M Y, Zhang H J, Liu J Z, Yu L. Video frame interpolation with stereo event and intensity cameras. IEEE Transactions on Multimedia, 2024, 26: 9187−9202 doi: 10.1109/TMM.2024.3387690
    [61] Baker S, Scharstein D, Lewis J P, Roth S, Black M J, Szeliski R. A database and evaluation methodology for optical flow. International Journal of Computer Vision, 2011, 92(1): 1−31 doi: 10.1007/s11263-010-0390-2
    [62] Nah S, Kim T H, Lee K M. Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 257−265
    [63] Su S C, Delbracio M, Wang J, Sapiro G, Heidrich W, Wang O. Deep video deblurring for hand-held cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 237−246
    [64] Soomro K, Zamir A R, Shah M. UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv: 1212.0402, 2012.
    [65] Chen J B, Jiang H Z. SportsSloMo: A new benchmark and baselines for human-centric video frame interpolation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2024. 6475−6486
    [66] Kiefhaber S, Niklaus S, Liu F, Schaub-Meyer S. Benchmarking video frame interpolation. arXiv preprint arXiv: 2403.17128, 2024.
Publication history
  • Received:  2024-08-14
  • Published online:  2025-04-20
