Transformer Pedestrian Video Inpainting Guided by Poses Repaired with a Pseudo-Spatiotemporal Graph Convolutional Network

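Chinese abstract (translated): To restore occluded pedestrians in surveillance video, a pedestrian video inpainting method based on human pose is proposed: the incomplete pedestrian pose sequence is repaired first, and the missing body regions in the video frames are then inpainted under the guidance of the repaired pose sequence. The method uses OpenPose to extract the occluded human pose sequence from the video. Because occlusion leaves some joints undetected or inaccurately detected, a pseudo-spatiotemporal graph convolutional network is proposed to repair the missing poses and obtain a relatively accurate pose sequence. Based on the repaired poses, a Transformer pedestrian video inpainting model guided by the pose sequence is proposed. In tests on the Human3.6M dataset, the proposed method improves on the compared methods in all four metrics (PSNR, RMSE, SSIM, and LPIPS); in particular, RMSE improves by 9.50% and LPIPS by 21.67%.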

English title: Transformer-Based Pedestrian Video Inpainting Guided by Pseudo-Spatiotemporal Pose Correction Graph Convolutional Networks

English abstract: To address the problem of restoring occluded pedestrians in surveillance videos, a pedestrian video inpainting method based on human pose is proposed. It first repairs the incomplete pedestrian pose sequence and then inpaints the video frames under the guidance of the repaired pose sequence. First, the proposed method uses OpenPose to extract the occluded human pose sequence from the video. Because of occlusion, some joints of the extracted poses may be unrecognized or inaccurately recognized. We therefore propose a pseudo-spatiotemporal graph convolutional network to repair the extracted poses and obtain an accurate pose sequence. We then propose a Transformer-based pedestrian video inpainting model guided by the repaired pose sequence. Tested on the Human3.6M dataset, the proposed method outperforms previous approaches on four metrics: PSNR, RMSE, SSIM, and LPIPS. In particular, RMSE is improved by 9.50% and LPIPS by 21.67%.
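Two of the four reported metrics, RMSE and PSNR, have simple closed-form definitions, so a minimal sketch may help readers interpret the numbers. This is an illustration only, not the authors' evaluation code: the function names, the flattened pixel-list representation, and the 8-bit peak value of 255 are assumptions made for the example. SSIM and LPIPS are omitted because they need windowed statistics and a learned network, respectively.

```python
import math

def rmse(reference, restored):
    """Root-mean-square error between two equally long pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("frames must contain the same number of pixels")
    if not reference:
        raise ValueError("frames must not be empty")
    squared = [(a - b) ** 2 for a, b in zip(reference, restored)]
    return math.sqrt(sum(squared) / len(squared))

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit intensities by default)."""
    err = rmse(reference, restored)
    if err == 0.0:
        return math.inf  # identical frames: no noise to measure
    return 20.0 * math.log10(max_value / err)

# Toy 2x2 grayscale frame, flattened row by row (hypothetical values).
original = [52, 55, 61, 59]
restored = [50, 55, 63, 59]
print(rmse(original, restored))
print(psnr(original, restored))
```

Lower RMSE and higher PSNR both indicate a restored frame that is closer to the reference, so the two metrics move in opposite directions when a method improves.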

English keywords:

deep learning; graph convolutional network; Transformer; human pose completion; video inpainting

Authors:

Tang Fumei (唐福梅), Nie Yongwei (聂勇伟), Yu Jiaqi (余嘉祺), Zhang Qing (张青), Li Guiqing (李桂清)

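Affiliations:

School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006

School of Computer Science, Sun Yat-sen University, Guangzhou 510006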

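Keywords (translated from the Chinese):

deep learning; graph convolutional neural network; Transformer; human pose completion; video inpainting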

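Funding:

National Natural Science Foundation of China, General Program; National Natural Science Foundation of China, General Program; Guangdong Natural Science Foundation, General Program; Guangdong Natural Science Foundation, General Program

Grant numbers:

62072191; 61972160; 2019A1515010860; 2021A1515012301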

Publication year:

2024

DOI:

10.3724/SP.J.1089.2024.19773

Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报)

China Computer Federation (中国计算机学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.892

ISSN: 1003-9775

Year, volume(issue): 2024, 36(4)