HandyPose: Multi-level framework for hand pose estimation

Abstract: Hand pose estimation is a challenging task due to the large number of degrees of freedom and the frequent occlusions of joints. To address these challenges, we propose HandyPose, a single-pass, end-to-end trainable architecture for 2D hand pose estimation using a single RGB image as input. Adopting an encoder-decoder framework with multi-level features, along with a novel multi-level waterfall atrous spatial pooling module for multi-scale representations, our method achieves high accuracy in hand pose while maintaining manageable size complexity and modularity of the network. HandyPose takes a multi-scale approach to representing context by incorporating spatial information at various levels of the network to mitigate the loss of resolution due to pooling. Our advanced multi-level waterfall module leverages the efficiency of progressive cascade filtering while maintaining larger fields-of-view through the concatenation of multi-level features from different levels of the network in the waterfall module. The decoder incorporates both the waterfall and multi-scale features for the generation of accurate joint heatmaps in a single stage. Our results demonstrate state-of-the-art performance on popular datasets and show that HandyPose is a robust and efficient architecture for 2D hand pose estimation. (c) 2022 Elsevier Ltd. All rights reserved.
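The abstract's claim that progressive cascade filtering maintains larger fields-of-view can be illustrated with simple receptive-field arithmetic. The sketch below is not taken from the paper: the 3x3 kernel size and the dilation rates 6, 12, 18, 24 are assumptions chosen for illustration, and the helper names are hypothetical. It compares the receptive field of stride-1 atrous convolutions applied in a chain (waterfall style) with the same convolutions applied independently in parallel branches.

```python
from typing import List


def cascade_receptive_fields(kernel_size: int, dilations: List[int]) -> List[int]:
    """Receptive field after each stage when stride-1 atrous convs are chained.

    Each stage adds (kernel_size - 1) * dilation to the running field,
    because it filters the output of the previous stage.
    """
    fields = []
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
        fields.append(rf)
    return fields


def parallel_receptive_fields(kernel_size: int, dilations: List[int]) -> List[int]:
    """Receptive field of each branch when every atrous conv sees the input directly."""
    return [1 + (kernel_size - 1) * d for d in dilations]


# Assumed values for illustration only; the abstract does not state them.
KERNEL_SIZE = 3
RATES = [6, 12, 18, 24]

cascade = cascade_receptive_fields(KERNEL_SIZE, RATES)
parallel = parallel_receptive_fields(KERNEL_SIZE, RATES)

print("cascade :", cascade)   # cascade : [13, 37, 73, 121]
print("parallel:", parallel)  # parallel: [13, 25, 37, 49]
```

Under these assumed rates, the last stage of the chain covers 121 pixels per side versus 49 for the widest parallel branch, which is one way to read the abstract's statement that the waterfall arrangement keeps large fields-of-view while reusing intermediate results.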

Keywords:

Hand pose estimation; Feature representations; Computer vision

Authors:

Gupta, Divyansh; Artacho, Bruno; Savakis, Andreas

Affiliation:

Rochester Institute of Technology

Publication year:

2022

DOI:

10.1016/j.patcog.2022.108674

Pattern Recognition

Indexed in: EI, SCI

ISSN: 0031-3203

Year, volume (issue): 2022, 128

Citations: 2
References: 40