Reference : Temporal 3D Human Pose Estimation for Action Recognition from Arbitrary Viewpoints
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Computational Sciences
http://hdl.handle.net/10993/41079
Temporal 3D Human Pose Estimation for Action Recognition from Arbitrary Viewpoints
English
Musallam, Mohamed Adel []
Baptista, Renato [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Al Ismaeil, Kassem [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Aouada, Djamila [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Dec-2019
6th Annual Conf. on Computational Science & Computational Intelligence, Las Vegas 5-7 December 2019
Conference Publishing Services
Yes
International
6th Annual Conf. on Computational Science & Computational Intelligence
5-7 December 2019
https://americancse.org/events/csci2019
Las Vegas, Nevada
[en] View-Invariant ; Human Action Recognition ; Human Pose Estimation
[en] This work presents a new view-invariant action recognition system that classifies human actions using a single RGB camera, including from challenging camera viewpoints. Understanding actions from different viewpoints remains an extremely challenging problem due to depth ambiguities, occlusions, and a large variety of appearances and scenes. Moreover, relying only on the 2D perspective yields different interpretations of the same action when seen from different viewpoints. Our system operates in two subsequent stages. The first stage estimates the 2D human pose using a convolutional neural network. In the second stage, the 2D human poses are lifted to 3D using a temporal convolutional neural network that enforces temporal coherence over the estimated 3D poses. The estimated 3D poses from different viewpoints are then aligned to the same camera reference frame. Finally, we propose a temporal convolutional network-based classifier for cross-view action recognition.
Our results show that we achieve state-of-the-art view-invariant action recognition accuracy, even for challenging viewpoints, using only RGB videos and without pre-training on synthetic or motion capture data.
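To make the two-stage pipeline in the abstract concrete, the sketch below shows, in PyTorch, a minimal temporal-convolutional 2D-to-3D lifter followed by a temporal-convolutional action classifier. It is an illustration only, not the authors' implementation: the joint count, channel widths, number of action classes, and layer layout are assumptions, and the 2D pose detector and the cross-view alignment step are omitted.

```python
# Minimal sketch of the described pipeline: 2D pose sequences are lifted to 3D
# with dilated temporal convolutions, then classified by a temporal ConvNet.
# Sizes (17 joints, 10 classes, channel widths) are illustrative assumptions.
import torch
import torch.nn as nn

NUM_JOINTS = 17          # assumed skeleton size
NUM_ACTIONS = 10         # assumed number of action classes

class TemporalLifter(nn.Module):
    """Lifts 2D pose sequences (B, T, J, 2) to 3D (B, T, J, 3) with
    dilated 1D convolutions over time, enforcing temporal coherence."""
    def __init__(self, joints=NUM_JOINTS, channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(joints * 2, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(channels, joints * 3, kernel_size=1),
        )

    def forward(self, pose2d):                               # (B, T, J, 2)
        b, t, j, _ = pose2d.shape
        x = pose2d.reshape(b, t, j * 2).transpose(1, 2)      # (B, J*2, T)
        y = self.net(x).transpose(1, 2)                      # (B, T, J*3)
        return y.reshape(b, t, j, 3)

class ActionTCN(nn.Module):
    """Temporal convolutional classifier over (aligned) 3D pose sequences."""
    def __init__(self, joints=NUM_JOINTS, channels=128, classes=NUM_ACTIONS):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(joints * 3, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                         # pool over time
        )
        self.head = nn.Linear(channels, classes)

    def forward(self, pose3d):                               # (B, T, J, 3)
        b, t, j, _ = pose3d.shape
        x = pose3d.reshape(b, t, j * 3).transpose(1, 2)      # (B, J*3, T)
        return self.head(self.backbone(x).squeeze(-1))       # (B, classes)

# Toy forward pass with random 2D pose sequences standing in for detector output.
pose2d = torch.randn(4, 81, NUM_JOINTS, 2)                   # (batch, frames, joints, xy)
pose3d = TemporalLifter()(pose2d)                            # lifted 3D poses
logits = ActionTCN()(pose3d)                                 # action scores
print(pose3d.shape, logits.shape)                            # (4, 81, 17, 3) and (4, 10)
```

In the paper's setting, the lifted 3D poses from different viewpoints would additionally be aligned to a common camera reference frame before classification; that alignment is deliberately left out of this sketch.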
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SIGCOM
Researchers ; General public
H2020 ; 689947 - STARR - Decision SupporT and self-mAnagement system for stRoke survivoRs
FnR ; FNR10415355 > Bjorn Ottersten > 3D-ACT > 3D Action Recognition Using Refinement and Invariance Strategies for Reliable Surveillance > 01/06/2016 > 31/05/2019 > 2015

File(s) associated to this reference

Fulltext file(s):

File: csci_cameraready_2019.pdf
Version: Author postprint
Size: 5.87 MB
Access: Open access

