Aouada, Djamila [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Ottersten, Björn [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Feb-2019
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Prague, 25-27 February 2019
Yes
International
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications.
from 25-02-2019 to 27-02-2019
Prague
[en] View-invariant ; Human action recognition ; monocular camera ; pose estimation
[en] View-invariant action recognition using a single RGB camera is a very challenging topic because RGB images lack 3D information. Recent advances in deep learning have made it possible to extract a 3D skeleton from a single RGB image.
Taking advantage of this progress, we propose a simple framework for fast, view-invariant action recognition using a single RGB camera. The proposed pipeline consists of two key steps. The first estimates a 3D skeleton from a single RGB image using a CNN-based pose estimator such as VNect. The second computes view-invariant skeleton-based features from the estimated 3D skeletons. Experiments are conducted on two well-known benchmarks, the IXMAS and Northwestern-UCLA datasets. The obtained results demonstrate the validity of our concept, suggesting a new way to address the challenge of RGB-based view-invariant action recognition.
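The second step of the pipeline can be illustrated with a minimal sketch. The abstract does not say which view-invariant features are used, so the example below substitutes pairwise joint distances, a common descriptor that is unchanged by camera rotation and translation. The toy skeleton, the helper names `pairwise_distances` and `rotate_about_y`, and the 60-degree viewpoint change are all assumptions made for illustration, and the 3D joints are taken as already estimated rather than produced by VNect.

```python
import math
from itertools import combinations


def pairwise_distances(skeleton):
    """Return the Euclidean distance between every pair of 3D joints.

    Distances between joints do not depend on where the camera is,
    so this vector is invariant to rotation and translation of the view.
    """
    return [math.dist(a, b) for a, b in combinations(skeleton, 2)]


def rotate_about_y(skeleton, angle_rad):
    """Simulate a change of viewpoint by rotating joints about the vertical axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in skeleton]


# Hypothetical 5-joint skeleton (head, neck, two hands, hip), in metres.
skeleton = [
    (0.0, 1.7, 0.0),
    (0.0, 1.4, 0.0),
    (0.3, 1.1, 0.1),
    (-0.3, 1.1, -0.1),
    (0.0, 0.9, 0.0),
]

features = pairwise_distances(skeleton)

# The same pose seen from a viewpoint rotated by 60 degrees.
rotated = rotate_about_y(skeleton, math.radians(60))
features_rotated = pairwise_distances(rotated)

# The two feature vectors agree up to floating-point error.
same = all(math.isclose(a, b, abs_tol=1e-9)
           for a, b in zip(features, features_rotated))
print(len(features), same)
```

In a full pipeline, a descriptor of this kind would be computed per frame from the estimated skeletons and passed to a classifier; the choice of descriptor and classifier here is left open because the abstract does not specify them.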