DeepVI: A Novel Framework for Learning Deep View-Invariant Human Action Representations using a Single RGB Camera

PAPADOPOULOS, Konstantinos; GHORBEL, Enjie; OYEDOTUN, Oyebade; AOUADA, Djamila; OTTERSTEN, Björn

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2020 • In IEEE International Conference on Automatic Face and Gesture Recognition, Buenos Aires 18-22 May 2020

Peer reviewed

Permalink
https://hdl.handle.net/10993/42471

Files

Full Text

FG2020_camera_ready.pdf

Author preprint (596.3 kB)

All documents in ORBilu are protected by a user license.

Details

Disciplines :

Computer science

Author, co-author :

PAPADOPOULOS, Konstantinos ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

GHORBEL, Enjie ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

OYEDOTUN, Oyebade ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

OTTERSTEN, Björn ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

Language :

English

Title :

DeepVI: A Novel Framework for Learning Deep View-Invariant Human Action Representations using a Single RGB Camera

Publication date :

2020

Event name :

IEEE International Conference on Automatic Face and Gesture Recognition

Event date :

from 18-05-2020 to 22-05-2020

Main work title :

IEEE International Conference on Automatic Face and Gesture Recognition, Buenos Aires 18-22 May 2020

Peer reviewed :

Peer reviewed

FnR Project :

FNR10415355 - 3D Action Recognition Using Refinement And Invariance Strategies For Reliable Surveillance, 2015 (01/06/2016-31/05/2019) - Björn Ottersten

Available on ORBilu :

since 11 February 2020
