References of "Aouada, Djamila 50000437"
Full Text
Peer Reviewed
Towards Generalization of 3D Human Pose Estimation In The Wild
Baptista, Renato UL; Saint, Alexandre Fabian A UL; Al Ismaeil, Kassem UL et al

in International Conference on Pattern Recognition (ICPR) Workshop on 3D Human Understanding, Milan 10-15 January 2021 (2020)

In this paper, we propose 3DBodyTex.Pose, a dataset that addresses the task of 3D human pose estimation in-the-wild. Generalization to in-the-wild images remains limited due to the lack of adequate datasets. Existing ones are usually collected in controlled indoor environments where motion capture systems are used to obtain the 3D ground-truth annotations of humans. 3DBodyTex.Pose offers high-quality and rich data containing 405 different real subjects in various clothing and poses, and 81k image samples with ground-truth 2D and 3D pose annotations. These images are generated from 200 viewpoints, including 70 challenging extreme viewpoints. The data was created starting from high-resolution textured 3D body scans and by incorporating various realistic backgrounds. Retraining a state-of-the-art 3D pose estimation approach using data augmented with 3DBodyTex.Pose showed promising improvement in the overall performance, and a noticeable decrease in the per-joint position error when testing on challenging viewpoints. 3DBodyTex.Pose is expected to offer the research community new possibilities for generalizing 3D pose estimation from monocular in-the-wild images.

Full Text
Peer Reviewed
DeepVI: A Novel Framework for Learning Deep View-Invariant Human Action Representations using a Single RGB Camera
Papadopoulos, Konstantinos UL; Ghorbel, Enjie UL; Oyedotun, Oyebade UL et al

in IEEE International Conference on Automatic Face and Gesture Recognition, Buenos Aires 18-22 May 2020 (2020)

Full Text
Peer Reviewed
Temporal 3D Human Pose Estimation for Action Recognition from Arbitrary Viewpoints
Adel Musallam, Mohamed; Baptista, Renato UL; Al Ismaeil, Kassem UL et al

in 6th Annual Conf. on Computational Science & Computational Intelligence, Las Vegas 5-7 December 2019 (2019, December)

This work presents a new view-invariant action recognition system that is able to classify human actions using a single RGB camera, including from challenging camera viewpoints. Understanding actions from different viewpoints remains an extremely challenging problem, due to depth ambiguities, occlusion, and a large variety of appearances and scenes. Moreover, using only the information from the 2D perspective gives different interpretations for the same action seen from different viewpoints. Our system operates in two consecutive stages. The first stage estimates the 2D human pose using a convolutional neural network. In the next stage, the 2D human poses are lifted to 3D human poses, using a temporal convolutional neural network that enforces temporal coherence over the estimated 3D poses. The estimated 3D poses from different viewpoints are then aligned to the same camera reference frame. Finally, we propose to use a temporal convolutional network-based classifier for cross-view action recognition. Our results show that we can achieve state-of-the-art view-invariant action recognition accuracy even for the challenging viewpoints by only using RGB videos, without pre-training on synthetic or motion capture data.

Full Text
Peer Reviewed
BODYFITR: Robust Automatic 3D Human Body Fitting
Saint, Alexandre Fabian A UL; Shabayek, Abd El Rahman UL; Cherenkova, Kseniya UL et al

in Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP) (2019, September 22)

This paper proposes BODYFITR, a fully automatic method to fit a human body model to static 3D scans with complex poses. Automatic and reliable 3D human body fitting is necessary for many applications related to healthcare, digital ergonomics, avatar creation and security, especially in industrial contexts for large-scale product design. Existing works either make prior assumptions on the pose, require manual annotation of the data or have difficulty handling complex poses. This work addresses these limitations by providing a novel automatic fitting pipeline with carefully integrated building blocks designed for a systematic and robust approach. It is validated on the 3DBodyTex dataset, with hundreds of high-quality 3D body scans, and shown to outperform prior works in static body pose and shape estimation, qualitatively and quantitatively. The method is also applied to the creation of realistic 3D avatars from the high-quality texture scans of 3DBodyTex, further demonstrating its capabilities.

Full Text
Peer Reviewed
VIEW-INVARIANT ACTION RECOGNITION FROM RGB DATA VIA 3D POSE ESTIMATION
Baptista, Renato UL; Ghorbel, Enjie UL; Papadopoulos, Konstantinos UL et al

in IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, UK, 12–17 May 2019 (2019, May)

In this paper, we propose a novel view-invariant action recognition method using a single monocular RGB camera. View-invariance remains a very challenging topic in 2D action recognition due to the lack of 3D information in RGB images. Most successful approaches make use of the concept of knowledge transfer by projecting 3D synthetic data to multiple viewpoints. Instead of relying on knowledge transfer, we propose to augment the RGB data with a third dimension by means of 3D skeleton estimation from 2D images using a CNN-based pose estimator. In order to ensure view-invariance, a pre-processing alignment step is applied, followed by data expansion as a means of denoising. Finally, a Long Short-Term Memory (LSTM) architecture is used to model the temporal dependency between skeletons. The proposed network is trained to directly recognize actions from aligned 3D skeletons. The experiments performed on the challenging Northwestern-UCLA dataset show the superiority of our approach as compared to state-of-the-art ones.

Full Text
Peer Reviewed
A View-invariant Framework for Fast Skeleton-based Action Recognition Using a Single RGB Camera
Ghorbel, Enjie UL; Papadopoulos, Konstantinos UL; Baptista, Renato UL et al

in 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Prague, 25-27 February 2019 (2019, February)

View-invariant action recognition using a single RGB camera represents a very challenging topic due to the lack of 3D information in RGB images. Recent advances in deep learning have made it possible to extract a 3D skeleton from a single RGB image. Taking advantage of this impressive progress, we propose a simple framework for fast and view-invariant action recognition using a single RGB camera. The proposed pipeline combines two key steps. The first step is the estimation of a 3D skeleton from a single RGB image using a CNN-based pose estimator such as VNect. The second step computes view-invariant skeleton-based features from the estimated 3D skeletons. Experiments are conducted on two well-known benchmarks, namely the IXMAS and Northwestern-UCLA datasets. The obtained results prove the validity of our concept, which suggests a new way to address the challenge of RGB-based view-invariant action recognition.

Full Text
Peer Reviewed
Two-stage RGB-based Action Detection using Augmented 3D Poses
Papadopoulos, Konstantinos UL; Ghorbel, Enjie UL; Baptista, Renato UL et al

in 18th International Conference on Computer Analysis of Images and Patterns, Salerno, 3-5 September 2019 (2019)

In this paper, a novel approach for action detection from RGB sequences is proposed. This concept takes advantage of the recent development of CNNs to estimate 3D human poses from a monocular camera. To show the validity of our method, we propose a 3D skeleton-based two-stage action detection approach. For localizing actions in unsegmented sequences, Relative Joint Position (RJP) and Histogram Of Displacements (HOD) are used as inputs to a k-nearest neighbor binary classifier in order to define action segments. Afterwards, to recognize the localized action proposals, a compact Long Short-Term Memory (LSTM) network with a de-noising expansion unit is employed. Compared to previous RGB-based methods, our approach offers robustness to radial motion, view-invariance and low computational complexity. Results on the Online Action Detection dataset show that our method outperforms earlier RGB-based approaches.

Full Text
Peer Reviewed
Localized Trajectories for 2D and 3D Action Recognition
Papadopoulos, Konstantinos UL; Demisse, Girum UL; Ghorbel, Enjie UL et al

in Sensors (2019)

The Dense Trajectories concept is one of the most successful approaches in action recognition, suitable for scenarios involving a significant amount of motion. However, due to noise and background motion, many generated trajectories are irrelevant to the actual human activity and can potentially lead to performance degradation. In this paper, we propose Localized Trajectories as an improved version of Dense Trajectories where motion trajectories are clustered around human body joints provided by RGB-D cameras and then encoded by local Bag-of-Words. As a result, the Localized Trajectories concept provides an advanced discriminative representation of actions. Moreover, we generalize Localized Trajectories to 3D by using the depth modality. One of the main advantages of 3D Localized Trajectories is that they describe radial displacements that are perpendicular to the image plane. Extensive experiments and analysis were carried out on five different datasets.

Full Text
Peer Reviewed
Home Self-Training: Visual Feedback for Assisting Physical Activity for Stroke Survivors
Baptista, Renato UL; Ghorbel, Enjie UL; Shabayek, Abd El Rahman UL et al

in Computer Methods and Programs in Biomedicine (2019)

Background and Objective: With the increase in the number of stroke survivors, there is an urgent need for designing appropriate home-based rehabilitation tools to reduce health-care costs. The objective is to empower the rehabilitation of post-stroke patients in the comfort of their homes by supporting them while exercising without the physical presence of the therapist. Methods: A novel low-cost home-based training system is introduced. This system is designed as a composition of two linked applications: one for the therapist and another one for the patient. The therapist prescribes personalized exercises remotely, monitors the home-based training and re-adapts the exercises if required. On the other side, the patient loads the prescribed exercises, performs them while being guided by color-based visual feedback and gets updates about the exercise performance. To achieve that, our system provides three main functionalities, namely: 1) Feedback proposals guiding a personalized exercise session, 2) Posture monitoring optimizing the effectiveness of the session, 3) Assessment of the quality of the motion. Results: The proposed system is evaluated on 10 healthy participants without any previous contact with the system. To analyze the impact of the feedback proposals, we carried out two different experimental sessions: without and with feedback proposals. The obtained results give a preliminary assessment of the value of using such feedback. Conclusions: The results obtained on 10 healthy participants are promising. This encourages testing the system in a realistic clinical context for the rehabilitation of stroke survivors.

Full Text
Peer Reviewed
Deformation-Based Abnormal Motion Detection using 3D Skeletons
Baptista, Renato UL; Demisse, Girum UL; Aouada, Djamila UL et al

in IEEE International Conference on Image Processing Theory, Tools and Applications (IPTA) (2018, November)

In this paper, we propose a system for abnormal motion detection using 3D skeleton information, where the abnormal motion is not known a priori. To that end, we present a curve-based representation of a sequence, based on a few joints of a 3D skeleton, and a deformation-based distance function. We further introduce a time-variation model that is specifically designed for assessing the quality of a motion; we refer to a distance function that is based on such a model as the motion quality distance. The overall advantages of the proposed approach are 1) a lower-dimensional yet representative sequence representation and 2) a distance function that emphasizes time variation, the motion quality distance, which is a particularly important property for quality assessment. We validate our approach using the publicly available SPHERE-StairCase2014 dataset. Qualitative and quantitative results show promising performance.

Full Text
Peer Reviewed
Highway Network Block with Gates Constraints for Training Very Deep Networks
Oyedotun, Oyebade UL; Shabayek, Abd El Rahman UL; Aouada, Djamila UL et al

in 2018 IEEE International Conference on Computer Vision and Pattern Recognition Workshop, June 18-22, 2018 (2018, June 19)

In this paper, we propose to reformulate the learning of the highway network block to realize both early optimization and improved generalization of very deep networks while preserving the network depth. Gate constraints are duly employed to improve optimization, latent representations and parameterization usage in order to efficiently learn hierarchical feature transformations which are crucial for the success of any deep network. One of the earliest very deep models with over 30 layers that was successfully trained relied on highway network blocks. Although highway blocks suffice for alleviating the optimization problem via improved information flow, we show for the first time that, later in training, such highway blocks may result in learning mostly untransformed features and therefore a reduction in the effective depth of the model; this could negatively impact model generalization performance. Using the proposed approach, 15-layer and 20-layer models are successfully trained with one gate and a 32-layer model using three gates. This leads to a drastic reduction of model parameters as compared to the original highway network. Extensive experiments on CIFAR-10, CIFAR-100, Fashion-MNIST and USPS datasets are performed to validate the effectiveness of the proposed approach. Particularly, we outperform the original highway network and many state-of-the-art results. To the best of our knowledge, on the Fashion-MNIST and USPS datasets, the achieved results are the best reported in the literature.

Full Text
Peer Reviewed
Pose Encoding for Robust Skeleton-Based Action Recognition
Demisse, Girum UL; Papadopoulos, Konstantinos UL; Aouada, Djamila UL et al

in CVPRW: Visual Understanding of Humans in Crowd Scene, Salt Lake City, Utah, June 18-22, 2018 (2018, June 18)

Some of the main challenges in skeleton-based action recognition systems are redundant and noisy pose transformations. Earlier works in skeleton-based action recognition explored different approaches for filtering linear noise transformations, but neglected to address potential nonlinear transformations. In this paper, we present an unsupervised learning approach for estimating nonlinear noise transformations in pose estimates. Our approach starts by decoupling linear and nonlinear noise transformations. While the linear transformations are modelled explicitly, the nonlinear transformations are learned from data. Subsequently, we use an autoencoder with L2-norm reconstruction error and show that it does capture nonlinear noise transformations and recovers a denoised pose estimate, which in turn improves performance significantly. We validate our approach on a publicly available dataset, NW-UCLA.

Full Text
Peer Reviewed
Key-Skeleton Based Feedback Tool for Assisting Physical Activity
Baptista, Renato UL; Ghorbel, Enjie UL; Shabayek, Abd El Rahman UL et al

in 2018 Zooming Innovation in Consumer Electronics International Conference (ZINC), 30-31 May 2018 (2018, May 31)

This paper presents an intuitive feedback tool able to implicitly guide motion with respect to a reference movement. Such a tool is important in multiple applications that require assistance with physical activities, such as sports or rehabilitation. Our proposed approach is based on detecting key skeleton frames from a reference sequence of skeletons. The feedback is based on the 3D geometry analysis of the skeletons by taking into account the key-skeletons. Finally, the feedback is illustrated by a color-coded tool, which reflects the motion accuracy.

Full Text
Peer Reviewed
IMPROVING THE CAPACITY OF VERY DEEP NETWORKS WITH MAXOUT UNITS
Oyedotun, Oyebade UL; Shabayek, Abd El Rahman UL; Aouada, Djamila UL et al

in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (2018, February 21)

Deep neural networks inherently have large representational power for approximating complex target functions. However, models based on rectified linear units can suffer a reduction in representation capacity due to dead units. Moreover, approximating very deep networks trained with dropout at test time can be more inexact due to the several layers of non-linearities. To address the aforementioned problems, we propose to learn the activation functions of hidden units for very deep networks via maxout. However, maxout units increase the model parameters, and therefore the model may suffer from overfitting; we alleviate this problem by employing elastic net regularization. In this paper, we propose very deep networks with maxout units and elastic net regularization and show that the features learned are quite linearly separable. We perform extensive experiments and reach state-of-the-art results on the USPS and MNIST datasets. Particularly, we reach an error rate of 2.19% on the USPS dataset, surpassing the human performance error rate of 2.5% and all previously reported results, including those that employed training data augmentation. On the MNIST dataset, we reach an error rate of 0.36%, which is competitive with the state-of-the-art results.

Full Text
Peer Reviewed
Full 3D Reconstruction of Non-Rigidly Deforming Objects
Afzal, Hassan; Aouada, Djamila UL; Mirbach, Bruno et al

in ACM Transactions on Multimedia Computing, Communications, & Applications (2018)

Full Text
Peer Reviewed
Anticipating Suspicious Actions using a Small Dataset of Action Templates
Baptista, Renato UL; Antunes, Michel; Aouada, Djamila UL et al

in 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP) (2018, January)

In this paper, we propose to detect an action as soon as possible and ideally before it is fully completed. The objective is to support the monitoring of surveillance videos for preventing criminal or terrorist attacks. For such a scenario, it is important to have not only high detection and recognition rates but also low latency for the detection. Our solution consists of an online adaptive sliding-window approach, which efficiently rejects irrelevant data. Furthermore, we exploit both spatial and temporal information by constructing feature vectors based on temporal blocks. For added efficiency, only partial template actions are considered for the detection. The relationship between the template size and latency is experimentally evaluated. We show promising preliminary experimental results using Motion Capture data with a skeleton representation of the human body.

Full Text
Peer Reviewed
A Revisit of Action Detection using Improved Trajectories
Papadopoulos, Konstantinos UL; Antunes, Michel; Aouada, Djamila UL et al

in IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Alberta, Canada, 15–20 April 2018 (2018)

In this paper, we revisit trajectory-based action detection in a potent and non-uniform way. Improved trajectories have been proven to be an effective model for motion description in action recognition. In temporal action localization, however, this approach is not efficiently exploited. Trajectory features extracted from uniform video segments result in significant performance degradation for two reasons: (a) during uniform segmentation, a significant amount of noise is often added to the main action and (b) partial actions can have a negative impact on the classifier's performance. Since uniform video segmentation seems to be insufficient for this task, we propose a two-step supervised non-uniform segmentation, performed in an online manner. Action proposals are generated using either 2D or 3D data, so action classification can be directly performed on them using the standard improved trajectories approach. We experimentally compare our method with other approaches and we show improved performance on a challenging online action detection dataset.

Full Text
Peer Reviewed
3DBodyTex: Textured 3D Body Dataset
Saint, Alexandre Fabian A UL; Ahmed, Eman UL; Shabayek, Abd El Rahman UL et al

in 2018 Sixth International Conference on 3D Vision (3DV 2018) (2018)

In this paper, a dataset, named 3DBodyTex, of static 3D body scans with high-quality texture information is presented along with a fully automatic method for body model fitting to a 3D scan. 3D shape modelling is a fundamental area of computer vision that has a wide range of applications in industry. It is becoming even more important as 3D sensing technologies are entering consumer devices such as smartphones. As the main output of these sensors is the 3D shape, many methods rely on this information alone. The 3D shape information is, however, very high dimensional and leads to models that must handle many degrees of freedom from limited information. Coupling texture and 3D shape alleviates this burden, as the texture of 3D objects is complementary to their shape. Unfortunately, high-quality texture content is lacking in commonly available datasets, and in particular in datasets of 3D body scans. The proposed 3DBodyTex dataset aims to fill this gap with hundreds of high-quality 3D body scans with high-resolution texture. Moreover, a novel fully automatic pipeline to fit a body model to a 3D scan is proposed. It includes a robust 3D landmark estimator that takes advantage of the high-resolution texture of 3DBodyTex. The pipeline is applied to the scans, and the results are reported and discussed, showcasing the diversity of the features in the dataset.

Full Text
Peer Reviewed
Deformation Based 3D Facial Expression Representation
Demisse, Girum UL; Aouada, Djamila UL; Ottersten, Björn UL

in ACM Transactions on Multimedia Computing, Communications, & Applications (2018)

We propose a deformation-based representation for analyzing expressions from 3D faces. A point cloud of a 3D face is decomposed into an ordered deformable set of curves that start from a fixed point. Subsequently, a mapping function is defined to identify the set of curves with an element of a high-dimensional matrix Lie group, specifically the direct product of SE(3). Representing 3D faces as an element of a high-dimensional Lie group has two main advantages. First, using the group structure, facial expressions can be decoupled from a neutral face. Second, an underlying non-linear facial expression manifold can be captured with the Lie group and mapped to a linear space, the Lie algebra of the group. This opens up the possibility of classifying facial expressions with linear models without compromising the underlying manifold. Alternatively, linear combinations of linearised facial expressions can be mapped back from the Lie algebra to the Lie group. The approach is tested on the BU-3DFE and the Bosphorus datasets. The results show that the proposed approach performed comparably, on the BU-3DFE dataset, without using features or extensive landmark points.
