Saint, Alexandre Fabian A in D'APUZZO, Nicola (Ed.) Proceedings of 3DBODY.TECH 2017 - 8th International Conference and Exhibition on 3D Body Scanning and Processing Technologies, Montreal QC, Canada, 11-12 Oct. 2017 (2017, October)
This paper presents a method to automatically recover a realistic and accurate body shape of a person wearing clothing from a 3D scan. In many practical situations, people are scanned wearing clothing. The underlying body shape is thus partially or completely occluded. Yet, it is very desirable to recover the shape of a covered body as it provides a non-invasive means of measuring and analysing it. This is particularly convenient for patients in medical applications, customers in a retail shop, as well as in security applications where suspicious objects under clothing are to be detected. To recover the body shape from the 3D scan of a person in any pose, a human body model is usually fitted to the scan. Current methods rely on the manual placement of markers on the body to identify anatomical locations and guide the pose fitting. The markers are either physically placed on the body before scanning or placed in software as a postprocessing step. Some other methods detect key points on the scan using 3D feature descriptors to automate the placement of markers. They usually require a large database of 3D scans. We propose to automatically estimate the body pose of a person from a 3D mesh acquired by standard 3D body scanners, with or without texture. To fit a human model to the scan, we use joint locations as anchors. These are detected from multiple 2D views using a conventional body joint detector working on images.
In contrast to existing approaches, the proposed method is fully automatic and takes advantage of the robustness of state-of-the-art 2D joint detectors. The proposed approach is validated on scans of people in different poses wearing garments of various thicknesses and on scans of one person in multiple poses with known ground truth wearing close-fitting clothing.

Oyedotun, Oyebade in 2017 IEEE International Conference on Computer Vision Workshop (ICCVW) (2017, August 21)
Humans use facial expressions successfully for conveying their emotional states. However, replicating such success in the human-computer interaction domain is an active research problem. In this paper, we propose a deep convolutional neural network (DCNN) for joint learning of robust facial expression features from fused RGB and depth map latent representations. We posit that learning jointly from both modalities results in a more robust classifier for facial expression recognition (FER) as opposed to learning from either of the modalities independently. Particularly, we construct a learning pipeline that allows us to learn several hierarchical levels of feature representations and then perform the fusion of RGB and depth map latent representations for joint learning of facial expressions. Our experimental results on the BU-3DFE dataset validate the proposed fusion approach, as a model learned from the joint modalities outperforms models learned from either of the modalities.

Oyedotun, Oyebade in 24th International Conference on Neural Information Processing, Guangzhou, China, November 14–18, 2017 (2017, July 31)
Many works have posited the benefit of depth in deep networks. However, one of the problems encountered in the training of very deep networks is feature reuse; that is, features are ’diluted’ as they are forward propagated through the model. Hence, later network layers receive less informative signals about the input data, consequently making training less effective. In this work, we address the problem of feature reuse by taking inspiration from an earlier work which employed residual learning for alleviating the problem of feature reuse. We propose a modification of residual learning for training very deep networks to realize improved generalization performance; for this, we allow stochastic shortcut connections of identity mappings from the input to hidden layers. We perform extensive experiments using the USPS and MNIST datasets. On the USPS dataset, we achieve an error rate of 2.69% without employing any form of data augmentation (or manipulation). On the MNIST dataset, we reach a comparable state-of-the-art error rate of 0.52%. Particularly, these results are achieved without employing any explicit regularization technique.

; ; Aouada, Djamila in Data Science for Cyber-Security (DSCS), London 25-27 September (2017)

Demisse, Girum in IEEE Transactions on Pattern Analysis and Machine Intelligence (2017)
In this paper, we introduce a deformation based representation space for curved shapes in $\mathbb{R}^n$. Given an ordered set of points sampled from a curved shape, the proposed method represents the set as an element of a finite dimensional matrix Lie group. Variations due to scale and location are filtered in a preprocessing stage, while shapes that vary only in rotation are identified by an equivalence relationship. The use of a finite dimensional matrix Lie group leads to a similarity metric with an explicit geodesic solution. Subsequently, we discuss some of the properties of the metric and its relationship with a deformation by least action. Furthermore, invariance to reparametrization or estimation of point correspondence between shapes is formulated as an estimation of sampling function. Thereafter, two possible approaches are presented to solve the point correspondence estimation problem. Finally, we propose an adaptation of k-means clustering for shape analysis in the proposed representation space. Experimental results show that the proposed representation is robust to uninformative cues, e.g. local shape perturbation and displacement. In comparison to state-of-the-art methods, it achieves a high precision on the Swedish and the Flavia leaf datasets and a comparable result on MPEG-7, Kimia99 and Kimia216 datasets.

Papadopoulos, Konstantinos in IEEE International Conference on Image Processing, Beijing 17-20 September 2017 (2017)
Action recognition using dense trajectories is a popular concept.
However, many spatio-temporal characteristics of the trajectories are lost in the final video representation when using a single Bag-of-Words model. Also, a significant amount of the extracted trajectory features is actually irrelevant to the activity being analyzed, which can considerably degrade the recognition performance. In this paper, we propose a human-tailored trajectory extraction scheme, in which trajectories are clustered using information from the human pose. Two configurations are considered: first, when exact skeleton joint positions are provided, and second, when only an estimate thereof is available. In both cases, the proposed method is further strengthened by using the concept of local Bag-of-Words, where a specific codebook is generated for each skeleton joint group. This has the advantage of adding spatial human pose awareness in the video representation, effectively increasing its discriminative power. We experimentally compare the proposed method with the standard dense trajectories approach on two challenging datasets.

Shabayek, Abd El Rahman in European Project Space on Networks, Systems and Technologies (2017)
This chapter explains a vision based platform developed within a European project on decision support and self-management for stroke survivors. The objective is to provide a low cost home rehabilitation system. Our main concern is to maintain the patient's physical activity while continuously monitoring their physical and emotional state. This is essential for recovering some autonomy in daily life activities and preventing a second damaging stroke.
Post-stroke patients are initially subject to physical therapy under the supervision of a health professional to follow up on their daily physical activity and monitor their emotional state. However, due to social and economic constraints, home based rehabilitation is eventually suggested. Our vision platform paves the way towards having low cost home rehabilitation.

Baptista, Renato in IEEE International Conference on Image Information Processing (ICIIP) (2017)
In this paper, we propose a framework for guiding patients and/or users in how to correct their posture in real-time without requiring a physical or a direct intervention of a therapist or a sports specialist. In order to support posture monitoring and correction, this paper presents a flexible system that continuously evaluates postural defects of the user. If deviations from a correct posture are identified, feedback information is provided in order to guide the user to converge to an appropriate and stable body condition. The core of the proposed approach is the analysis of the motion required for aligning body-parts with respect to postural constraints and pre-specified template skeleton poses. Experimental results in two scenarios (sitting and weight lifting) show the potential of the proposed framework.

Baptista, Renato in 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP) (2017)
In this paper, we explore the concept of providing feedback to a user moving in front of a depth camera so that they are able to replicate a specific template action. This can be used as a home based rehabilitation system for stroke survivors, where the objective is for patients to practice and improve their daily life activities. Patients are guided in how to correctly perform an action by following feedback proposals. These proposals are presented in a human interpretable way. In order to align an action that was performed with the template action, we explore two different approaches, namely, Subsequence Dynamic Time Warping and Temporal Commonality Discovery. The first method aims to find the temporal alignment and the second one discovers the interval of the subsequence that shares similar content, after which standard Dynamic Time Warping can be used for the temporal alignment. Then, feedback proposals can be provided in order to correct the user with respect to the template action. Experimental results show that both methods have similar accuracy rates and that computational time is a decisive factor, with Subsequence Dynamic Time Warping achieving faster results.

Goncalves Almeida Antunes, Michel in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (2017)
The article concerns the automatic calibration of a camera with radial distortion from a single image. It is known that, under the mild assumption of square pixels and zero skew, lines in the scene project into circles in the image, and three lines suffice to calibrate the camera up to an ambiguity between focal length and radial distortion.
The calibration results depend heavily on accurate circle estimation, which is hard to accomplish, because lines tend to project into short circular arcs. To overcome this problem, we show that, given a short circular arc edge, it is possible to robustly determine a line that goes through the center of the corresponding circle. These lines, henceforth called Lines of Circle Centres (LCCs), are used in a new method that detects sets of parallel lines and estimates the calibration parameters, including the center and amount of distortion, focal length, and camera orientation with respect to the Manhattan frame. Extensive experiments in both semi-synthetic and real images show that our algorithm outperforms state-of-the-art approaches in unsupervised calibration from a single image, while providing more information.

Shabayek, Abd El Rahman in IEEE International Conference on Image Processing, Beijing 17-20 September 2017 (2017)

; Aouada, Djamila in IEEE Transactions on Pattern Analysis and Machine Intelligence (2016), 39(10), 2045-2059
We propose a novel approach for enhancing depth videos containing non-rigidly deforming objects. Depth sensors are capable of capturing depth maps in real-time but suffer from high noise levels and low spatial resolutions. While solutions for reconstructing 3D details in static scenes, or scenes with rigid global motions have been recently proposed, handling unconstrained non-rigid deformations in relatively complex scenes remains a challenge.
Our solution consists of a recursive dynamic multi-frame super-resolution algorithm where the relative local 3D motions between consecutive frames are directly accounted for. We rely on the assumption that these 3D motions can be decoupled into lateral motions and radial displacements. This allows us to perform a simple local per-pixel tracking where both depth measurements and deformations are dynamically optimized. The geometric smoothness is subsequently added using a multi-level L1 minimization with a bilateral total variation regularization. The performance of this method is thoroughly evaluated on both real and synthetic data. As compared to alternative approaches, the results show a clear improvement in reconstruction accuracy and in robustness to noise, to relatively large non-rigid deformations, and to topological changes. Moreover, the proposed approach, implemented on a CPU, is shown to be computationally efficient and working in real-time.

Demisse, Girum in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016 (2016, June 26)
In this paper, we introduce a similarity metric for curved shapes that can be described, distinctively, by ordered points. The proposed method represents a given curve as a point in the deformation space, the direct product of rigid transformation matrices, such that the successive action of the matrices on a fixed starting point reconstructs the full curve. In general, both open and closed curves are represented in the deformation space modulo shape orientation and orientation preserving diffeomorphisms.
The use of direct product Lie groups to represent curved shapes led to an explicit formula for geodesic curves and the formulation of a similarity metric between shapes by the $L^{2}$-norm on the Lie algebra. Additionally, invariance to reparametrization or estimation of point correspondence between shapes is performed as an intermediate step for computing geodesics. Furthermore, since there is no computation of differential quantities on the curves, our representation is more robust to local perturbations and needs no pre-smoothing. We compare our method with the elastic shape metric defined through the square root velocity (SRV) mapping, and other shape matching approaches.

; Goncalves Almeida Antunes, Michel in 11th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP) (2016)

Goncalves Almeida Antunes, Michel in European Conference on Computer Vision (ECCV) Workshop on Assistive Computer Vision and Robotics, Amsterdam (2016)
Physical activity is essential for stroke survivors for recovering some autonomy in daily life activities. Post-stroke patients are initially subject to physical therapy under the supervision of a health professional, but due to economic aspects, home based rehabilitation is eventually suggested. In order to support the physical activity of stroke patients at home, this paper presents a system for guiding the user in how to properly perform certain actions and movements. This is achieved by presenting feedback in the form of visual information and human-interpretable messages.
The core of the proposed approach is the analysis of the motion required for aligning body-parts with respect to a template skeleton pose, and how this information can be presented to the user in the form of simple recommendations. Experimental results on three datasets show the potential of the proposed framework.

; Aouada, Djamila in Expert Systems with Applications (2016), 51

Goncalves Almeida Antunes, Michel in IEEE Winter Conference on Applications of Computer Vision (WACV), 2016 (2016)

; Aouada, Djamila in Computer Vision and Image Understanding (2016)
Multi-frame super-resolution is the process of recovering a high resolution image or video from a set of captured low resolution images. Super-resolution approaches have been largely explored in 2-D imaging. However, their extension to depth videos is not straightforward due to the textureless nature of depth data, and to their high frequency contents coupled with fast motion artifacts. Recently, a few attempts have been introduced where only the super-resolution of static depth scenes has been addressed. In this work, we propose to enhance the resolution of dynamic depth videos with non-rigidly moving objects. The proposed approach is based on a new data model that uses densely upsampled, and cumulatively registered versions of the observed low resolution depth frames. We show the impact of upsampling in increasing the sub-pixel accuracy and reducing the rounding error of the motion vectors.
Furthermore, with the proposed cumulative motion estimation, a high registration accuracy is achieved between non-successive upsampled frames with relatively large motions. A statistical performance analysis is derived in terms of mean square error explaining the effect of the number of observed frames and the effect of the super-resolution factor at a given noise level. We evaluate the accuracy of the proposed algorithm theoretically and experimentally as a function of the SR factor and the level of noise contamination. Experimental results on both real and synthetic data show the effectiveness of the proposed algorithm on dynamic depth videos as compared to state-of-the-art methods.

; Aouada, Djamila in IEEE International Conference on Machine Learning and Applications (2015, December)

Correa Bahnsen, Alejandro in Expert Systems with Applications (2015), 42(19), 6609-6619
Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. State-of-the-art example-dependent cost-sensitive techniques only introduce the cost to the algorithm either before or after training, leaving open the potential impact of algorithms that take the real financial example-dependent costs into account during training.
In this paper, we propose an example-dependent cost-sensitive decision tree algorithm, by incorporating the different example-dependent costs into a new cost-based impurity measure and a new cost-based pruning criterion. We then evaluate the proposed method on three databases from three real-world applications: credit card fraud detection, credit scoring and direct marketing. The results show that the proposed algorithm is the best performing method for all databases. Furthermore, when compared against a standard decision tree, our method builds significantly smaller trees in only a fifth of the time, while having a superior performance measured by cost savings, leading to a method that not only has more business-oriented results, but also creates simpler models that are easier to analyze.
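
The abstract above mentions a cost-based impurity measure but does not define it. As an illustration only, the following minimal Python sketch assumes one plausible form: the impurity of a node is the lower of the two total costs obtained by labelling every example in the node negative or positive. The `Example` class, its per-example cost fields, and the functions `cost_if_all_predicted`, `cost_impurity` and `split_gain` are hypothetical names introduced here, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Example:
    """One training example with its own (assumed) misclassification costs."""
    label: int            # 1 = positive, 0 = negative
    cost_fp: float        # cost of predicting positive when the label is negative
    cost_fn: float        # cost of predicting negative when the label is positive
    cost_tp: float = 0.0  # cost of a correct positive prediction
    cost_tn: float = 0.0  # cost of a correct negative prediction


def cost_if_all_predicted(examples: Sequence[Example], prediction: int) -> float:
    """Total cost of assigning the same prediction to every example."""
    total = 0.0
    for ex in examples:
        if prediction == 1:
            total += ex.cost_tp if ex.label == 1 else ex.cost_fp
        else:
            total += ex.cost_fn if ex.label == 1 else ex.cost_tn
    return total


def cost_impurity(examples: Sequence[Example]) -> float:
    """Assumed impurity: cost of the cheaper of the two constant predictions."""
    return min(cost_if_all_predicted(examples, 0),
               cost_if_all_predicted(examples, 1))


def split_gain(parent: Sequence[Example],
               left: Sequence[Example],
               right: Sequence[Example]) -> float:
    """Cost saved by splitting the parent node into two children."""
    return cost_impurity(parent) - (cost_impurity(left) + cost_impurity(right))
```

For a node holding one positive example with a miss cost of 100 and two negative examples with false-alarm costs of 2 and 3, this sketch gives an impurity of 5, and separating the positive example from the two negatives yields a gain of 5. Unlike Gini or entropy, such a measure is driven by how expensive the errors are rather than by how many there are.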