Oyedotun, Oyebade. Poster (2022, May 22).

Singh, Inder Pal. In AAAI-22 Workshop Program - Deep Learning on Graphs: Methods and Applications (2022, February).
In this paper, we propose the Improved Multi-Label Graph Convolutional Network (IML-GCN) as a precise and efficient framework for multi-label image classification. Although previous approaches have shown great performance, they usually rely on very large architectures. To address this, we propose to combine the small version of a newly introduced network called TResNet with an extended version of the Multi-Label Graph Convolutional Network (ML-GCN), thereby ensuring that label correlations are learned while the size of the overall network is reduced. The proposed approach uses a novel image feature embedding instead of word embeddings; the latter are learned from words rather than images, making them ill-suited to multi-label image classification. Experimental results show that our framework competes with the state of the art on two multi-label image benchmarks in terms of both precision and memory requirements.

Singh, Inder Pal. In IEEE International Conference on Image Processing (2022).
In this paper, a novel graph-based approach for multi-label image classification, called the Multi-Label Adaptive Graph Convolutional Network (ML-AGCN), is introduced.
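The ML-GCN family of methods summarized in the IML-GCN abstract above shares one mechanism: a graph convolution is run over nodes that represent the labels, and the propagated node embeddings act as per-label classifiers applied to the image feature vector. A minimal sketch of that classifier head, with a toy fixed adjacency and hypothetical dimensions (the papers' actual embeddings, backbones, and normalization are not reproduced here):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def relu(M):
    return [[max(0.0, x) for x in row] for row in M]

def gcn_label_head(image_feat, label_emb, adj, W):
    """One GCN layer over label nodes, then dot each propagated
    node embedding with the image feature to get per-label logits."""
    # Row-normalize the label graph so propagation averages neighbors.
    norm_adj = [[a / max(sum(row), 1e-9) for a in row] for row in adj]
    H = relu(matmul(matmul(norm_adj, label_emb), W))  # (labels, d)
    # Each node embedding scores the image feature vector.
    return [sum(h * f for h, f in zip(row, image_feat)) for row in H]

# Toy example: 3 labels, 2-d embeddings/features (hypothetical numbers).
adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
label_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weight, kept trivial for clarity
image_feat = [0.5, -0.2]
logits = gcn_label_head(image_feat, label_emb, adj, W)
print(logits)  # approximately [0.15, 0.2, 0.05], one logit per label
```

A sigmoid over each logit would give the per-label probabilities used in multi-label classification.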
Graph-based methods have shown great potential in the field of multi-label classification. However, these approaches heuristically fix the graph topology used to model label dependencies, which may not be optimal. To address this, we propose to learn the topology in an end-to-end manner. Specifically, we incorporate an attention-based mechanism for estimating the pairwise importance between graph nodes and a similarity-based mechanism for preserving the feature similarity between different nodes. This offers a more flexible way of adaptively modeling the graph. Experimental results are reported on two well-known datasets, namely MS-COCO and VG-500. The results show that ML-AGCN outperforms state-of-the-art methods while reducing the number of model parameters.

Oyedotun, Oyebade. In IEEE Transactions on Neural Networks and Learning Systems (2021).

Oyedotun, Oyebade. In Neurocomputing (2021).

Oyedotun, Oyebade. Poster (2021).

Oyedotun, Oyebade. Poster (2020, November 18).

Oyedotun, Oyebade. In IEEE Access (2020).

Oyedotun, Oyebade. Doctoral thesis (2020).
Learning-based approaches have recently become popular for various computer vision tasks such as facial expression recognition, action recognition, banknote identification, image captioning, and medical image segmentation. The learning-based approach allows the constructed model to learn features, which results in high performance.
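The adaptive topology described in the ML-AGCN abstract above combines two terms: an attention-style score for the pairwise importance of nodes and a similarity term over node features. A hedged sketch of how such an adaptive adjacency could be assembled (the paper's exact parameterization is not reproduced; the `att_w` weighting, the cosine-similarity choice, and the `mix` blend are illustrative assumptions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1e-9
    nv = math.sqrt(sum(b * b for b in v)) or 1e-9
    return dot / (nu * nv)

def adaptive_adjacency(node_feats, att_w, mix=0.5):
    """Blend an attention-style pairwise score with feature similarity
    to form a learned adjacency, softmax-normalized per row."""
    n = len(node_feats)
    adj = []
    for i in range(n):
        row = []
        for j in range(n):
            # Attention-style score: a learned weighting of the node pair.
            att = sum(w * (a + b) for w, a, b in
                      zip(att_w, node_feats[i], node_feats[j]))
            sim = cosine(node_feats[i], node_feats[j])
            row.append(mix * att + (1.0 - mix) * sim)
        # Softmax over each row so outgoing edge weights sum to 1.
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        s = sum(exps)
        adj.append([e / s for e in exps])
    return adj

feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = adaptive_adjacency(feats, att_w=[0.1, 0.2])
print(A)  # 3x3 adjacency; each row sums to 1
```

In an end-to-end setting, `att_w` (and any projection producing `node_feats`) would be trained jointly with the rest of the network, which is what lets the topology adapt rather than stay heuristically fixed.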
Recently, the backbone of most learning-based approaches has been deep neural networks (DNNs). Importantly, it is believed that increasing the depth of DNNs invariably leads to improved generalization performance. Thus, many state-of-the-art DNNs have over 30 layers of feature representations; in fact, it is not uncommon to find DNNs with over 100 layers in the literature. However, training very deep DNNs with over 15 layers is not trivial. On one hand, such networks generally suffer from optimization problems. On the other hand, they are often overparameterized, so that they overfit the training data and hence incur a generalization loss. Moreover, overparameterized DNNs are impractical for applications that require low latency, small Graphics Processing Unit (GPU) memory for operation, and small memory for storage. Interestingly, skip connections of various forms have been shown to alleviate the difficulty of optimizing very deep DNNs. In this thesis, we propose to improve the optimization and generalization of very deep DNNs, with and without skip connections, by reformulating their training schemes. Specifically, the different modifications proposed allow the DNNs to achieve state-of-the-art results on several benchmark datasets. The second part of the thesis presents theoretical analyses of DNNs with and without skip connections, based on several concepts from linear algebra and random matrix theory. The theoretical results obtained provide new insights into why DNNs with skip connections are easy to optimize and generalize better than DNNs without them. Ultimately, the theoretical results are shown to agree with practical DNNs via extensive experiments. The third part of the thesis addresses the problem of compressing large DNNs into smaller models. Following the identified drawbacks of the conventional group LASSO for compressing large DNNs, the debiased elastic group least absolute shrinkage and selection operator (DEGL) is employed.
Furthermore, layer-wise subspace learning (SL) of latent representations in large DNNs is proposed; the objective of SL is to learn a compressed latent space for large DNNs. In addition, it is observed that SL improves the performance of LASSO, which is widely known not to work well for compressing large DNNs. Extensive experiments are reported to validate the effectiveness of the different model compression approaches proposed in this thesis. Finally, the thesis addresses the problem of multimodal learning using DNNs, where data from different modalities are combined into useful representations for improved learning results. Several multimodal learning frameworks are applied to the problems of facial expression and object recognition. We show that, under the right scenarios, the complementary information from multimodal data leads to better model performance.

Oyedotun, Oyebade. In Applied Intelligence (2020).

Oyedotun, Oyebade. In IEEE International Conference on Image Processing (ICIP 2020), Abu Dhabi, UAE, Oct 25–28, 2020 (2020, May 30).

Oyedotun, Oyebade. In IEEE 2020 Winter Conference on Applications of Computer Vision (WACV 20), Aspen, Colorado, US, March 2–5, 2020 (2020, March 01).

Papadopoulos, Konstantinos. In IEEE International Conference on Automatic Face and Gesture Recognition, Buenos Aires, 18-22 May 2020 (2020).

Oyedotun, Oyebade. Poster (2019, May 14).

Oyedotun, Oyebade. In 2018 IEEE International Conference on Computer Vision and Pattern Recognition Workshop, June 18-22, 2018 (2018, June 19).
In this paper, we propose to reformulate the learning of the highway network block to realize both early optimization and improved generalization of very deep networks while preserving the network depth. Gate constraints are duly employed to improve optimization, latent representations, and parameterization usage in order to efficiently learn the hierarchical feature transformations that are crucial for the success of any deep network. One of the earliest very deep models with over 30 layers that was successfully trained relied on highway network blocks. Although highway blocks suffice for alleviating the optimization problem via improved information flow, we show for the first time that, further into training, such highway blocks may end up learning mostly untransformed features and therefore reduce the effective depth of the model; this can negatively impact model generalization performance. Using the proposed approach, 15-layer and 20-layer models are successfully trained with one gate, and a 32-layer model with three gates. This leads to a drastic reduction in model parameters compared to the original highway network. Extensive experiments on the CIFAR-10, CIFAR-100, Fashion-MNIST and USPS datasets are performed to validate the effectiveness of the proposed approach. In particular, we outperform the original highway network and many state-of-the-art results. To the best of our knowledge, the results achieved on the Fashion-MNIST and USPS datasets are the best reported in the literature.

Oyedotun, Oyebade. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (2018, February 21).
Deep neural networks inherently have large representational power for approximating complex target functions.
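The highway reformulation summarized in the CVPR Workshop abstract above is built around gated blocks of the form y = T(x)·H(x) + (1 − T(x))·x, where the carry path passes the input through untransformed; the paper's observation is that the transform gate T(x) can drift toward zero during training, shrinking the model's effective depth. A sketch of a single scalar highway unit with a hypothetical lower bound on the transform gate (the paper's actual gate constraints are not reproduced here):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def highway_unit(x, w_h, b_h, w_t, b_t, gate_floor=0.1):
    """One scalar highway unit: blend a transformed signal H(x)
    with the untransformed input x via a transform gate T(x)."""
    h = math.tanh(w_h * x + b_h)  # transformed feature H(x)
    t = sigmoid(w_t * x + b_t)    # transform gate T(x) in (0, 1)
    # Illustrative constraint: keep the gate from shutting off the
    # transform entirely, so some feature transformation survives.
    t = max(t, gate_floor)
    return t * h + (1.0 - t) * x  # carry gate is 1 - T(x)

# With a strongly negative gate pre-activation, the unit mostly
# carries x through, but the floored gate still mixes in some H(x).
y = highway_unit(0.5, w_h=1.0, b_h=0.0, w_t=-4.0, b_t=0.0)
print(y)
```

In a real layer the same gating is applied elementwise to vectors; the floor here simply illustrates one way a constraint could prevent the block from collapsing into a pure identity mapping.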
However, models based on rectified linear units can suffer a reduction in representation capacity due to dead units. Moreover, approximating very deep networks trained with dropout at test time can be more inexact due to the several layers of non-linearities. To address the aforementioned problems, we propose to learn the activation functions of hidden units in very deep networks via maxout. However, maxout units increase the number of model parameters, and the model may therefore suffer from overfitting; we alleviate this problem by employing elastic net regularization. In this paper, we propose very deep networks with maxout units and elastic net regularization and show that the features learned are quite linearly separable. We perform extensive experiments and reach state-of-the-art results on the USPS and MNIST datasets. In particular, we reach an error rate of 2.19% on the USPS dataset, surpassing the human performance error rate of 2.5% and all previously reported results, including those that employed training data augmentation. On the MNIST dataset, we reach an error rate of 0.36%, which is competitive with the state-of-the-art results.

Oyedotun, Oyebade. In 2017 IEEE International Conference on Computer Vision Workshop (ICCVW) (2017, August 21).
Humans use facial expressions successfully to convey their emotional states. However, replicating such success in the human-computer interaction domain is an active research problem.
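A maxout unit, as used in the ICASSP 2018 paper above, replaces a fixed nonlinearity with the maximum over several learned affine pieces, so the unit's output keeps varying with the input even where a ReLU would be dead. A minimal sketch (the weights are illustrative, not from the paper):

```python
def maxout_unit(x, pieces):
    """Maxout activation: the max over k learned affine pieces.
    `pieces` is a list of (weights, bias) pairs; each piece is an
    affine map of the input vector x."""
    return max(sum(w * xi for w, xi in zip(ws, x)) + b for ws, b in pieces)

# Two pieces over a 2-d input (hypothetical weights). Wherever one
# piece goes negative, the other takes over, so no region of the
# input space leaves the unit with zero gradient.
pieces = [([1.0, -1.0], 0.0), ([-1.0, 1.0], 0.0)]
print(maxout_unit([2.0, 0.5], pieces))   # max(1.5, -1.5) = 1.5
print(maxout_unit([-2.0, 0.5], pieces))  # max(-2.5, 2.5) = 2.5
```

The cost is k weight vectors per unit instead of one, which is exactly the parameter growth the paper counters with elastic net regularization.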
In this paper, we propose a deep convolutional neural network (DCNN) for the joint learning of robust facial expression features from fused RGB and depth map latent representations. We posit that learning jointly from both modalities results in a more robust classifier for facial expression recognition (FER) than learning from either modality independently. In particular, we construct a learning pipeline that allows us to learn several hierarchical levels of feature representations and then fuse the RGB and depth map latent representations for the joint learning of facial expressions. Our experimental results on the BU-3DFE dataset validate the proposed fusion approach: a model learned from the joint modalities outperforms models learned from either modality alone.

Oyedotun, Oyebade. In IEEE Transactions on Neural Networks and Learning Systems (2017).
Artificial neural networks (ANNs) aim to simulate biological neural activities. Interestingly, many ‘engineering’ prospects in ANNs have relied on motivations from cognition and psychology studies. So far, two important learning theories that have been the subject of active research are the prototype and adaptive learning theories. The learning rules employed for ANNs can be related to adaptive learning theory, where several examples of the different classes in a task are supplied to the network for adjusting internal parameters. Conversely, prototype learning theory uses prototypes (representative examples), usually one prototype per class of the different classes contained in the task. These prototypes are supplied for systematic matching with new examples so that class association can be achieved.
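Prototype learning as described just above (one representative example per class, systematically matched against new inputs) can be sketched as a nearest-prototype classifier. This illustrates the learning theory the PI-EmNN abstract builds on, not the paper's network itself; the prototypes and labels below are hypothetical:

```python
import math

def nearest_prototype(x, prototypes):
    """Classify x by the label of the closest class prototype
    (Euclidean distance), one prototype per class."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(prototypes, key=lambda label: dist(x, prototypes[label]))

# Hypothetical 2-d prototypes for two hand-gesture classes.
prototypes = {"open_hand": [1.0, 1.0], "fist": [-1.0, -1.0]}
print(nearest_prototype([0.8, 0.6], prototypes))   # "open_hand"
print(nearest_prototype([-0.9, -0.2], prototypes)) # "fist"
```

Adaptive learning, by contrast, would adjust internal weights over many labeled examples; unifying the two is the stated contribution of the PI-EmNN model.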
In this paper, we propose and implement a novel neural network algorithm, based on modifying the emotional neural network (EmNN) model, that unifies the prototype and adaptive learning theories. We refer to our new model as the Prototype-Incorporated Emotional Neural Network (PI-EmNN). Furthermore, we apply the proposed model to two challenging real-life tasks, namely static hand gesture recognition and face recognition, and compare the results to those obtained using the popular back-propagation neural network (BPNN), the emotional back-propagation neural network (EmNN), deep networks, and an exemplar classification model, k-nearest neighbors (k-NN).

Oyedotun, Oyebade. In 24th International Conference on Neural Information Processing, Guangzhou, China, November 14–18, 2017 (2017, July 31).
Many works have posited the benefit of depth in deep networks. However, one of the problems encountered in the training of very deep networks is feature reuse: features are ‘diluted’ as they are forward-propagated through the model. Hence, later network layers receive less informative signals about the input data, making training less effective. In this work, we address the problem of feature reuse by taking inspiration from an earlier work that employed residual learning to alleviate it. We propose a modification of residual learning for training very deep networks that realizes improved generalization performance; to this end, we allow stochastic shortcut connections of identity mappings from the input to the hidden layers. We perform extensive experiments using the USPS and MNIST datasets.
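The ICONIP 2017 abstract above describes stochastically activating identity shortcuts from the network input to the hidden layers during training, so that later layers keep receiving an undiluted signal about the input. One way such a stochastic shortcut could look, as a sketch only (the drop probability, layer form, and seeding are illustrative assumptions, not the paper's configuration):

```python
import random

def layer(h, w, b):
    """A hypothetical hidden layer: elementwise affine map + ReLU."""
    return [max(0.0, w * v + b) for v in h]

def forward(x, layer_params, shortcut_prob=0.5, rng=None):
    """Forward pass where each hidden layer may receive an identity
    shortcut from the network input, activated at random in training."""
    rng = rng or random.Random(0)  # seeded here only for reproducibility
    h = x
    for w, b in layer_params:
        h = layer(h, w, b)
        if rng.random() < shortcut_prob:
            # Stochastic identity shortcut: re-inject the raw input so
            # this layer's output carries an undiluted copy of x.
            h = [hv + xv for hv, xv in zip(h, x)]
    return h

out = forward([1.0, -0.5], [(1.0, 0.0), (0.5, 0.1)], shortcut_prob=0.9)
print(out)  # both shortcuts fire with this seed
```

At test time the shortcuts would be handled deterministically, analogous to how dropout is rescaled at inference.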
On the USPS dataset, we achieve an error rate of 2.69% without employing any form of data augmentation (or manipulation). On the MNIST dataset, we reach an error rate of 0.52%, comparable to the state of the art. Notably, these results are achieved without employing any explicit regularization technique.

Shabayek, Abd El Rahman. In European Project Space on Networks, Systems and Technologies (2017).
This chapter describes a vision-based platform developed within a European project on decision support and self-management for stroke survivors. The objective is to provide a low-cost home rehabilitation system. Our main concern is to maintain the patient's physical activity while continuously monitoring his physical and emotional state. This is essential for recovering some autonomy in daily-life activities and preventing a second, damaging stroke. Post-stroke patients are initially subject to physical therapy under the supervision of a health professional, who follows up on their daily physical activity and monitors their emotional state. However, due to social and economic constraints, home-based rehabilitation is eventually suggested. Our vision platform paves the way towards low-cost home rehabilitation.