Training Very Deep Networks via Residual Learning with Stochastic Input Shortcut Connections

OYEDOTUN, Oyebade; SHABAYEK, Abd El Rahman; AOUADA, Djamila; OTTERSTEN, Björn

Paper published in a book (Colloquia, congresses, scientific conferences and proceedings)

2017 • In 24th International Conference on Neural Information Processing, Guangzhou, China, November 14–18, 2017

Peer reviewed

Permalink
https://hdl.handle.net/10993/32080

Documents

Full text

typeinst_V19_review_V02.pdf

Author preprint (650.2 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords:

Deep neural networks; residual learning; optimization

Abstract:

Many works have posited the benefit of depth in deep networks. However, one of the problems encountered in the training of very deep networks is feature reuse; that is, features are 'diluted' as they are forward propagated through the model. Hence, later network layers receive less informative signals about the input data, consequently making training less effective. In this work, we address the problem of feature reuse by taking inspiration from an earlier work which employed residual learning to alleviate it. We propose a modification of residual learning for training very deep networks to realize improved generalization performance; for this, we allow stochastic shortcut connections of identity mappings from the input to hidden layers. We perform extensive experiments using the USPS and MNIST datasets. On the USPS dataset, we achieve an error rate of 2.69% without employing any form of data augmentation (or manipulation). On the MNIST dataset, we reach an error rate of 0.52%, comparable to the state of the art. Notably, these results are achieved without employing any explicit regularization technique.
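The abstract only states that identity shortcuts from the input to hidden layers are enabled stochastically; it does not give the sampling scheme or where the input is added. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: every hidden layer has the same width as the input (so no projection is needed), each layer draws an independent Bernoulli variable with a hypothetical probability `p_shortcut`, and the input is added to the pre-activation. The function name and all parameters are illustrative.

```python
import numpy as np


def forward_with_stochastic_input_shortcuts(x, weights, biases, p_shortcut, rng):
    """Forward pass of a ReLU MLP whose hidden layers may receive the raw input.

    x: array of shape (batch, d). Every hidden layer is assumed to have width d,
       so the input can be added through an identity mapping without projection.
    p_shortcut: assumed probability that a given hidden layer gets the shortcut.
    Returns the last hidden activation and the sampled per-layer on/off mask.
    """
    h = x
    # One independent Bernoulli draw per hidden layer (an assumption of this sketch).
    mask = rng.random(len(weights)) < p_shortcut
    for W, b, use_shortcut in zip(weights, biases, mask):
        z = h @ W + b
        if use_shortcut:
            z = z + x  # identity shortcut from the network input
        h = np.maximum(z, 0.0)  # ReLU
    return h, mask


rng = np.random.default_rng(0)
d, depth = 8, 5
weights = [rng.normal(0.0, 0.1, size=(d, d)) for _ in range(depth)]
biases = [np.zeros(d) for _ in range(depth)]
x = rng.normal(size=(4, d))

h, mask = forward_with_stochastic_input_shortcuts(x, weights, biases, 0.5, rng)
print(h.shape, mask)
```

Setting `p_shortcut` to 0 reduces the sketch to a plain feedforward network, and setting it to 1 feeds the input to every hidden layer, which makes the two extremes easy to compare against the stochastic case.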

Research center:

Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SIGCOM

Disciplines:

Computer science

Author, co-author:

OYEDOTUN, Oyebade ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

SHABAYEK, Abd El Rahman ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

OTTERSTEN, Björn ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

External co-authors:

yes

Document language:

English

Title:

Training Very Deep Networks via Residual Learning with Stochastic Input Shortcut Connections

Publication/release date:

31 July 2017

Event name:

24th International Conference on Neural Information Processing, Guangzhou, China, November 14–18, 2017

Event place:

Guangzhou, China

Event date:

November 14–18, 2017

Audience:

International

Main work title:

24th International Conference on Neural Information Processing, Guangzhou, China, November 14–18, 2017

Peer reviewed:

Peer reviewed

Focus Area :

Security, Reliability and Trust

FnR project:

FNR11295431 - Automatic Feature Selection For Visual Recognition, 2016 (01/02/2017-31/01/2021) - Oyebade Oyedotun

Funding body:

This work was funded by the National Research Fund (FNR), Luxembourg, under the project reference R-AGR-0424-05-D/Bjorn Ottersten

Available on ORBilu:

since 05 September 2017

Statistics

Number of views

306 (of which 40 Unilu)

Number of downloads

511 (of which 43 Unilu)
