[en] Convolutional layers, one of the basic building blocks of deep learning architectures, contain numerous trainable filters for feature extraction. These filters operate independently, which can result in distinct filters learning similar weights and extracting similar features. In contrast, competition mechanisms in the brain sharpen the responses of activated neurons, enhancing the contrast and selectivity of individual neurons towards specific stimuli while increasing the diversity of responses across the population of neurons. Inspired by this observation, this paper proposes a novel convolutional layer based on the theory of predictive coding, in which each filter effectively tries to block other filters from responding to the input features that it represents. In this way, filters learn to become more distinct, which increases the diversity of the extracted features. When standard convolutional layers are replaced with the proposed layers, the performance of classification networks not only improves on ImageNet but is also significantly boosted on eight robustness benchmarks, as well as on downstream detection and segmentation tasks. Most notably, the robust accuracy of ResNet50/101/152 increases by 15.9%/20.0%/20.9% under FGSM attack, and by 10.5%/14.7%/15.0% under PGD attack.
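The competition idea described in the abstract can be illustrated with a toy sketch. The PyTorch layer below is a hypothetical, minimal illustration of cross-filter competition via divisive suppression (each filter's response is scaled down in proportion to how strongly the other filters respond at the same location). It is not the paper's predictive-coding layer; the class name CompetitiveConv2d and the squared divisive rule are assumptions made purely for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CompetitiveConv2d(nn.Module):
    # Hypothetical sketch of cross-filter competition, NOT the paper's exact layer.
    # Each filter keeps a share of its response proportional to its relative strength
    # at each spatial location, so filters that duplicate one another suppress each other.
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0, eps=1e-6):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                              padding=padding, bias=False)
        self.eps = eps

    def forward(self, x):
        y = F.relu(self.conv(x))            # raw non-negative filter responses, shape (B, C, H, W)
        total = y.sum(dim=1, keepdim=True)  # pooled response of all filters at each location
        return y * y / (total + self.eps)   # divisive competition: weak, redundant responses are suppressed

# Usage: drop-in replacement for a standard convolution in a classification backbone.
layer = CompetitiveConv2d(3, 64, kernel_size=3, padding=1)
out = layer(torch.randn(2, 3, 32, 32))      # -> torch.Size([2, 64, 32, 32])

In principle any standard nn.Conv2d block could be swapped for such a layer; the paper's actual formulation derives the competition from predictive coding and should be consulted for the exact update rule.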
Disciplines :
Computer science
Author, co-author :
Gao, Bo ; Department of Intelligent Manufacturing and Electrical Engineering, Nanyang Normal University, Nanyang, China ; The Department of Informatics, King's College London, Strand, London, United Kingdom
SPRATLING, Michael ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Behavioural and Cognitive Sciences (DBCS) > Cognitive Science and Assessment ; The Department of Informatics, King's College London, Strand, London, United Kingdom
External co-authors :
yes
Language :
English
Title :
Filter competition results in more robust Convolutional Neural Networks
This research was funded by the special project of Nanyang Normal University, China (Grant Number: 2024ZX033). The authors acknowledge use of the research computing facility at King's College London, CREATE [82], and the Joint Academic Data Science Endeavour (JADE) facility.
Bibliography
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
Simonyan, Karen, Zisserman, Andrew, Very deep convolutional networks for large-scale image recognition. 2014 arXiv:1409.1556.
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
Ren, Shaoqing, He, Kaiming, Girshick, Ross, Sun, Jian, Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 2015, 91–99.
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587.
Ronneberger, Olaf, Fischer, Philipp, Brox, Thomas, U-Net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015, Springer, 234–241.
Jonathan Long, Evan Shelhamer, Trevor Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.
Kumar, Gunupudi Sai Chaitanya, Kumar, Reddi Kiran, Kumar, Kuricheti Parish Venkata, Sai, Nallagatla Raghavendra, Brahmaiah, Madamachi, Deep residual convolutional neural network: an efficient technique for intrusion detection system. Expert Syst. Appl., 238, 2024, 121912.
López-González, Clara I., Gascó, Esther, Barrientos-Espillco, Fredy, Besada-Portas, Eva, Pajares, Gonzalo, Filter pruning for convolutional neural networks in semantic image segmentation. Neural Netw. 169 (2024), 713–732.
Liu, Long, Lin, Bing, Yang, Yong, Moving scene object tracking method based on deep convolutional neural network. Alexandria Eng. J. 86 (2024), 592–602.
Dhar, Mrinal Kanti, Zhang, Taiyu, Patel, Yash, Gopalakrishnan, Sandeep, Yu, Zeyun, FUSegNet: A deep convolutional neural network for foot ulcer segmentation. Biomed. Signal Process. Control, 92, 2024, 106057.
Zheng, Yongbin, Sun, Peng, Ren, Qiang, Xu, Wanying, Zhu, Di, A novel and efficient model pruning method for deep convolutional neural networks by evaluating the direct and indirect effects of filters. Neurocomputing, 569, 2024, 127124.
Zhang, Yueqi, Feng, Lichen, Shan, Hongwei, Yang, Liying, Zhu, Zhangming, An AER-based spiking convolution neural network system for image classification with low latency and high energy efficiency. Neurocomputing, 564, 2024, 126984.
Bosking, William H, Zhang, Yufeng, Schofield, Brett, Fitzpatrick, David, Orientation selectivity and the arrangement of horizontal connections in tree shrew striate cortex. J. Neurosci. 17:6 (1997), 2112–2127.
Douglas, Rodney J., Martin, Kevan A.C., Neuronal circuits of the neocortex. Annu. Rev. Neurosci. 27 (2004), 419–451.
Gilbert, Charles D., Wiesel, Torsten N., Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. J. Neurosci. 9:7 (1989), 2432–2442.
Spratling, Michael W., A review of predictive coding algorithms. Brain Cogn. 112 (2017), 92–97.
Rao, Rajesh P.N., Ballard, Dana H., Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2:1 (1999), 79–87.
Clark, Andy, Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 36:3 (2013), 181–204.
Spratling, Michael W., A hierarchical predictive coding model of object recognition in natural images. Cogn. Comput. 9:2 (2017), 151–167.
Keller, Georg B., Mrsic-Flogel, Thomas D., Predictive processing: A canonical cortical computation. Neuron 100:2 (2018), 424–435.
Kok, Peter, de Lange, Floris P., Predictive coding in sensory cortex. An Introduction to Model-Based Cognitive Neuroscience, 2015, Springer, New York, NY, 221–244.
Han, Kuan, Wen, Haiguang, Zhang, Yizhen, Fu, Di, Culurciello, Eugenio, Liu, Zhongming, Deep predictive coding network with local recurrent processing for object recognition. Adv. Neural Inf. Process. Syst. 31 (2018), 8855–8865.
Wen, Haiguang, Han, Kuan, Shi, Junxing, Zhang, Yizhen, Culurciello, Eugenio, Liu, Zhongming, Deep predictive coding network for object recognition. International Conference on Machine Learning, 2018, PMLR, 5266–5275.
Boutin, Victor, Franciosini, Angelo, Chavane, Frederic, Ruffier, Franck, Perrinet, Laurent, Sparse deep predictive coding captures contour integration capabilities of the early visual system. PLoS Comput. Biol., 17(1), 2021, e1008629.
Krizhevsky, A., Hinton, G., Learning multiple layers of features from tiny images. 2009.
Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, A. Y. Ng, Reading digits in natural images with unsupervised feature learning, in: NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Vol. 2011, 2011, p. 5.
LeCun, Yann, Bottou, Leon, Bengio, Yoshua, Haffner, Patrick, Gradient-based learning applied to document recognition. Proc. IEEE 86:11 (1998), 2278–2324.
Xiao, Han, Rasul, Kashif, Vollgraf, Roland, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. 2017 arXiv:1708.07747.
Spratling, Michael W., De Meyer, Kris, Kompass, Raul, Unsupervised learning of overlapping image components using divisive input modulation. Comput. Intell. Neurosci. 2009:381457 (2009), 1–19.
Spratling, Michael W., Fitting predictive coding to the neurophysiological data. Brain Res., 1720, 2019, 146313.
Spratling, Michael W., Image segmentation using a sparse coding model of cortical area V1. IEEE Trans. Image Process. 22:4 (2013), 1631–1643.
Spratling, Michael W., A single functional model of drivers and modulators in cortex. J. Comput. Neurosci. 36:1 (2014), 97–118.
Spratling, Michael W., A neural implementation of the Hough transform and the advantages of explaining away. Image Vis. Comput. 52 (2016), 15–24.
Spratling, Michael W., Explaining away results in accurate and tolerant template matching. Pattern Recognit., 2020, 107337.
Gao, Bo, Spratling, Michael W., Robust template matching via hierarchical convolutional features from a shape biased CNN. Proceedings of the International Conference on Image, Vision and Intelligent Systems, ICIVIS Lecture Notes in Electrical Engineering, Vol. 813, 2021, Springer, Singapore.
Gao, Bo, Spratling, Michael W., Shape-texture debiased training for robust template matching. Sensors, 22(17), 2022, 6658.
Gao, Bo, Spratling, Michael W., Explaining away results in more robust visual tracking. Vis. Comput., 2022.
Ioffe, Sergey, Szegedy, Christian, Batch normalization: Accelerating deep network training by reducing internal covariate shift. International Conference on Machine Learning, 2015, PMLR, 448–456.
PyTorch. 2023 https://github.com/pytorch/vision/tree/main/references/classification. (accessed: 6 January 2023).
Ross Wightman, Hugo Touvron, Herve Jegou, ResNet strikes back: An improved training procedure in timm, in: NeurIPS 2021 Workshop on ImageNet: Past, Present, and Future, 2021.
Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy, Explaining and harnessing adversarial examples, in: Proceedings of the International Conference on Learning Representations, 2015.
Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu, Towards deep learning models resistant to adversarial attacks, in: Proceedings of the International Conference on Learning Representations, 2018.
Xiaofeng Mao, Gege Qi, Yuefeng Chen, Xiaodan Li, Ranjie Duan, Shaokai Ye, Yuan He, Hui Xue, Towards robust vision transformer, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12042–12051.
Pavel Gavrikov, Margret Keuper, CNN Filter DB: An empirical investigation of trained convolutional filters, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 19066–19076.
Pavel Gavrikov, Margret Keuper, Adversarial robustness through the lens of convolutional filters, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 139–147.
Amil Dravid, Yossi Gandelsman, Alexei A Efros, Assaf Shocher, Rosetta neurons: Mining the common units in a model zoo, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 1934–1943.
Kornblith, Simon, Norouzi, Mohammad, Lee, Honglak, Hinton, Geoffrey, Similarity of neural network representations revisited. International Conference on Machine Learning, 2019, PMLR, 3519–3529.
Glorot, Xavier, Bordes, Antoine, Bengio, Yoshua, Deep sparse rectifier neural networks. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, JMLR Workshop and Conference Proceedings, 315–323.
Hendrycks, Dan, Gimpel, Kevin, Gaussian error linear units (GELUs). 2016 arXiv:1606.08415.
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo, Swin transformer: Hierarchical vision transformer using shifted windows, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
d'Ascoli, Stéphane, Touvron, Hugo, Leavitt, Matthew L, Morcos, Ari S, Biroli, Giulio, Sagun, Levent, ConViT: Improving vision transformers with soft convolutional inductive biases. International Conference on Machine Learning, 2021, PMLR, 2286–2296.
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie, A ConvNet for the 2020s, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11976–11986.
Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie, ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
Dan Hendrycks, Thomas Dietterich, Benchmarking Neural Network Robustness to Common Corruptions and Perturbations, in: Proceedings of the International Conference on Learning Representations, 2019.
Singh, Naman D., Croce, Francesco, Hein, Matthias, Revisiting adversarial training for ImageNet: Architectures, training and generalization across threat models. 2023 arXiv:2303.01870.
Shao, Rulin, Shi, Zhouxing, Yi, Jinfeng, Chen, Pin-Yu, Hsieh, Cho-Jui, On the adversarial robustness of vision transformers. Trans. Mach. Learn. Res., 2021.
Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, Dawn Song, Natural adversarial examples, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15262–15271.
Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al., The many faces of robustness: A critical analysis of out-of-distribution generalization, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 8340–8349.
Geirhos, Robert, Narayanappa, Kantharaju, Mitzkus, Benjamin, Thieringer, Tizian, Bethge, Matthias, Wichmann, Felix A, Brendel, Wieland, Partial success in closing the gap between human and machine vision. Advances in Neural Information Processing Systems 34, 2021.
Wang, Haohan, Ge, Songwei, Xing, Eric P., Lipton, Zachary C., Learning robust global representations by penalizing local predictive power. Advances in Neural Information Processing Systems, 2019.
Zhang, Chenshuang, Pan, Fei, Kim, Junmo, Kweon, In So, Mao, Chengzhi, ImageNet-D: Benchmarking neural network robustness on diffusion synthetic object. 2024 arXiv:2403.18775.
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer, High-resolution image synthesis with latent diffusion models, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684–10695.
Zhaowei Cai, Nuno Vasconcelos, Cascade R-CNN: Delving into high quality object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.
Lin, Tsung-Yi, Maire, Michael, Belongie, Serge, Hays, James, Perona, Pietro, Ramanan, Deva, Dollár, Piotr, Zitnick, C Lawrence, Microsoft COCO: Common objects in context. Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, 2014, Springer, 740–755.
Michaelis, Claudio, Mitzkus, Benjamin, Geirhos, Robert, Rusak, Evgenia, Bringmann, Oliver, Ecker, Alexander S., Bethge, Matthias, Brendel, Wieland, Benchmarking robustness in object detection: Autonomous driving when winter is coming. 2019 arXiv:1907.07484.
Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan L. Yuille, Kaiming He, Feature Denoising for Improving Adversarial Robustness, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 501–509.
Tao, Guanhong, Ma, Shiqing, Liu, Yingqi, Zhang, Xiangyu, Attacks meet interpretability: Attribute-steered detection of adversarial samples. Advances in Neural Information Processing Systems, 2018, 7717–7728.
Orhan, A. Emin, Lake, Brenden M., Improving the robustness of ImageNet classifiers using elements of human visual cognition. 2019 arXiv:1906.08416.
Roth, Kevin, Kilcher, Yannic, Hofmann, Thomas, The odds are odd: A statistical test for detecting adversarial examples. International Conference on Machine Learning, 2019, PMLR, 5498–5507.
Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan L. Yuille, Quoc V. Le, Adversarial Examples Improve Image Recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 819–828.
Chattopadhay, Aditya, Sarkar, Anirban, Howlader, Prantik, Balasubramanian, Vineeth N, Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. 2018 IEEE Winter Conference on Applications of Computer Vision, WACV, 2018, IEEE, 839–847.
King's College London, King's Computational Research, Engineering and Technology Environment (CREATE), 2022. https://doi.org/10.18742/rnvf-m076 (Retrieved 2 March 2024).