kCV-B: Bootstrap with Cross-Validation for Deep Learning Model Development, Assessment and Selection

NURUNNABI, Abdul Awal Md; TEFERLE, Felix Norman; Laefer, Debra; Remondino, Fabio; Karas, Ismail Rakip; Li, Jonatha

doi:10.5194/isprs-archives-XLVIII-4-W3-2022-111-2022

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

Peer reviewed

Permalink
https://hdl.handle.net/10993/52920

DOI
10.5194/isprs-archives-XLVIII-4-W3-2022-111-2022

Files

Full Text

Bootstrap and CV for Deep Learning_SCA'22_CR.pdf

Publisher postprint (2.13 MB)

Details

Keywords :

Classification; Cross-Validation; Neural Network; PointNet; Semantic Segmentation; Supervised Machine Learning

Abstract :

[en] This study investigates why two popular data-splitting techniques, the train/test split and k-fold cross-validation, which are used to create training and validation data sets, fail to achieve sufficient generality for supervised deep learning (DL) methods. This failure is mainly caused by their limited ability to create new data. The bootstrap, by contrast, is a computer-based statistical resampling method that has been used efficiently to estimate the distribution of a sample estimator and to assess a model without knowledge of the population. This paper couples cross-validation with the bootstrap to combine their respective advantages as data generation strategies and to achieve better generalization of a DL model. The paper contributes by: (i) developing an algorithm for better selection of training and validation data sets, (ii) exploring the potential of the bootstrap for drawing statistical inference on the necessary performance metrics (e.g., mean square error), and (iii) introducing a method that can assess and improve the efficiency of a DL model. The proposed method is applied to semantic segmentation and is demonstrated with a DL-based classification algorithm, PointNet, on aerial laser scanning point cloud data.
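
The general idea of coupling k-fold cross-validation with the bootstrap can be illustrated with a minimal sketch. This is not the authors' kCV-B algorithm, whose details are not given in this record. It only assumes one plausible reading of the abstract: bootstrap resamples are drawn from the training part of each fold, each fit is scored on the held-out fold, and a percentile bootstrap interval is then computed for the mean square error. The function names, the linear model standing in for a DL model, and the synthetic data are all hypothetical.

```python
import numpy as np

def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split into k nearly equal validation folds."""
    return np.array_split(rng.permutation(n), k)

def bootstrap_ci(values, n_boot, rng, alpha=0.05):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    values = np.asarray(values)
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    means = values[idx].mean(axis=1)
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

def kfold_with_bootstrap(x, y, k=5, n_boot=20, seed=0):
    """For each fold, fit on bootstrap resamples of the training part and
    score every fit on the held-out fold. Returns all validation MSEs."""
    rng = np.random.default_rng(seed)
    n = len(x)
    scores = []
    for val_idx in kfold_indices(n, k, rng):
        train_idx = np.setdiff1d(np.arange(n), val_idx)
        for _ in range(n_boot):
            # Sample training indices with replacement (the bootstrap step).
            boot = rng.choice(train_idx, size=len(train_idx), replace=True)
            # A straight-line fit stands in for training a DL model (assumption).
            slope, intercept = np.polyfit(x[boot], y[boot], deg=1)
            pred = slope * x[val_idx] + intercept
            scores.append(np.mean((y[val_idx] - pred) ** 2))
    return np.array(scores)

# Synthetic data, purely for illustration.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=200)

mse = kfold_with_bootstrap(x, y)
low, high = bootstrap_ci(mse, n_boot=1000, rng=rng)
print(f"mean MSE = {mse.mean():.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Scores obtained from the same fold are not independent, so the interval in this sketch should be read as a rough indication of variability rather than a calibrated confidence statement.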

Research center :

ULHPC - University of Luxembourg: High Performance Computing

Disciplines :

Engineering, computing & technology: Multidisciplinary, general & others

Author, co-author :

NURUNNABI, Abdul Awal Md ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Engineering (DoE)

TEFERLE, Felix Norman ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Engineering (DoE)

Laefer, Debra; Center for Urban Science and Progress; Department of Civil and Urban Engineering > New York University, USA

Remondino, Fabio; 3D Optical Metrology (3DOM) unit > Bruno Kessler Foundation (FBK), Trento, Italy

Karas, Ismail Rakip; Department of Computer Engineering > Karabuk University, Karabuk, Turkey

Li, Jonatha; Geography and Environmental Management > University of Waterloo, Canada

External co-authors :

yes

Language :

English

Title :

kCV-B: Bootstrap with Cross-Validation for Deep Learning Model Development, Assessment and Selection

Publication date :

October 2022

Event name :

The 7th International Conference on Smart City Applications

Event place :

Castelo Branco, Portugal

Event date :

from 19-10-2022 to 21-10-2022

Audience :

International

Main work title :

kCV-B: Bootstrap with Cross-Validation for Deep Learning Model Development, Assessment and Selection

Publisher :

ISPRS

Peer reviewed :

Peer reviewed

Focus Area :

Computational Sciences

Name of the research project :

SOLSTICE

Available on ORBilu :

since 30 November 2022

Statistics

Number of views

106 (3 by Unilu)

Number of downloads

45 (2 by Unilu)
