Cybernetics And Systems Analysis
International Theoretical Science Journal

UDC 004.932.4
B.P. Rusyn1, O.A. Lutsyk2, R.Y. Kosarevych3


1 Karpenko Physico-Mechanical Institute of the NAS of Ukraine, Lviv, Ukraine; b.rusyn.prof@gmail.com

2 Karpenko Physico-Mechanical Institute of the NAS of Ukraine, Lviv, Ukraine; olutsyk@yahoo.com

3 Karpenko Physico-Mechanical Institute of the NAS of Ukraine, Lviv, Ukraine; kosar2311@gmail.com

EVALUATING THE INFORMATIVITY OF TRAINING SAMPLE
FOR CLASSIFICATION OF IMAGES BY DEEP LEARNING METHODS

Abstract. A new approach to evaluating the informativeness of a training sample for recognizing images obtained by remote sensing is proposed. It is shown that the informativeness of the training sample can be represented by a set of characteristics, each of which describes certain properties of the data. A relationship is established between these characteristics of the training sample and the accuracy of a classifier trained on it. The proposed approach is applied to various test training samples, and the results of their evaluation are presented. Evaluating a training set with the proposed approach is shown to be much faster than training a neural network, which makes the approach suitable for preliminary assessment of a training sample in image recognition problems solved by deep learning methods.

Keywords: deep learning, feature selection, training sample, convolutional network.
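The abstract does not reproduce the paper's formulas, but the kind of sample-level characteristics it refers to can be illustrated with a minimal sketch. The two measures below (normalized label-distribution entropy as a class-balance indicator, and a one-dimensional Fisher ratio as a class-separability indicator) are common illustrative choices, not necessarily the characteristics used by the authors:

```python
import math
from collections import Counter

def class_balance_entropy(labels):
    """Normalized Shannon entropy of the label distribution:
    1.0 means perfectly balanced classes, values near 0 mean
    strong class imbalance."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    if k < 2:
        return 0.0
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(k)

def fisher_separability(features, labels):
    """One-dimensional Fisher ratio: between-class variance of the
    class means divided by the mean within-class variance.
    Larger values suggest an easier classification problem."""
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(y, []).append(x)
    overall = sum(features) / len(features)
    between = sum(
        len(g) * (sum(g) / len(g) - overall) ** 2 for g in groups.values()
    ) / len(features)
    within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups.values()
    ) / len(features)
    return between / within if within > 0 else float("inf")

# A balanced, well-separated toy sample scores high on both measures.
labels = [0] * 50 + [1] * 50
feats = [0.1 * i for i in range(50)] + [10 + 0.1 * i for i in range(50)]
balance = class_balance_entropy(labels)
separability = fisher_separability(feats, labels)
```

Both measures run in a single pass over the sample, which reflects the abstract's point: such characteristics can be computed far faster than training a network, so a low-quality sample can be rejected before any training starts.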



REFERENCES

  1. Khan A., Sohail A., Zahoora U., Qureshi A. A survey of the recent architectures of deep convolutional neural networks. Artificial Intelligence Review. 2020. Vol. 53, Iss. 8. P. 5455–5516. https://doi.org/10.1007/s10462-020-09825-6.

  2. Rusyn B.P., Lutsyk O.A., Tayanov V.A. Upper-bound estimates for classifiers based on a dissimilarity function. Cybernetics and Systems Analysis. 2012. Vol. 48, N 4. P. 592–600. https://doi.org/10.1007/s10559-012-9439-2.

  3. Boyun V.P. The principles of organizing the search for an object in an image, tracking an object and the selection of informative features based on the visual perception of a person. Proc. International Conference on Data Stream Mining and Processing (DSMP 2020) (21–25 April 2020, Lviv, Ukraine). Lviv, 2020. Communications in Computer and Information Science. 2020. Vol. 1158. P. 22–44. https://doi.org/10.1007/978-3-030-61656-4_2.

  4. Vapnik V. The nature of statistical learning theory. New York: Springer-Verlag, 2000. 314 p.

  5. Bishop C.M. Pattern recognition and machine learning (Information science and statistics). London: Springer, 2006. 738 p.

  6. Chen Y., Lin Z., Zhao X., Wang G., Gu Y. Deep learning-based classification of hyperspectral data. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2014. Vol. 7, Iss. 6. P. 2094–2107. https://doi.org/10.1109/JSTARS.2014.2329330.

  7. Ma L., Liu Y., Zhang X., Ye Y., Yin G., Johnsonf B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS Journal of Photogrammetry and Remote Sensing. 2019. Vol. 152. P. 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015.

  8. Li Y., Zhang H., Xue X., Jiang Y., Shen Q. Deep learning for remote sensing image classification: A survey. WIREs Data Mining and Knowledge Discovery. 2018. 17 p. https://doi.org/10.1002/widm.1264.

  9. Cheng G., Xie X., Han J., Guo L., Xia G. Remote sensing image scene classification meets deep learning: challenges, methods, benchmarks, and opportunities. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2020. Vol. 13. P. 3735–3756. https://doi.org/10.1109/JSTARS.2020.3005403.

  10. Hoque M., Burks R., Kwan C., Li J. Deep learning for remote sensing image super-resolution. Proc. IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) (10–12 Oct. 2019, New York, NY, USA). New York, 2019. P. 286–292. https://doi.org/10.1109/UEMCON47517.2019.8993047.

  11. Van Niel T.G., McVicar T.R., Datt B. On the relationship between training sample size and data dimensionality: Monte Carlo analysis of broadband multi-temporal classification. Remote Sensing of Environment. 2005. Vol. 98, Iss. 4. P. 468–480. https://doi.org/10.1016/j.rse.2005.08.011.

  12. Zou Q., Ni L., Zhang T., Wang Q. Deep learning based feature selection for remote sensing scene classification. IEEE Geoscience and Remote Sensing Letters. 2015. Vol. 12, Iss. 11. P. 2321–2325. https://doi.org/10.1109/LGRS.2015.2475299.

  13. Hinterstoisser S., Lepetit V., Wohlhart P., Konolige K. On pre-trained image features and synthetic images for deep learning. In: Computer Vision — ECCV 2018 Workshops. Proc. 15th European Conference on Computer Vision (ECCV2018) Workshops (8–14 September 2018, Munich, Germany). Munich, 2018. P. 178–186. https://doi.org/10.1007/978-3-030-11009-3_42.

  14. Genc B., Tunc H. Optimal training and test sets design for machine learning. Turkish Journal of Electrical Engineering and Computer Sciences. 2019. Vol. 27(2). P. 1534–1545. https://doi.org/10.3906/elk-1807-212.

  15. Dodge S., Karam L. Understanding how image quality affects deep neural networks. Proc. 2016 Eighth International Conference on Quality of Multimedia Experience (6–8 June 2016, Lisbon, Portugal). Lisbon, 2016. https://doi.org/10.1109/QoMEX.2016.7498955.

  16. Cheng G., Yang C., Yao X., Guo L., Han J. When deep learning meets metric learning: remote sensing image scene classification via learning discriminative CNNs. IEEE Transactions on Geoscience and Remote Sensing. 2018. Vol. 56, Iss. 5. P. 2811–2821. https://doi.org/10.1109/TGRS.2017.2783902.

  17. Ma X., Geng J., Wang H. Hyperspectral image classification via contextual deep learning. Journal on Image and Video Processing. 2015. Article number: 20 (2015). https://doi.org/10.1186/s13640-015-0071-8.

  18. Subbotin S.A. The training set quality measures for neural network learning. Optical Memory and Neural Networks. 2010. Vol. 19, Iss. 2. P. 126–139. https://doi.org/10.3103/S1060992X10020037.

  19. Forsati R., Moayedikia A., Safarkhani B. Heuristic approach to solve feature selection problem. Proc. International Conference on Digital Information and Communication Technology and Its Applications (DICTAP 2011) (21–23 June 2011, Dijon, France). Dijon, 2011. Communications in Computer and Information Science. Vol. 167. P. 707–717. https://doi.org/10.1007/978-3-642-22027-2_59.

  20. Huang K., Aviyente S. Wavelet feature selection for image classification. IEEE Transactions on Image Processing. 2008. Vol. 17, Iss. 9. P. 1709–1720. https://doi.org/10.1109/TIP.2008.2001050.

  21. Muschelli J. ROC and AUC with a binary predictor: a potentially misleading metric. Journal of Classification. 2020. Vol. 37, Iss. 3. P. 696–708. https://doi.org/10.1007/s00357-019-09345-1.

  22. Belov D., Armstrong R. Distributions of the Kullback–Leibler divergence with applications. British Journal of Mathematical and Statistical Psychology. 2011. Vol. 64, Iss. 2. P. 291–309. https://doi.org/10.1348/000711010X522227.

  23. Prati R.C., Batista G.E., Monard M.C. Class imbalances versus class overlapping: an analysis of a learning system behavior. Proc. Third Mexican International Conference on Artificial Intelligence (MICAI 2004) (26–30 April 2004, Mexico City, Mexico). Mexico City, 2004. Lecture Notes in Computer Science. Vol. 2972. P. 312–321. https://doi.org/10.1007/978-3-540-24694-7_32.

  24. Shepperd M., Cartwright M. Predicting with sparse data. Proc. 7th IEEE International Software Metrics Symposium (4–6 April 2001, London, UK). London, 2001. P. 28–39. https://doi.org/10.1109/METRIC.2001.915513.




© 2021 Kibernetika.org. All rights reserved.