DOI
10.34229/KCA2522-9664.26.3.5
UDC 004.93
P. Koval
Lviv Polytechnic National University, Lviv, Ukraine,
pavlo.koval.mknus.2024@lpnu.ua
I. Prots’ko
Lviv Polytechnic National University, Lviv, Ukraine,
ihor.o.protsko@lpnu.ua
RESEARCH ON THE APPLICATION OF BASIC RECONSTRUCTIVE
AND CONDITIONAL GAN NEURAL NETWORK MODELS
FOR IMPLEMENTING LWIR-RGB IMAGE CONVERSIONS
Abstract. The paper studies and compares the conversion of long-wave infrared images into a visible representation (LWIR-RGB) using a baseline reconstructive U-Net model and the conditional GAN model pix2pix with a local discriminator. The software platform is implemented as a reproducible image transformation pipeline built on two neural network architectures, U-Net and pix2pix, trained in a supervised image-to-image setting. Results are presented of an experimental study in which both models were trained on paired data from the KAIST dataset. The experiments evaluate and compare combined loss functions with weighting coefficients for the structural similarity index and the mean absolute difference between the predicted and reference images. Composite loss functions are developed that combine an understanding of the global context of the image with the preservation of local features. Training curves for the U-Net and pix2pix models are presented, giving a qualitative picture of the training dynamics and convergence. According to the selected metrics, a well-trained U-Net provides excellent structural reconstruction. The adversarial component of the pix2pix model, which uses the PatchGAN local discriminator, produces realistic images at different levels of detail, improving perceptual realism without losing the structural quality of the transformed image. Further directions are identified for improving the reliability of converting single-channel thermal image frames into visible three-channel RGB representations based on the studied models.
Keywords: image-to-image transformation, thermal imaging, deep learning, image tensor.
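As an illustration of the combined loss mentioned in the abstract (a weighted sum of a structural similarity term and the mean absolute difference between the predicted and reference images), a minimal sketch is given below. It is not the authors' implementation: the function names, the equal weights `w_ssim` and `w_l1`, and the use of a single global SSIM window over flattened pixel values in [0, 1] are assumptions made for brevity. A practical pipeline would more likely compute a sliding-window SSIM over image tensors.

```python
from statistics import fmean


def mae(pred, ref):
    """Mean absolute difference between two equally sized pixel sequences."""
    return fmean(abs(p - r) for p, r in zip(pred, ref))


def global_ssim(pred, ref, data_range=1.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_r = fmean(pred), fmean(ref)
    var_p = fmean((p - mu_p) * (p - mu_p) for p in pred)
    var_r = fmean((r - mu_r) * (r - mu_r) for r in ref)
    cov = fmean((p - mu_p) * (r - mu_r) for p, r in zip(pred, ref))
    numerator = (2 * mu_p * mu_r + c1) * (2 * cov + c2)
    denominator = (mu_p * mu_p + mu_r * mu_r + c1) * (var_p + var_r + c2)
    return numerator / denominator


def combined_loss(pred, ref, w_ssim=0.5, w_l1=0.5):
    """Weighted sum of (1 - SSIM) and MAE; the weights are hypothetical."""
    return w_ssim * (1.0 - global_ssim(pred, ref)) + w_l1 * mae(pred, ref)


if __name__ == "__main__":
    reference = [0.0, 0.25, 0.5, 0.75, 1.0]
    prediction = [0.1, 0.2, 0.5, 0.7, 0.9]
    print(combined_loss(prediction, reference))
```

The loss is zero when the prediction matches the reference and grows as either the structure or the per-pixel intensities diverge; changing the two weights shifts the balance between structural fidelity and pixel-wise accuracy.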