DOI: 10.34229/KCA2522-9664.26.3.1
UDC 621.396
V. Mishchuk
National Aerospace University "Kharkiv Aviation Institute", Kharkiv, Ukraine,
v.v.mishchuk@csn.khai.edu
H. Fesenko
National Aerospace University "Kharkiv Aviation Institute", Kharkiv, Ukraine,
h.fesenko@csn.khai.edu
V. Kharchenko
National Aerospace University "Kharkiv Aviation Institute", Kharkiv, Ukraine,
v.kharchenko@csn.khai.edu
S. Yakovlev
V.N. Karazin Kharkiv National University, Kharkiv, Ukraine,
s.yakovlev@karazin.ua
DIFFUSION MODELS FOR CREATING SYNTHETIC DATASETS
IN INTELLIGENT SYSTEMS OF EXPLOSIVE ORDNANCE DETECTION
Abstract. The study establishes that diffusion models show significant potential in image generation tasks, yet in their baseline form they often produce images whose key object characteristics do not match those of the target objects. Low-Rank Adaptation (LoRA) is examined as a means of improving the alignment of generated content with domain-specific requirements, using datasets of explosive ordnance (EO) images as an example. Experimental results on hyperparameter tuning are presented, demonstrating the effectiveness of specific learning rate ranges and the impact of the adaptation rank and the scaling parameter (alpha) on adaptation stability. Several limitations are also identified: the dependence of generation quality on dataset composition, the difficulty of controlling visual artifacts, and the insufficiency of formal evaluation metrics. Directions for further research are suggested, including automating the selection and assessment of generated outputs and evaluating the practical effectiveness of synthetic datasets for training EO detection models.
Keywords: diffusion models, low-rank adaptation, synthetic datasets, explosive ordnance, image generation.
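For orientation, the role of the adaptation rank and of the scaling parameter commonly called alpha can be illustrated with a minimal sketch of the standard LoRA weight update, W' = W + (alpha / r) · B · A. This is not the authors' training code: the function names, the tiny matrix sizes, and the initialisation scale of 0.01 are illustrative assumptions, and plain Python lists stand in for the tensors a real training framework would use.

```python
import random


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def init_lora(d_out, d_in, rank, seed=0):
    """Usual LoRA initialisation: A is small random noise, B is all zeros.

    The 0.01 standard deviation is an assumed value for illustration.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return b, a


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A for a frozen weight matrix W.

    B has shape (d_out, r) and A has shape (r, d_in); r is the rank.
    """
    rank = len(a)
    scale = alpha / rank  # larger alpha or smaller rank -> stronger update
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


if __name__ == "__main__":
    w = [[1.0, 2.0], [3.0, 4.0]]
    b, a = init_lora(d_out=2, d_in=2, rank=1)
    # With B initialised to zero the adapted weight equals the base weight.
    print(lora_effective_weight(w, b, a, alpha=8))
```

Because the update is multiplied by alpha / r, changing the rank without adjusting alpha also changes the effective step size of the adaptation, which is one reason the two hyperparameters are usually tuned together.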
REFERENCES
- 1. Mishchuk V., Fesenko H., Kharchenko V. Deep learning models for detection of explosive ordnance using autonomous robotic systems: trade-off between accuracy and real-time processing speed. Radioelectronic and Computer Systems. 2024. N 4. P. 99–111. https://doi.org/10.32620/reks.2024.4.09.
- 2. Hu E.J., Shen Y., Wallis P., et al. LoRA: low-rank adaptation of large language models. arXiv:2106.09685v2 [cs.CL] 16 Oct 2021. https://doi.org/10.48550/arXiv.2106.09685.
- 3. Podell D., English Z., Lacey K., et al. SDXL: improving latent diffusion models for high-resolution image synthesis. arXiv:2307.01952v1 [cs.CV] 4 Jul 2023. https://doi.org/10.48550/arXiv.2307.01952.
- 4. Goodfellow I., Pouget-Abadie J., Mirza M., et al. Generative adversarial networks. Communications of the ACM. 2020. Vol. 63, N 11. P. 139–144. https://doi.org/10.1145/3422622.
- 5. Ho J., Jain A., Abbeel P. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems. 2020. Vol. 33. P. 6840–6851. URL: https://proceedings.neurips.cc/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf.
- 6. Rombach R., Blattmann A., Lorenz D., Esser P., Ommer B. High-resolution image synthesis with latent diffusion models. Proc. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (18–24 June 2022, New Orleans, LA, USA). New Orleans, 2022. P. 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042.
- 7. Dhariwal P., Nichol A. Diffusion models beat GANs on image synthesis. arXiv:2105.05233v4 [cs.LG] 1 Jun 2021. https://doi.org/10.48550/arXiv.2105.05233.
- 8. Zhou M., Bai Y., Yang Q., Zhao T. StyleInject: parameter efficient tuning of text-to-image diffusion models. ACM Transactions on Multimedia Computing, Communications and Applications. 2025. Vol. 21, Iss. 5. Article number 152. https://doi.org/10.1145/3730403.
- 9. Yang M., Chen J., Tao J., et al. Low-rank adaptation for foundation models: a comprehensive review. arXiv:2501.00365v2 [cs.LG] 3 Nov 2025. https://doi.org/10.48550/arXiv.2501.00365.
- 10. Lu Y., Chen L., Zhang Y., et al. Machine learning for synthetic data generation: a review. arXiv:2302.04062v10 [cs.LG] 4 Apr 2025. https://doi.org/10.48550/arXiv.2302.04062.
- 11. Mumuni A., Mumuni F., Gerrar N.K. A survey of synthetic data augmentation methods in machine vision. Machine Intelligence Research. 2024. Vol. 21, Iss. 5. P. 831–869. https://doi.org/10.1007/s11633-022-1411-7.
- 12. Heusel M., Ramsauer H., Unterthiner T., Nessler B., Hochreiter S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017. P. 6629–6640. URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/8a1d694707eb0fefe5871369074926d-Paper.pdf.
- 13. Salimans T., Goodfellow I., Zaremba W., Cheung V., Radford A., Chen X. Improved techniques for training GANs. Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016. P. 2234–2242. URL: https://papers.nips.cc/paper_files/paper/2016/file/8a3363abe792db2d8761d6403605aeb7-Paper.pdf.
- 14. Otani M., Togashi R., Sawai Y., et al. Toward verifiable and reproducible human evaluation for text-to-image generation. Proc. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (17–24 June 2023, Vancouver, BC, Canada). Vancouver, 2023. P. 14277–14286. https://doi.org/10.1109/CVPR52729.2023.01372.
- 15. Hartwig S., Engel D., Sick L., et al. A survey on quality metrics for text-to-image generation. IEEE Transactions on Visualization and Computer Graphics. 2025. Vol. 31, Iss. 10. P. 9464–9483. https://doi.org/10.1109/TVCG.2025.3585077.
- 16. LoRA training parameters. URL: https://github.com/bmaltais/kohya_ss/wiki/LoRA-training-parameters.
- 17. A comprehensive guide to training a Stable Diffusion XL LoRA: optimal settings & dataset building. URL: https://medium.com/@guillaume.bieler/a-comprehensive-guide-to-training-a-stable-diffusion-xl-lora-optimal-settings-dataset-building-844113a6d5b3.
- 18. THE OTHER LoRA TRAINING RENTRY. URL: https://rentry.org/59xed3.
- 19. Scaling Parameter Alpha. URL: https://apxml.com/courses/lora-peft-efficient-llm-training/chapter-2-lora-in-depth/lora-scaling-alpha.
- 20. Batifol S., Blattmann A., Boesel F., et al. FLUX.1 Kontext: flow matching for in-context image generation and editing in latent space. arXiv:2506.15742v2 [cs.GR] 24 Jun 2025. https://doi.org/10.48550/arXiv.2506.15742.
- 21. Fedorenko G., Fesenko H., Kharchenko V., Kliushnikov I., Tolkunov I. Robotic-biological systems for detection and identification of explosive ordnance: concept, general structure, and models. Radioelectronic and Computer Systems. 2023. N 2. P. 143–159. https://doi.org/10.32620/reks.2023.2.12.