DOI
10.34229/KCA2522-9664.26.2.15
UDC 004.932:004.85
S.V. Zabolotnii
Cherkasy State Business College, Cherkasy, Ukraine,
zabolotniua@gmail.com
A.V. Chepynoha
Cherkasy State Technological University, Cherkasy, Ukraine,
a.chepynoha@chdtu.edu.ua
V.I. Hotunov
Cherkasy State Business College, Cherkasy, Ukraine,
vkhotunov@gmail.com
FROM STATISTICAL PATTERN RECOGNITION TO EMOTION ANALYSIS:
APPLICATION OF THE APPARATUS OF DECOMPOSITION IN SPACE
WITH A GENERATING ELEMENT FOR NLP MODELS
Abstract. Emotion recognition in text is an important task in modern natural language processing, a field currently dominated by transformer architectures. However, the internal mechanisms of these models remain a ‘black box’, and classification quality, especially in complex cases, leaves room for improvement. This paper proposes a new hybrid approach that combines modern language models with a deeper analysis of their vector representations by adapting a classical method of statistical pattern recognition based on decomposition in a space with a generating element (Kunchenko space). The method generates a new set of ‘statistical-geometric’ features from the error with which the vector representation of a text message is reconstructed for each class. Experiments on Ukrainian (EmoBench-UA) and English (EmoEvent) datasets showed that the proposed hybrid approach yields a statistically significant improvement in classification quality. The research also identified key conditions for the method’s effectiveness: it is a powerful ‘refiner’ for models pre-trained on the target task, but it is ineffective on ‘raw’, non-specialised vector representations. The choice of basis functions for reconstruction proved to be an important hyperparameter that allows the method to be adapted to the specific geometry of the data space.
Keywords: emotion recognition, natural language processing, vector representation, Kunchenko space, feature generation, hybrid model.
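The idea of per-class reconstruction-error features can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the components of an embedding are treated as a sample of values, that the generating element is the identity function, and that the basis functions are the powers x^2 and x^3. All function names here are hypothetical, and the basis is exactly the kind of hyperparameter the abstract says should be tuned.

```python
import numpy as np

def generating(x):
    # Assumed generating element: the identity function.
    return x

def basis(x):
    # Assumed basis functions (a hyperparameter): x^2 and x^3.
    return np.stack([x**2, x**3], axis=-1)

def fit_class_model(embeddings):
    """Fit coefficients that approximate generating(x) by the basis
    functions, using all components of one class's embeddings as a sample."""
    x = np.asarray(embeddings, dtype=float).ravel()
    g, B = generating(x), basis(x)
    g_mean, B_mean = g.mean(), B.mean(axis=0)
    Bc = B - B_mean
    F = Bc.T @ Bc / x.size              # centred second moments of the basis
    F0 = Bc.T @ (g - g_mean) / x.size   # centred moments with the generating element
    # lstsq returns the minimum-norm solution even when F is singular.
    coef = np.linalg.lstsq(F, F0, rcond=None)[0]
    return g_mean, B_mean, coef

def reconstruction_error(z, model):
    """Mean squared error of reconstructing generating(z) with a class model."""
    g_mean, B_mean, coef = model
    z = np.asarray(z, dtype=float)
    g_hat = g_mean + (basis(z) - B_mean) @ coef
    return float(np.mean((generating(z) - g_hat) ** 2))

def error_features(z, models):
    """One reconstruction error per class: the new feature vector."""
    return np.array([reconstruction_error(z, m) for m in models])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for class-wise embeddings (not real model outputs).
    class_a = rng.normal(0.0, 1.0, size=(50, 16))
    class_b = rng.exponential(1.0, size=(50, 16))
    models = [fit_class_model(class_a), fit_class_model(class_b)]
    print(error_features(class_a[0], models))
```

In a hybrid setup, a feature vector of this kind would be concatenated with the original embedding, or with the base model's outputs, before the final classifier. How the features are actually combined is not specified in the abstract.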