DOI: 10.34229/KCA2522-9664.25.3.4
UDC 004.021+004.89
1 Educational and Research Institute for Applied Systems Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute," Kyiv, Ukraine
zgurovsm@hotmail.com
2 Educational and Research Center "World Data Center for Geoinformatics and Sustainable Development" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute," Kyiv, Ukraine
boldak@wdc.org.ua
3 Educational and Research Institute for Applied Systems Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute," Kyiv, Ukraine
k.yefremov@wdc.org.ua
4 Educational and Research Institute for Applied Systems Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute," Kyiv, Ukraine
o.stus@kpi.ua
5 Educational and Research Center "World Data Center for Geoinformatics and Sustainable Development" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"; Institute for Information Recording of the NAS of Ukraine, Kyiv, Ukraine
dmytrenko@wdc.org.ua
NLP-BASED VERIFICATION OF MESSAGE RELIABILITY
USING SEMANTIC NETWORK ANALYSIS
Abstract. This article addresses methods for constructing semantic networks of textual (news) messages in media streams in order to identify potential sources of disinformation. The central idea is a comprehensive methodology for building such networks, with key terms serving as the foundation for semantic modeling. The authors analyze text processing techniques, including text preprocessing, key term extraction, and identification of semantic relationships between terms. Particular attention is given to a metric for measuring the semantic proximity of information messages represented as semantic networks. The proposed metric, based on the Frobenius norm, makes it possible to evaluate the similarity and interconnection of texts. This improves the accuracy of semantic content analysis, helps uncover hidden semantic relationships, and facilitates the structuring of information. Using the Frobenius-based metric, the article proposes an approach for distinguishing reliable from unreliable information sources, which enables further validation of the facts presented in news messages. The approach is intended to make information analysis more efficient, to identify trends, and to support forecasting of how events develop in the news space. Most importantly, it allows information influences to be detected, which contributes both to information security and to national resilience against external threats.
Keywords: semantic network, Frobenius measure, text analysis, Horizontal Visibility Graph algorithm, Directed Weighted Network of Terms, verification of message reliability.
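As an illustration of the kind of metric the abstract describes, the following is a minimal sketch of a Frobenius-norm distance between two messages represented as directed weighted networks of terms. It is not the authors' exact formula: the edge-dictionary representation, the alignment of both networks on the union of their terms, the optional unit-norm scaling, and the names `adjacency` and `frobenius_distance` are all assumptions made for this example.

```python
import numpy as np


def adjacency(edges, terms):
    """Build a dense weighted adjacency matrix for directed edges.

    `edges` maps (source_term, target_term) -> weight (hypothetical format).
    `terms` fixes the row/column order so two networks can be compared.
    """
    index = {term: i for i, term in enumerate(terms)}
    matrix = np.zeros((len(terms), len(terms)))
    for (src, dst), weight in edges.items():
        matrix[index[src], index[dst]] = weight
    return matrix


def frobenius_distance(edges_a, edges_b, normalize=True):
    """Frobenius norm of the difference of two term-network matrices.

    Assumption: both networks are aligned on the union of their terms, and
    (optionally) each matrix is scaled to unit Frobenius norm so that long
    messages do not dominate short ones.
    """
    terms = sorted({t for edges in (edges_a, edges_b) for pair in edges for t in pair})
    a = adjacency(edges_a, terms)
    b = adjacency(edges_b, terms)
    if normalize:
        norm_a = np.linalg.norm(a, ord="fro")
        norm_b = np.linalg.norm(b, ord="fro")
        a = a / norm_a if norm_a else a
        b = b / norm_b if norm_b else b
    return float(np.linalg.norm(a - b, ord="fro"))


# Toy networks (invented for illustration only).
msg_a = {("government", "report"): 2.0, ("report", "attack"): 1.0}
msg_b = {("government", "report"): 1.0, ("report", "attack"): 1.0}
msg_c = {("celebrity", "diet"): 1.0}

d_ab = frobenius_distance(msg_a, msg_b)  # shared structure: small distance
d_ac = frobenius_distance(msg_a, msg_c)  # no shared edges: larger distance
```

With unit-norm scaling, two networks that share no edges are always at distance sqrt(2), which gives the distance a fixed upper reference point for non-negative weights; whether such scaling matches the published metric is not stated in the abstract.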