UDC 004.822
REPRESENTATION, ANALYSIS AND EXTRACTION OF KNOWLEDGE
FROM UNSTRUCTURED NATURAL LANGUAGE TEXTS
Abstract. The article surveys the means that description logics offer for representing knowledge contained in natural-language texts.
Description logics are classified by their concept and role constructors, and the basic notions of temporal description logics are considered.
An approach to building natural-language text analysis systems is considered, based on the tasks of part-of-speech tagging,
dependency parsing, and coreference resolution. Examples are given of using natural-language knowledge bases to solve applied problems,
in particular checking the integrity of a text and detecting contradictions.
Keywords: description logics, knowledge bases, tableau algorithm, knowledge extraction, natural language processing, semantic analysis.