DOI: 10.34229/KCA2522-9664.25.6.18
UDC 004.89:004.94
A.E. ALIYEVA
Institute of Control Systems of the Ministry of Science and Education of the Republic of Azerbaijan, Baku, Azerbaijan,
aynur.aliyeva8020@gmail.com
A RAG-ASAG HYBRID MODEL FOR FORMATIVE ASSESSMENT IN INFORMATICS EDUCATION AT UNIVERSITY
Abstract. The article examines the integration of retrieval-augmented generation (RAG) technology into formative assessment in computer science teaching. Traditional automatic short answer grading (ASAG) methods based solely on text similarity cannot deeply analyze student responses in terms of content and context. To address this problem, a hybrid model combining RAG and ASAG technologies is proposed. Using the proposed assessment criteria, the model evaluates student responses not only linguistically but also against computer science knowledge, teaching materials, and expert answer examples. This approach allows for a more objective and transparent assessment of students' knowledge application, critical thinking, and creativity. The research results show that the RAG-based hybrid model provides personalized feedback, facilitates monitoring of the learning process at both the individual and group levels, and develops students' self-directed learning skills.
Keywords: formative assessment, automatic short answer grading (ASAG), RAG technology, computer science teaching, coding skills.
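To make the pipeline concrete, the sketch below illustrates one plausible reading of the RAG-ASAG combination described in the abstract: a retrieval step over teaching materials and expert examples, a conventional similarity-based ASAG score against a reference answer, and a weighted fusion of the two signals. The sentence-transformers encoder, the MiniLM model name, the toy corpus, the rubric stub, and the weights are all illustrative assumptions, not the article's implementation; in the proposed model the rubric judgment would be produced by an LLM grounded in the retrieved context.

```python
# Illustrative RAG-ASAG hybrid grader: an assumed minimal sketch, not the
# article's implementation. Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

# Toy knowledge base standing in for teaching materials and expert examples.
corpus = [
    "A stack is a LIFO structure; push and pop act on the top element.",
    "A queue is a FIFO structure; elements are enqueued at the rear.",
    "Expert example: a stack removes the most recently added element first.",
]
corpus_emb = encoder.encode(corpus, convert_to_tensor=True)


def retrieve(question: str, top_k: int = 2) -> list[str]:
    """RAG step: fetch the course passages most relevant to the question."""
    q_emb = encoder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, corpus_emb, top_k=top_k)[0]
    return [corpus[h["corpus_id"]] for h in hits]


def asag_similarity(student: str, reference: str) -> float:
    """Classic ASAG signal: semantic similarity to a reference answer."""
    a, b = encoder.encode([student, reference], convert_to_tensor=True)
    return float(util.cos_sim(a, b))


def rubric_score(student: str, context: list[str]) -> float:
    """Stub for the RAG-grounded rubric judgment (content correctness,
    knowledge application, critical thinking). A real system would prompt
    an LLM with the retrieved context and the rubric; here it is
    approximated by similarity between the answer and the context."""
    s_emb, c_emb = encoder.encode(
        [student, " ".join(context)], convert_to_tensor=True
    )
    return float(util.cos_sim(s_emb, c_emb))


def hybrid_grade(question: str, student: str, reference: str,
                 w_sim: float = 0.4, w_rag: float = 0.6) -> float:
    """Fuse both signals; the weights are illustrative assumptions."""
    context = retrieve(question)
    return (w_sim * asag_similarity(student, reference)
            + w_rag * rubric_score(student, context))


print(hybrid_grade(
    question="Explain how a stack differs from a queue.",
    student="A stack pops the newest item first, a queue the oldest.",
    reference="Stacks are LIFO; queues are FIFO.",
))
```

Replacing the rubric stub with an actual LLM call over the retrieved passages, and the toy corpus with indexed lecture notes and expert answers, would yield the personalized, context-grounded feedback behavior the abstract describes.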