Rethinking Ethical Responsibility and Data Governance in Academic Assessment Using Large Language Models

Ruri Supatmi; Diyah Dwi Agustina; Rangga Mega Putra; Asti Cahyani

doi:10.64268/jtse.v1i2.58

Authors

Ruri Supatmi, Universitas Nahdlatul Ulama Lampung, Indonesia
Diyah Dwi Agustina, Universitas Nahdlatul Ulama Lampung, Indonesia
Rangga Mega Putra, Universitas Nahdlatul Ulama Lampung, Indonesia
Asti Cahyani, Universitas Nahdlatul Ulama Lampung, Indonesia

Keywords:

Academic Grading; Data Governance; Ethical Accountability; Higher Education Assessment; Large Language Models

Abstract

Background: The integration of Large Language Models (LLMs) into academic grading practices has expanded rapidly in higher education, driven by demands for efficiency and consistency.

Aims: This study explores ethical accountability and data governance in the use of LLMs for academic assessment, drawing on the perspectives of lecturers, students, and academic administrators.

Methods: The study adopted a qualitative exploratory approach to capture in-depth insights into current assessment practices involving LLMs. Data were gathered through semi-structured interviews, institutional document analysis, and direct observations across selected higher education institutions. Analysis followed the interactive framework proposed by Miles, Huberman, and Saldaña, involving iterative processes of data reduction, data display, and conclusion verification, with triangulation applied to strengthen trustworthiness.

Results: The findings reveal a set of interrelated challenges. The involvement of LLMs in grading often obscures responsibility for assessment decisions, particularly when transparency is limited. Concerns about fairness and potential bias persist, especially in the evaluation of varied linguistic and contextual expressions. At the same time, data governance mechanisms remain underdeveloped, with unclear procedures for consent, data storage, and regulatory compliance. Together, these issues reflect uneven institutional preparedness and weak ethical oversight.

Conclusion: The study concludes that the use of LLMs in academic grading requires clearly defined ethical accountability and comprehensive data governance frameworks. Continued human oversight, supported by institutional policies and capacity-building initiatives, is essential to safeguard academic integrity and ensure responsible adoption of AI-assisted assessment in higher education.
