Abstract

In a context of massive use of ChatGPT, this article explores the assessment of academic writing in higher education through two research questions: how should writing be assessed at university in the age of artificial intelligence (AI), and how reliable is AI for assessing academic writing? The study applies a two-phase exploratory sequential design. The first phase consists of an interview with ChatGPT about its writing capabilities. The second phase analyzes the reliability of ChatGPT in scoring a sample of 40 constructed responses, rated by 5 evaluators, from an academic Spanish test backed by a validity argument. The results point to two provisional conclusions: academic writing assessment needs to step up specific practices that minimize the risks of AI use; and assessments produced by AI need constant verification of their reliability because of central tendency bias.

Authors

Sergio Álvarez Uribe, Universidad del Norte, Barranquilla, Colombia

Assistant professor in the Department of Spanish of the Instituto de Idiomas at Universidad del Norte. He holds a PhD in Applied Linguistics for Language Teaching from Universidad Antonio de Nebrija (awarded with the grade sobresaliente cum laude). He teaches in the core-curriculum area of communicative effectiveness and directs the writing center Eficacia Comunicativa - ECO. His research focuses on the assessment of academic writing.

Álvarez Uribe, S. (2025). La evaluación de la escritura en español en la educación superior: una conversación con ChatGPT. Lenguaje, 53(1S), e20414433. https://doi.org/10.25100/lenguaje.v53i1S.14433
