A GPT-Based Code Review System With Accurate Feedback for Programming Education

The increasing demand for programming education and growing class sizes require immediate and personalized feedback. However, integrating Large Language Models (LLMs) like ChatGPT in introductory programming courses raises concerns about AI-assisted cheating. In large-scale settings, faulty code sub...

Bibliographic Details
Main Authors:	Dong-Kyu Lee, Inwhee Joe
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Large language models (LLMs); GPT-4o; programming education; learner-friendly code reviews; LangChain
Online Access:	https://ieeexplore.ieee.org/document/11039773/

_version_	1839656182731505664
author	Dong-Kyu Lee; Inwhee Joe
author_facet	Dong-Kyu Lee; Inwhee Joe
author_sort	Dong-Kyu Lee
collection	DOAJ
description	The increasing demand for programming education and growing class sizes require immediate and personalized feedback. However, integrating Large Language Models (LLMs) like ChatGPT in introductory programming courses raises concerns about AI-assisted cheating. In large-scale settings, faulty code submissions may lead LLMs to overanalyze, causing unnecessary token consumption. This paper proposes a GPT-4o-based code review system that provides accurate feedback while reducing token usage and preventing AI-assisted cheating. Unlike general-purpose LLM tools for professionals, the system is pedagogically designed for primary and secondary students by focusing on review necessity and learner-friendly feedback. The system features a Code Review Module (CRM) that reduces token usage via a Review Necessity Chain (RNC), and a Code Correctness Check Module (CCM) that combines test case validation with LLM-based assessment. To prevent AI-assisted cheating, the system provides automated feedback on submitted code without prompting and without revealing correct answers, which are accessed only through the “Ask Code Tutor” button. In a usability test, the system detected up to 42.86% more errors than a conventional online judge. BERTScore analysis showed that over 80% of the system-generated reviews were semantically aligned with human feedback. A performance comparison with state-of-the-art systems demonstrated a blocking success rate of 86%, with a comparable review omission rate. These results indicate that the system provides more accurate feedback than conventional automated code reviews, while achieving token efficiency and supporting self-directed learning through educational feedback. Thus, it can serve as a practical solution for scalable programming education in primary and secondary classes.
format	Article
id	doaj-art-f01e265b15cb42b5a3aae29e96ea4da8
institution	Matheson Library
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-f01e265b15cb42b5a3aae29e96ea4da8 | 2025-06-24T23:00:46Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 105724 | 105737 | 10.1109/ACCESS.2025.3581139 | 11039773 | A GPT-Based Code Review System With Accurate Feedback for Programming Education | Dong-Kyu Lee (0), https://orcid.org/0009-0007-3067-7791 | Inwhee Joe (1), https://orcid.org/0000-0002-8435-0395 | Department of Computer Science, Hanyang University, Seoul, Republic of Korea | Department of Computer Science, Hanyang University, Seoul, Republic of Korea | The increasing demand for programming education and growing class sizes require immediate and personalized feedback. However, integrating Large Language Models (LLMs) like ChatGPT in introductory programming courses raises concerns about AI-assisted cheating. In large-scale settings, faulty code submissions may lead LLMs to overanalyze, causing unnecessary token consumption. This paper proposes a GPT-4o-based code review system that provides accurate feedback while reducing token usage and preventing AI-assisted cheating. Unlike general-purpose LLM tools for professionals, the system is pedagogically designed for primary and secondary students by focusing on review necessity and learner-friendly feedback. The system features a Code Review Module (CRM) that reduces token usage via a Review Necessity Chain (RNC), and a Code Correctness Check Module (CCM) that combines test case validation with LLM-based assessment. To prevent AI-assisted cheating, the system provides automated feedback on submitted code without prompting and without revealing correct answers, which are accessed only through the “Ask Code Tutor” button. In a usability test, the system detected up to 42.86% more errors than a conventional online judge. BERTScore analysis showed that over 80% of the system-generated reviews were semantically aligned with human feedback. A performance comparison with state-of-the-art systems demonstrated a blocking success rate of 86%, with a comparable review omission rate. These results indicate that the system provides more accurate feedback than conventional automated code reviews, while achieving token efficiency and supporting self-directed learning through educational feedback. Thus, it can serve as a practical solution for scalable programming education in primary and secondary classes. | https://ieeexplore.ieee.org/document/11039773/ | Large language models (LLMs); GPT-4o; programming education; learner-friendly code reviews; LangChain
spellingShingle	Dong-Kyu Lee; Inwhee Joe | A GPT-Based Code Review System With Accurate Feedback for Programming Education | IEEE Access | Large language models (LLMs); GPT-4o; programming education; learner-friendly code reviews; LangChain
title	A GPT-Based Code Review System With Accurate Feedback for Programming Education
title_full	A GPT-Based Code Review System With Accurate Feedback for Programming Education
title_fullStr	A GPT-Based Code Review System With Accurate Feedback for Programming Education
title_full_unstemmed	A GPT-Based Code Review System With Accurate Feedback for Programming Education
title_short	A GPT-Based Code Review System With Accurate Feedback for Programming Education
title_sort	gpt based code review system with accurate feedback for programming education
topic	Large language models (LLMs); GPT-4o; programming education; learner-friendly code reviews; LangChain
url	https://ieeexplore.ieee.org/document/11039773/
work_keys_str_mv	AT dongkyulee agptbasedcodereviewsystemwithaccuratefeedbackforprogrammingeducation AT inwheejoe agptbasedcodereviewsystemwithaccuratefeedbackforprogrammingeducation AT dongkyulee gptbasedcodereviewsystemwithaccuratefeedbackforprogrammingeducation AT inwheejoe gptbasedcodereviewsystemwithaccuratefeedbackforprogrammingeducation
