There is growing interest in using large language models (LLMs) for a wide range of educational applications. Recent studies have focused on using LLMs to generate educational artifacts for programming education, such as programming exercises, model solutions, and multiple-choice questions (MCQs). The ability to assess the quality of such artifacts efficiently and reliably has therefore become of paramount importance. In this paper, we investigate an example use case: assessing the quality of programming MCQs. To that end, we carefully curated a data set of 192 MCQs annotated with quality scores based on a rubric covering crucial aspects such as clarity, the presence of a single correct answer, the quality of distractors, and alignment with learning objectives (LOs). Our results show that the task presents a considerable challenge even to state-of-the-art LLMs. To support further research in this important area, we release both the data set and the evaluation pipeline to the public.