CFlow: Supporting Semantic Flow Analysis of Students' Code in Programming Problems at Scale

Date

2024-07-09

Publisher

ACM

Abstract

Introductory programming courses have been growing rapidly, now enrolling hundreds or thousands of students. In such large courses, it can be overwhelmingly difficult for instructors to understand class-wide problem-solving patterns or issues, even though that understanding is crucial for improving instruction and addressing important pedagogical challenges. In this paper, we propose a technique and system, CFlow, for creating understandable and navigable representations of code at scale. CFlow is able to represent thousands of code samples in a visualization that resembles a single code sample. CFlow creates scalable code representations by (1) clustering individual statements with similar semantic purposes, (2) presenting clustered statements in a way that maintains semantic relationships between statements, (3) representing the correctness of different variations as a histogram, and (4) allowing users to navigate through solutions interactively using semantic filters. With a multi-level view design, users can navigate both high-level patterns and low-level implementations. This is in contrast to prior tools that either limit their focus to isolated statements (and thus discard the surrounding context of those statements) or cluster entire code samples (which can lead to large numbers of clusters; for example, if there are n code features and m implementations of each, there can be m^n clusters). In a comparison study evaluating the effectiveness of CFlow, participants using CFlow spent only half the time identifying mistakes and recalled twice as many desired patterns from over 6,000 submissions.
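
The combinatorial argument in the abstract (n code features with m implementations each) can be illustrated with a small counting sketch. This is not CFlow's implementation; the feature names, the example statements, and both helper functions are hypothetical and exist only to show the arithmetic. It assumes that every combination of implementations actually occurs among the submissions, and that grouping at the statement level yields one group per implementation of each feature.

```python
from itertools import product


def whole_program_variants(features):
    """Enumerate every combination of one implementation per feature.

    Each combination stands for a distinct cluster when entire code
    samples are clustered (assumption: all combinations occur).
    """
    return list(product(*features.values()))


def statement_level_groups(features):
    """Count groups when each feature's implementations are grouped separately."""
    return sum(len(impls) for impls in features.values())


# Hypothetical problem with n = 3 features and m = 2 implementations of each.
features = {
    "initialize accumulator": ["total = 0", "total = int()"],
    "iterate over input": ["for loop", "while loop"],
    "update accumulator": ["total += x", "total = total + x"],
}

print(len(whole_program_variants(features)))  # 8, i.e. m^n = 2^3
print(statement_level_groups(features))       # 6, i.e. m * n = 2 * 3
```

Under these assumptions the gap widens quickly: with n = 10 features and m = 3 implementations of each, whole-sample clustering could produce 3^10 = 59,049 clusters, while statement-level grouping would produce 30 groups.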
