This repository contains the study materials for the paper "How Novices Use Program Visualizations to Understand Code that Manipulates Data Tables." The study investigates how novices interpret code with a visual tool, particularly in tasks involving data table manipulation.
| Finding | Description | Representative Quote |
|---|---|---|
| In-notebook visualizations were usable and useful | Participants appreciated that the tool visualized code within their familiar notebook environment, finding it easy to use and useful with minimal instruction. | "Yeah, since you actually do [the groupby and aggregate] step by step, I'm like, oh, I get it." (P4) |
| Visualizations helped to verify assumptions | The tool lowered the effort required to visualize intermediate program results, giving novices more confidence in their understanding of code. | "Good visualization gives me more hints and gives me a clear understanding of what's going on." (P1) |
| Revealing normally-hidden table rows can highlight salient information | Participants noted that the default output only shows the top and bottom rows of a dataframe, while middle rows might contain relevant information. | "It's not that clear what has happened within each step since there might be 1,010,000 rows of information inside [hidden in the dataframe output]." (P8) |
| Visualization tools can generate complicated diagrams, but interaction can help users manage cognitive load | In complex cases, the tool generated diagrams with many visual elements (e.g., arrows), which were overwhelming. However, interactions enabling selective filtering were helpful for understanding. | "I see when you put it like this, it looks very complicated, but as soon as you put the allow hovers, the hovers make it a lot easier." (P11) |
| Limitations of the tool | The tool was better suited for understanding code than for writing code because it lacked direct support for debugging. | "I think it's better for checking instead of writing." (P4) |
Each study session is designed to last a maximum of 60 minutes.
| Task | Time (min) | Time out of 60 | Description |
|---|---|---|---|
| Intro | 5 | 0–5 | Introduce the study procedure and reconfirm participants' consent to record. |
| Section 1 | 20 | 5–25 | 20 min total; not all tasks need to be completed. Hard stop at 20 minutes. |
| Task 1.1 | | | Complete task. |
| Task 1.2 | | | Complete task. |
| Task 1.3 | | | Complete task. |
| Section 2 | 20 | 25–45 | 20 min total; not all tasks need to be completed. Hard stop at 20 minutes. |
| Task 2.1 | | | Complete task. |
| Task 2.2 | | | Complete task. |
| Task 2.3 | | | Complete task. |
| Open-Ended | 5 | 45–50 | Fill-in-the-blank coding task. Complete task. |
| Interview | 10 | 50–60 | Conduct an interview with four prepared questions and collect additional comments. |
The task notebooks used in the study are included in this repository under the task_notebooks folder. Participants were assigned one of four versions of a task notebook. Each notebook contained two sections with three tasks each, plus a fill-in-the-blank task at the end. The two sections in the same notebook use different but structurally similar datasets. One section provides step-by-step diagrams for the code, while the other provides the default Python output. The three tasks in the two sections correspond one-to-one, with similar topics and structures. Tasks are randomly shuffled within each section and named with Greek letters. (An illustrative sketch of the style of table-manipulation code used in the tasks follows the tables below.)
| Version | First Section | Dataset | Second Section | Dataset |
|---|---|---|---|---|
| A | Default Output | Sales | Step-By-Step Diagram | Student |
| B | Default Output | Student | Step-By-Step Diagram | Sales |
| C | Step-By-Step Diagram | Sales | Default Output | Student |
| D | Step-By-Step Diagram | Student | Default Output | Sales |
| Task | Sales Task | Student Task |
|---|---|---|
| Task 1 | epsilon | theta |
| Task 2 | zeta | gamma |
| Task 3 | lambda | kappa |
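For context, the tasks center on pandas-style table manipulation, such as the groupby-and-aggregate chains mentioned in the findings above. The snippet below is only a minimal sketch of that style of code; the dataset, column names, and aggregation are invented for illustration and are not taken from the actual task notebooks.

```python
import pandas as pd

# Hypothetical sales-like table; the real task datasets and columns may differ.
sales = pd.DataFrame({
    "region":  ["North", "South", "North", "South", "North"],
    "product": ["A", "A", "B", "B", "A"],
    "revenue": [120, 90, 200, 150, 80],
})

# The kind of groupby-and-aggregate pipeline participants were asked to read:
# group rows by region, then sum the revenue within each group.
totals = sales.groupby("region")["revenue"].sum()
print(totals)
```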
Participants were asked the following questions during the interview phase of the study:
- How did you find the tasks? Were any particularly easy or difficult?
- Did the visualizations help you understand the code better? If yes, how?
- How comfortable do you feel working with code in general?
- Did you have any strategies for approaching the open-ended task?
- Any other feedback, comments, or questions?
Participants completed each task at their own pace, verbalizing their thought process. Afterward, they described their understanding of the code and recorded their confidence level in a Google form (Template).
participant_data.csv contains all participant data from the study, including:
- Demographic Information: Age, year level, current programming courses, and experience with Python and Pandas.
- Task Information: Task order and correctness scores for each participant.
- Confidence Levels: Self-reported confidence scores for each task.
- Evaluation Metrics: Ratings for the usefulness and ease of creating visualizations.
This dataset can be used to reproduce the study's analyses or perform further exploratory research.
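As a starting point for reproducing the analyses, the sketch below loads the CSV with pandas and prints basic summaries. It assumes only that participant_data.csv sits at the repository root; no specific column names are relied on, since the actual headers should be checked in the file itself.

```python
import pandas as pd

# Load the participant data from the repository root (path is an assumption).
df = pd.read_csv("participant_data.csv")

# Basic exploratory summaries before further analysis.
print(df.head())      # inspect the first few rows
print(df.dtypes)      # column names and types
print(df.describe())  # summary statistics for numeric columns
```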
- Sam Lau (GitHub: @SamLau95, Email: sel011@ucsd.edu)
- Ylesia Wu (GitHub: @ylesia-wu, Email: xw001@ucsd.edu)
- Qirui Zheng (GitHub: @Qz07, Email: q7zheng@ucsd.edu)