You'll be looking at code changes (shown as diffs) and answering two simple questions about each change:
- Does the changed program behave identically to the original program for all inputs?
- On a scale of 1-5, how realistic is this code change?
Look for changes that would make the program do something different. If the change is in code that never executes, or if it is mathematically equivalent to the original, it does not affect behavior.
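One practical way to look for a behavior change is to run both versions on the same inputs and see whether they ever disagree. The sketch below is only an illustration: `original`, `changed`, and `find_difference` are hypothetical names, not part of the labeling tool, and the two functions mirror the `x < 10` versus `x <= 10` example used in this guide. Finding no difference on a sample of inputs does not prove the programs are identical; finding one does prove they differ.

```python
def original(x):
    # Mirrors the "Original" snippet: falls through to None when x >= 10.
    if x < 10:
        return True


def changed(x):
    # Mirrors the "Changed version" snippet: now also returns True for x == 10.
    if x <= 10:
        return True


def find_difference(f, g, inputs):
    """Return the first input on which f and g disagree, or None if none is found."""
    for value in inputs:
        if f(value) != g(value):
            return value
    return None


witness = find_difference(original, changed, range(-5, 20))
print(witness)  # 10
```

A single disagreeing input such as `10` is enough to answer 0 (behavior changed).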
✅ The programs have identical behavior (Answer 1):
# Original
if False:
    print("This never runs")
# Changed version
if False:
    print("This STILL never runs")  # Same behavior - dead code
❌ Change to program behavior (Answer 0):
# Original
if x < 10:
    return True
# Changed version
if x <= 10:  # Now includes x=10, different behavior
    return True
# Original
import pandas as pd
# Changed version - commenting out an import changes behavior if the import is used
# import pandas as pd
For the realism rating, think: "Could someone accidentally write this while coding?"
✅ Realistic mistake (Answer 4 or 5):
# Original
for i in range(len(items)):
# Changed version - Natural off-by-one error
for i in range(len(items) - 1):
❌ Unrealistic change (Answer 1 or 2):
# Original
def calculate_total(prices):
# Changed version - No developer would write this
def xkcd_random_gibberish_name(prices):
# Original
result = a + b
# Changed version - meaningless Python syntax
result = a @@ b
Key insight: Ask yourself "Would a real person accidentally type this?" Typos and common mistakes = realistic. Random gibberish = unrealistic.
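The two realism examples can also be checked mechanically. This is a minimal sketch, not part of the labeling tool: the `items` list and the helper `is_valid_python` are made up for illustration. It shows that the off-by-one loop silently skips the last element, while `a @@ b` is rejected by the parser before anything runs.

```python
import ast

items = ["a", "b", "c"]  # hypothetical sample data

# The original loop visits every index; the changed loop stops one short.
visited_original = [items[i] for i in range(len(items))]
visited_changed = [items[i] for i in range(len(items) - 1)]
print(visited_original)  # ['a', 'b', 'c']
print(visited_changed)   # ['a', 'b']


def is_valid_python(source):
    """Return True if the source parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python("result = a + b"))   # True
print(is_valid_python("result = a @@ b"))  # False
```

A change that still parses and quietly alters the result is the kind a developer could plausibly make; one that does not even parse is unlikely to survive a first run.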
- Install the repository and dependencies
  git clone https://github.com/Jirachiii/mutant_analysis.git
  pip install requests colorama
- Save your input file (e.g., test_sampled_mutants.json) inside the repository.
- Update the filename variable at line 7 of object_browser.py to match your input file name.
- Run the labeling program
  cd mutant_analysis
  python object_browser.py
- Examine the diff - A comparison between original and mutant code is displayed, such as:
 def __call__(self, data, groupby, orient, scales):
-    return (
-        groupby
-        .apply(data.dropna(subset=["x", "y"]), self._fit_predict)
-    )
+    return (
+        groupby
+        .apply(data, self._fit_predict)
+    )
- Understand the change - In this example, the change is in the line .apply(data.dropna(subset=["x", "y"]), self._fit_predict), which was changed to .apply(data, self._fit_predict).
- Answer two questions:
  - Do the programs behave the same? The change does affect program behavior, because the data can now include NaN values -> Choose 0 for NO
  - On a scale of 1 to 5, how realistic is this mistake? A developer can forget to remove NaN values before processing data, so this looks natural -> Choose 4 or 5 for STRONGLY NATURAL
- Below each mutant ID, you'll find a commit URL
- Click the URL to view the original code context on GitHub
- Use GitHub's "Search within code" to locate the relevant file
- Use "View file" to see the complete file for better context
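To see why dropping the NaN filter in the example diff counts as a behavior change, it can help to try a tiny stand-in. The sketch below uses only the standard library and a made-up `readings` list; it is an analogy for the general effect of NaN values on a computation, not the actual pandas or `_fit_predict` code, whose details are not shown here.

```python
import math


def mean(values):
    # Plain arithmetic mean; a single NaN makes the whole result NaN.
    return sum(values) / len(values)


readings = [1.0, 2.0, float("nan"), 3.0]  # hypothetical data with one missing value

# Analogous to the original code: filter out NaN before processing.
cleaned = [v for v in readings if not math.isnan(v)]

print(mean(cleaned))   # 2.0
print(mean(readings))  # nan
```

Any input containing a NaN gives a different result once the filter is removed, which is why the example is labeled 0 for behavior.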