Demon702/robust_code_summary

Understanding code semantics in Code Summarization

EMNLP GenBench Workshop Paper: Understanding Code Semantics: An Evaluation of Transformer Models in Summarization

Repository Structure

  • data-modifier/ : Contains all the scripts for performing data transformations. Its subdirectories java, javascript, and python hold the scripts that transform code written in the respective language.
  • human-annotations/ : Contains the manual evaluation set (200 examples randomly sampled from the original dataset). The files are named after the programming language and the respective annotator.

Instructions to download clean data

  • pip install gsutil
  • gsutil -m cp -r "gs://sfr-codet5-data-research/data" .

Python

Scripts

  • corrupt_variables-python.py : Script to rename variable and function identifiers
  • change_comments.py : Script to add commented code to the original code.
  • add_code_after_return_statement.py : Script to add dead code to the original code.
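
The three transformations listed above can be illustrated with a minimal sketch built on Python's standard ast module (Python 3.9+ for ast.unparse). This is not the code from the repository scripts: the function names, the id_N naming scheme, and the unused_value statement are all hypothetical choices made for illustration, and the renaming ignores cases such as keyword arguments at call sites.

```python
import ast


class _Renamer(ast.NodeTransformer):
    """Replace every identifier found in `mapping` with its opaque alias."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)  # descend into parameters and body
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def rename_identifiers(source):
    """Rename functions, parameters, and assigned variables (hypothetical helper)."""
    tree = ast.parse(source)
    defined = set()
    # First pass: collect only names the snippet itself defines, so that
    # builtins such as len or print are left untouched.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            defined.add(node.id)
    mapping = {name: f"id_{i}" for i, name in enumerate(sorted(defined))}
    # Second pass: apply the mapping consistently to every occurrence.
    return ast.unparse(_Renamer(mapping).visit(tree))


class _DeadCodeAdder(ast.NodeTransformer):
    """Insert an unreachable assignment right after each return statement."""

    def visit_Return(self, node):
        dead = ast.parse("unused_value = 0").body[0]  # illustrative dead statement
        return [node, dead]


def add_dead_code_after_return(source):
    """Add dead code after every return (hypothetical helper)."""
    return ast.unparse(_DeadCodeAdder().visit(ast.parse(source)))


def add_commented_code(source, snippet):
    """Prepend `snippet` as comment lines; done on text because ast drops comments."""
    commented = "\n".join("# " + line for line in snippet.splitlines())
    return commented + "\n" + source
```

All three helpers leave the behaviour of the original function unchanged, which is the property the transformations rely on: a summarization model that understands semantics should produce the same summary before and after.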

How to Run

  • Ensure Python is installed on your system.
  • Install the required libraries by running the following command:
    • pip install -r requirements.txt
  • To run a particular script, use the following command:
    • python filename.py -i {input_data_directory} -o {output_data_directory}
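
As a rough sketch of how a script with this command-line shape might be organised, the example below parses the -i and -o options with argparse and applies a transformation to every file under the input directory. The helper names, the one-text-file-per-example layout, and the identity placeholder transform are assumptions for illustration only; the actual scripts and the format of the downloaded data may differ.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the -i / -o options shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Apply a code transformation to a data directory."
    )
    parser.add_argument("-i", dest="input_dir", required=True,
                        help="input data directory")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="output data directory")
    return parser.parse_args(argv)


def transform_directory(input_dir, output_dir, transform):
    """Apply `transform` to each file's text, mirroring the directory layout.

    Assumes every file is UTF-8 text (a hypothetical layout).
    """
    in_root, out_root = Path(input_dir), Path(output_dir)
    written = []
    for src in sorted(in_root.rglob("*")):
        if not src.is_file():
            continue
        dst = out_root / src.relative_to(in_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_text(transform(src.read_text(encoding="utf-8")),
                       encoding="utf-8")
        written.append(dst)
    return written


def main(argv=None):
    args = parse_args(argv)
    # Placeholder: an identity transform stands in for a real transformation.
    return transform_directory(args.input_dir, args.output_dir, lambda text: text)
```

Calling main() from an `if __name__ == "__main__":` guard would give the sketch the same invocation form as the command above.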

JavaScript

Scripts

  • parser.js : Script to rename variable and function identifiers
  • comment_code.js : Script to add commented code to the original code.
  • dead_code_parser.js : Script to add dead code to the original code.

How to Run

  • Ensure node and npm/yarn are installed on your system.
  • Install the required libraries:
    • npm install
  • To run a specific file, use the following command:
    • node file.js {input_data_json} {output_data_json}

Java

Scripts

  • Main.java : Script to generate the corrupted Java dataset (renamed identifiers) and the dataset with commented code added.
  • MethodNameModifier.class : Class for renaming method identifiers.
  • VariableNameModifier.class : Class for renaming variable identifiers.
  • java.jar : The jar file for running the Main.java code.

How to Run

  • Ensure Java is installed on your system.
  • To perform the data transformation for Java code, run the following command:
    • java -jar src/main/java/org/modifier/java.jar

Contributors

  • Abhilasha Lodha
  • Ankita Sahoo
  • Beena Kumari
  • Debanjan Mondal
