A toolkit for automatic code documentation generation. This repository takes unannotated code from the examples directory, trains a custom model, and generates readable documentation for each code file.
code2doc provides a pipeline to automatically generate documentation for Python code using a model trained on example scripts. The research paper included in this repository explains the motivation, methodology, and results.
Please refer to research_paper.pdf in this repository for a detailed explanation of the approach, experiments, and findings behind code2doc.
git clone https://github.com/YARE0909/code2doc.git
cd code2doc

It is recommended to use a virtual environment to avoid package conflicts:
python -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

Make sure pip is up to date, then install the dependencies:
pip install --upgrade pip
pip install -r requirements.txt

Run main.py to train the code documentation model.
python main.py

After training, generate documentation for the code files within the examples directory by running:
python generate.py

The generated documentation will be saved in the output directory.
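
If you prefer to run training and generation in one step, the two commands above can be chained from a small wrapper. This is a minimal sketch and not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes both scripts are run from the repository root without extra arguments.

```python
import subprocess
import sys


def build_commands(python_exe=sys.executable):
    """Return the two documented commands in the order they must run."""
    return [
        [python_exe, "main.py"],      # step 1: train the model
        [python_exe, "generate.py"],  # step 2: generate documentation for examples/
    ]


def run_pipeline(commands=None):
    """Run each command in turn, stopping at the first failure."""
    for command in commands or build_commands():
        # check=True raises CalledProcessError on a non-zero exit code,
        # so generation is never attempted after a failed training run.
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repository root, inside the activated virtual environment, should behave like running `python main.py` followed by `python generate.py`.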
code2doc/
├── examples/ # Input example codes (without docstrings)
├── images/ # Supporting images for this project
├── output/ # Generated documentation will appear here along with training outputs
├── eda.py # Exploratory data analysis script
├── generate.py # Script to generate documentation
├── main.py # Script to train the model
├── requirements.txt # Python dependencies
├── research_paper.pdf # Research paper describing the project
├── .gitignore
└── ...
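
Before training, it can help to confirm that the clone contains the entries the pipeline relies on. The sketch below is illustrative only and not part of the repository; the helper name `missing_entries` is hypothetical, and the list of expected entries is taken from the tree above.

```python
from pathlib import Path

# Entries taken from the project tree; output/ is left out on the
# assumption that it may not exist until the scripts have run.
EXPECTED_ENTRIES = ["examples", "main.py", "generate.py", "requirements.txt"]


def missing_entries(repo_root):
    """Return the expected entries that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing from the repository root:", ", ".join(missing))
    else:
        print("Repository layout looks complete.")
```

An empty result from `missing_entries(".")` means all four entries were found in the current directory.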
For full details on methodology and experiments, see research_paper.pdf.