This repository contains the Data Collection Pipeline project for Model-Driven Development.
To fetch commits and export them to a CSV file, run:

    python -m src.repo_miner fetch-commits --repo owner/repo [--max 100] --out commits.csv

- Replace `owner/repo` with the GitHub repository you want to analyze.
- The `--max` flag is optional and limits the number of commits fetched.
- The `--out` flag specifies the output file (e.g., `commits.csv`).

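After the export finishes, it can be useful to check what ended up in the file. The following is a minimal sketch, not part of this repository: `preview_csv` is a hypothetical helper, and it assumes only that the output is a UTF-8 CSV file whose first row is a header. It makes no assumption about the column names the pipeline writes.

```python
import csv
from itertools import islice


def preview_csv(path, limit=5):
    """Return (header, first rows) from a CSV file without assuming its columns.

    Hypothetical helper for illustration; assumes the first row is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # empty list if the file has no rows
        rows = list(islice(reader, limit))  # at most `limit` data rows
    return header, rows
```

Calling `preview_csv("commits.csv")` from the directory containing the export returns the header and up to five data rows, which is usually enough to confirm the fetch produced what you expected.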
To fetch issues and export them to a CSV file, run:

    python -m src.repo_miner fetch-issues --repo owner/repo [--state all|open|closed] [--max 50] --out issues.csv

- Replace `owner/repo` with the GitHub repository you want to analyze.
- The `--state` flag is optional and filters issues by status (`all`, `open`, or `closed`). The default is `all`.
- The `--max` flag is optional and limits the number of issues fetched.
- The `--out` flag specifies the output file (e.g., `issues.csv`).

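To make the relationship between the flags concrete, here is one way the `fetch-commits` and `fetch-issues` options described above could be declared with `argparse`. This is an illustrative sketch only: `build_parser` and the `max_items` attribute name are assumptions, and the actual parser in `src.repo_miner` may be structured differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags (not the project's code)."""
    parser = argparse.ArgumentParser(prog="repo_miner")
    sub = parser.add_subparsers(dest="command", required=True)

    # fetch-commits: --repo and --out are required, --max is optional
    commits = sub.add_parser("fetch-commits")
    commits.add_argument("--repo", required=True)
    commits.add_argument("--max", type=int, default=None, dest="max_items")
    commits.add_argument("--out", required=True)

    # fetch-issues: same flags plus an optional --state filter defaulting to "all"
    issues = sub.add_parser("fetch-issues")
    issues.add_argument("--repo", required=True)
    issues.add_argument("--state", choices=["all", "open", "closed"], default="all")
    issues.add_argument("--max", type=int, default=None, dest="max_items")
    issues.add_argument("--out", required=True)
    return parser
```

With this layout, omitting `--state` yields `all`, and an unsupported value such as `--state merged` is rejected by `argparse` before any request is made.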
To summarize the fetched commit and issue data, run:

    python -m src.repo_miner summarize --commits commits.csv --issues issues.csv

- The `--commits` flag specifies the path to the CSV file containing commit data.
- The `--issues` flag specifies the path to the CSV file containing issue data.

Note: Depending on your configuration, you may need to use python3 instead of python.
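If you drive the pipeline from another Python script, one way to sidestep the `python` versus `python3` question is to reuse the interpreter that is already running. The sketch below is an assumption-laden illustration: `build_command` is a hypothetical helper, not something this repository provides.

```python
import sys


def build_command(subcommand, *options):
    """Build an argv list that runs src.repo_miner with the current interpreter.

    Hypothetical helper for illustration only.
    """
    # sys.executable is the path of the Python interpreter running this script
    return [sys.executable, "-m", "src.repo_miner", subcommand, *options]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the repository root, for example `build_command("fetch-commits", "--repo", "owner/repo", "--out", "commits.csv")`.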
If you encounter missing dependency warnings, install the required packages:

    pip install -r requirements.txt

Or, on some systems:

    pip3 install -r requirements.txt

Run all tests using:

    pytest

For more detailed output, use the verbose flag:

    pytest -v