Predicting GitHub Issue Priorities

The purpose of this project was to develop a model that could accurately classify a GitHub issue as low, medium, or high priority. We used TF-IDF to turn issue text into features for the model. Note that we only considered issues that were explicitly labeled high, medium, or low priority in our dataset, and we assigned them the labels 1, 0, and -1 respectively.
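As a rough illustration of the two steps described above, the sketch below encodes priorities as 1, 0, and -1 and computes plain TF-IDF weights using only the standard library. The function names, the whitespace tokenizer, the unsmoothed IDF formula, and the example issues are all assumptions made for illustration; the project's own tfidf.py and label_data.py may work differently.

```python
import math
from collections import Counter

# Label encoding described in this README: high -> 1, medium -> 0, low -> -1.
PRIORITY_TO_LABEL = {"high": 1, "medium": 0, "low": -1}


def encode_priority(priority):
    """Return the numeric label, or None when there is no explicit priority."""
    return PRIORITY_TO_LABEL.get(priority.strip().lower())


def tfidf(documents):
    """Return one {term: weight} dict per document (no smoothing, no normalization)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example issues, not taken from the project's dataset.
issues = [
    ("app crashes on startup", "high"),
    ("typo in docs", "low"),
    ("app is slow on startup", "medium"),
]
features = tfidf([text for text, _ in issues])
labels = [encode_priority(priority) for _, priority in issues]
```

Terms that appear in only one issue (such as "crashes") receive a higher weight than terms shared across issues (such as "app"), which is what lets a classifier pick out distinguishing words.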

How to use this project: run the model.py file, for example with python model.py from the repository root.

Possible Improvements: We need to figure out how to pass both the title and the body as features; right now the model can only use one of them. This was also supposed to be a command-line tool, but due to time constraints and one of our members having to leave early, we couldn't implement that.
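One simple way to use both fields, sketched here as a possibility and not as something the project implements, is to merge the title and body into a single text before TF-IDF, tagging each token with the field it came from. The function name combine_fields and the title_/body_ prefixes are hypothetical.

```python
def combine_fields(title, body, title_prefix="title_", body_prefix="body_"):
    """Merge title and body into one string, tagging each token with its field."""
    title_tokens = [title_prefix + token for token in title.lower().split()]
    body_tokens = [body_prefix + token for token in body.lower().split()]
    return " ".join(title_tokens + body_tokens)


# Hypothetical issue text, used only to show the shape of the output.
combined = combine_fields("Crash on login", "The app crashes when I log in")
```

Because the prefixes keep title words and body words as separate vocabulary entries, the model can weight a word in the title differently from the same word in the body, while the rest of the pipeline still receives a single text column.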

Accuracy: The model is around 55% accurate. That is a meaningful improvement over the roughly 33% that random guessing among three classes would yield, but there is clearly room to do better.
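When comparing against a baseline, it can also help to check how often the most common class occurs, since always predicting that class can beat uniform random guessing if the labels are imbalanced. The sketch below shows both calculations; the function names and the label lists are made up for illustration and are not the project's data or results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return matches / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels using the 1 / 0 / -1 encoding from this README.
y_true = [1, 1, 0, -1]
y_pred = [1, 0, 0, 0]
model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline_accuracy(y_true)
```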

Link to pipeline design


bharddwaj/GithubIssuePrioritizer
