This project efficiently computes code complexity for a given repository, cycling through every commit in the repo and distributing the work across a set of nodes to minimise the time from submission to result.
I used the Cloud Haskell and Argon libraries to distribute the work among worker nodes and to compute the cyclomatic complexity of each .hs file at each commit in the given repository.
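As a rough illustration of the worker side, here is a minimal Cloud Haskell sketch (not the project's actual code): each worker blocks on `expect` for a batch of file paths, scores each file, and sends the results back to the master. `complexityOf` is a hypothetical stand-in for the call into Argon.

```haskell
module Worker where

import Control.Distributed.Process
import Control.Monad (forever)
import Control.Monad.IO.Class (liftIO)

-- Wait for a batch of file paths from the master, score each file,
-- and send the results straight back.
worker :: ProcessId -> Process ()
worker master = forever $ do
    paths   <- expect :: Process [FilePath]
    results <- liftIO (mapM complexityOf paths)
    send master results

-- Hypothetical stand-in: the real project would call into Argon here.
complexityOf :: FilePath -> IO (FilePath, Int)
complexityOf path = return (path, 0)  -- placeholder complexity score
```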
To obtain the repository’s working folder, I clone the repo into a local folder (and remove it once the work is completed). I then recursively crawl through the folder, collect the absolute path of every file in the repo directory, and filter the list down to .hs files, since those are the only files Argon can analyse.
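The crawl-and-filter step could look something like the following sketch using only the standard `directory` and `filepath` packages (the names are illustrative, not necessarily the project's):

```haskell
import System.Directory (doesDirectoryExist, listDirectory, makeAbsolute)
import System.FilePath ((</>), takeExtension)

-- Recursively collect the absolute path of every file under 'dir',
-- then keep only the Haskell sources that Argon can analyse.
haskellFiles :: FilePath -> IO [FilePath]
haskellFiles dir = do
    absDir <- makeAbsolute dir
    filter ((== ".hs") . takeExtension) <$> crawl absDir

crawl :: FilePath -> IO [FilePath]
crawl dir = do
    entries <- map (dir </>) <$> listDirectory dir
    paths <- mapM (\p -> do
                      isDir <- doesDirectoryExist p
                      if isDir then crawl p else return [p])
                  entries
    return (concat paths)
```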
Once these paths are obtained, I send them to the workers, which compute the complexities and return the results; this is repeated for every commit in the repo.
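Putting it together, the per-commit driver loop might look roughly like the sketch below. It assumes the worker protocol from the earlier sketch, reuses the hypothetical `haskellFiles` crawler, and walks the history with `git rev-list` and `git checkout`; the project's real orchestration may differ.

```haskell
import Control.Distributed.Process
import Control.Monad (forM_)
import Control.Monad.IO.Class (liftIO)
import System.Process (callProcess, readProcess)

-- For every commit: check it out, gather the .hs files, deal them out
-- to the workers, and collect one reply per worker.
master :: FilePath -> [ProcessId] -> Process ()
master repoDir workers = do
    commits <- liftIO $ lines <$> readProcess "git" ["-C", repoDir, "rev-list", "--all"] ""
    forM_ commits $ \sha -> do
        liftIO $ callProcess "git" ["-C", repoDir, "checkout", "--quiet", sha]
        files <- liftIO (haskellFiles repoDir)   -- crawler from the sketch above
        mapM_ (uncurry send) (zip workers (splitAmong (length workers) files))
        replies <- mapM (const expect) workers :: Process [[(FilePath, Int)]]
        liftIO $ print (sha, concat replies)

-- Deal items out round-robin so each worker gets a similar share.
splitAmong :: Int -> [a] -> [[a]]
splitAmong n xs = [ [x | (i, x) <- zip [0 ..] xs, i `mod` n == k] | k <- [0 .. n - 1] ]
```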
- Clone this repo: `git clone https://craig1901/complexity_api`
- `stack build` inside the directory
- `bash workers.sh` to create the workers (you can open this file to add more or use fewer)
- `bash run.sh <repo to compute cyclomatic complexity>`
- View your results!
To evaluate this project, I measured the time it took to compute the cyclomatic complexities for all commits of a given repo. As the table below shows, using workers for this task pays off: there is a drastic drop in completion time when increasing from 1 worker to 2. This project was developed locally, so all workers were executed on the same machine, which is not the intended use of a distributed system such as this; if the workers ran on different machines, the completion time should decrease even further as the worker count increased.
Results based on this repository:
| # of Workers | Time (s) |
|---|---|
| 1 | 85.32 |
| 2 | 55.42 |
| 3 | 53.62 |
| 4 | 51.81 |
| 5 | 51.47 |
| 6 | 49.53 |
| 7 | 48.93 |
| 8 | 49.09 |
| 9 | 48.33 |
| 10 | 48.73 |
