Overview
Most of the content in this Wiki page was extracted from the Graduate Work paper of Flávio Juvenal (@fjsj), Groundhog's initial developer.
This document can help you understand Groundhog: what it does and how it works.
In Groundhog, the integration flow is linear: for every project to be analyzed, the Search module is executed first, then Crawler, then CodeHistory, and finally Parser. In some cases it is also necessary to execute another module, the Extractor. Below is a brief overview of the responsibilities of each module:
- **Search** – browses the forges' web pages or official APIs and fetches information about the projects for later download by the **Crawler** module.
- **Crawler** – uses the project information obtained by the **Search** module and downloads the projects' source code through the forges' pages or their VCSs.
- **CodeHistory** – obtains the most recent version of a project's source code within a date of interest, so that its metrics can be extracted by the **Parser** module.
- **Parser** – analyzes each file in a project's source code, builds an AST, and traverses it to extract metrics.
- **Extractor** – in some cases the source code files are compressed, and this module extracts them in order to make the source code analysis possible.
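The linear flow described above can be sketched in Java as a chain of modules, each consuming the previous module's output. The interfaces, method names, and file-extension check below are hypothetical illustrations, not Groundhog's actual API:

```java
import java.util.List;

// A minimal sketch of Groundhog's linear integration flow.
// All interfaces and signatures here are hypothetical, for illustration only.
public class PipelineSketch {

    interface SearchModule    { List<String> search(String forge); }       // project metadata
    interface CrawlerModule   { String download(String projectInfo); }     // path to downloaded code
    interface ExtractorModule { String extract(String archivePath); }      // path to decompressed code
    interface CodeHistory     { String checkout(String srcPath, String date); }
    interface ParserModule    { String parse(String srcPath); }            // extracted metrics

    static String analyze(String forge, String date,
                          SearchModule search, CrawlerModule crawler,
                          ExtractorModule extractor, CodeHistory history,
                          ParserModule parser) {
        StringBuilder metrics = new StringBuilder();
        for (String project : search.search(forge)) {        // 1. Search: find projects
            String path = crawler.download(project);         // 2. Crawler: fetch source code
            if (path.endsWith(".zip") || path.endsWith(".tar.gz")) {
                path = extractor.extract(path);              //    Extractor: only if compressed
            }
            String snapshot = history.checkout(path, date);  // 3. CodeHistory: version at date
            metrics.append(parser.parse(snapshot));          // 4. Parser: AST-based metrics
        }
        return metrics.toString();
    }
}
```

Each stage depends only on the previous stage's output, which is what makes the flow strictly linear; the Extractor is the only conditional step.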