Bugs in Java Analyzer  #82

@tammychau99

Description

  1. Issue: The total file count (numFiles) and the analyzed Java file count (numJavaFiles) are inaccurate. Both are running totals across every repo the JA has analyzed, not counts for the individual repository currently being analyzed.
  • File to fix: RepoTraversal.java
  • Proposed Fix: Inside the function traverseRepoForFileContent(String repoURL), at the beginning of the try-catch statement, set both variables to 0.
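
A minimal sketch of the reset described above. RepoCounterSketch is a hypothetical stand-in for RepoTraversal, and the extra filePaths parameter is an assumption made so the example runs without network access; the real method presumably fetches the file list from repoURL itself.

```java
import java.util.List;

// Hypothetical stand-in for RepoTraversal; only the counter handling is shown.
class RepoCounterSketch {
    private int numFiles;
    private int numJavaFiles;

    // Assumed shape: the real method takes only repoURL and looks up the files itself.
    void traverseRepoForFileContent(String repoURL, List<String> filePaths) {
        try {
            // Reset first so counts from a previously analyzed repo do not carry over.
            numFiles = 0;
            numJavaFiles = 0;
            for (String path : filePaths) {
                numFiles++;
                if (path.endsWith(".java")) {
                    numJavaFiles++;
                }
            }
        } catch (RuntimeException e) {
            System.err.println("Traversal failed for " + repoURL + ": " + e.getMessage());
        }
    }

    int getNumFiles() { return numFiles; }
    int getNumJavaFiles() { return numJavaFiles; }
}
```

With the reset in place, calling the method for a second repo reports only that repo's counts, even when the same object is reused.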
  2. Issue: The total file count (numFiles) counts both files (“blob”) and directories (“tree”).
  • File to fix: RepoTraversal.java
  • Proposed Fix: Inside the function traverseTreeForFileURLs(String repoURL), remove the original numFiles++ increment and add a conditional before the existing conditional in the for loop, so that numFiles is incremented only when the entry's type is a file (“blob”).
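
Roughly what the blob-only count could look like. TreeEntry and BlobCounterSketch are hypothetical names; the sketch assumes each tree entry has already been parsed into a path and a type string, which may not match how the JA actually holds the API response.

```java
import java.util.List;

// Hypothetical holder for one entry of a repository tree listing.
class TreeEntry {
    final String path;
    final String type; // "blob" for files, "tree" for directories

    TreeEntry(String path, String type) {
        this.path = path;
        this.type = type;
    }
}

class BlobCounterSketch {
    // Counts only file entries; directories and any other entry types are skipped.
    static int countFiles(List<TreeEntry> entries) {
        int numFiles = 0;
        for (TreeEntry entry : entries) {
            if ("blob".equals(entry.type)) {
                numFiles++;
            }
            // The existing conditional (e.g. collecting .java file URLs) would follow here.
        }
        return numFiles;
    }
}
```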
  3. Issue: The min and max indent counts for files are inaccurate, which in turn affects the min, max, and average indent counts for the repository. Since the variables are initialized to Integer.MIN_VALUE and Integer.MAX_VALUE, they sometimes never get set to a real value, producing extremely high or low numbers.
  • File to fix: WhiteSpaceParser.java
  • Proposed Fix: Once the min and max indent counts for files are fixed, I believe the counts for the repository will be fixed as well. The function parseIndents(String[] methodLines, MethodWhiteSpaceResults mwp) contains the logic for calculating the indents, spaces, and tabs. I printed the min and max indent values after the function finishes, and at times Integer.MIN_VALUE and/or Integer.MAX_VALUE is still the stored value because it was never changed in the for loop.
  • Note: The logic of the JA code is hard to follow. There isn't much documentation, and information is passed back and forth a lot, which makes it hard to track what is being sent and received.
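
One possible way to keep the sentinels from leaking, sketched under assumptions: IndentRangeSketch and indentRange are hypothetical names, not the real parseIndents, and the guess that the loop sometimes never updates the values because the input is empty or all blank has not been confirmed against WhiteSpaceParser.java. Each leading space or tab is counted as one indent character here.

```java
class IndentRangeSketch {
    // Returns {min, max} leading-whitespace width; {0, 0} when no non-blank line exists.
    static int[] indentRange(String[] methodLines) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        boolean sawLine = false;
        for (String line : methodLines) {
            if (line.trim().isEmpty()) {
                continue; // blank lines carry no indent information
            }
            int indent = 0;
            while (indent < line.length()
                    && (line.charAt(indent) == ' ' || line.charAt(indent) == '\t')) {
                indent++;
            }
            min = Math.min(min, indent);
            max = Math.max(max, indent);
            sawLine = true;
        }
        // Guard: never hand the sentinel values back to the caller.
        if (!sawLine) {
            return new int[] {0, 0};
        }
        return new int[] {min, max};
    }
}
```

Whether 0 is the right fallback, or whether such methods should be left out of the repository-level min, max, and average entirely, is a design choice for whoever picks this up.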
  4. Issue: Values needed for analysis are not collected:
    Owner Name (owner.login)
    Owner ID (owner.node_id)
    Owner Type (owner.type)
    Repo ID (node_id)
    Repo Name (name)
    Repo Privacy Setting (private)
    Repo Language (language)
    File ID (repo_analysis.<file_name>.node_id)
  • Proposed Fix: Find where in the JA code these attributes can be pulled, and add them to the JSON that gets stored in MongoDB.
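
A rough sketch of how the repo-level values listed above could be grouped before being serialized. RepoMetadataSketch and its build method are hypothetical; the key names mirror the field paths in the list, and the JA's existing JSON layout may call for different nesting. The per-file node_id is left out because it belongs with each file's entry under repo_analysis.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RepoMetadataSketch {
    // Builds a nested map that mirrors the field paths listed in the issue.
    static Map<String, Object> build(String ownerLogin, String ownerNodeId, String ownerType,
                                     String repoNodeId, String repoName,
                                     boolean isPrivate, String language) {
        Map<String, Object> owner = new LinkedHashMap<>();
        owner.put("login", ownerLogin);
        owner.put("node_id", ownerNodeId);
        owner.put("type", ownerType);

        Map<String, Object> repo = new LinkedHashMap<>();
        repo.put("owner", owner);
        repo.put("node_id", repoNodeId);
        repo.put("name", repoName);
        repo.put("private", isPrivate);
        repo.put("language", language);
        return repo;
    }
}
```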

Note: Looking at the numbers, I'm not certain, but I think the other values the JA generates are okay. Since all of this is being done without any libraries, it may be easier to create a JA in Python that has the user state which code style they want to follow and then produces numbers based on that style.
