-
Notifications
You must be signed in to change notification settings - Fork 2
Redo commented code 2 #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redo commented code 2 #11
Conversation
Maillman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updated check is working really well against all of the student code that I've ran it against! It's even catching commented code that the previous check wasn't catching because of empty comment lines inserted in the commented code. Like this commented code that I found when I ran it on a certain student's code:
//Session session;
//URI socketURI = new URI(url+"/ws");
// this.notificationHandler = notificationHandler;
//
// WebSocketContainer container = ContainerProvider.getWebSocketContainer();
//
// this.session = container.connectToServer(this,socketURI);
I also ran it against the autograder code under src/. It unfortunately marked https://github.com/softwareconstruction240/autograder/blob/edcbd89b68d4fdd58d0181363d0cf308ed6af8e0/src/main/java/edu/byu/cs/service/ConfigService.java#L95-L97 as a "3 comment lines likely containing commented code (75% confidence)" and https://github.com/softwareconstruction240/autograder/blob/edcbd89b68d4fdd58d0181363d0cf308ed6af8e0/src/main/java/edu/byu/cs/service/ConfigService.java#L121-L123 as a "3 comment lines likely containing commented code (75% confidence)".
It seems to include the empty comment lines as part of the "likely containing commented code" when I don't think it should.
Could you make it so it doesn't include empty lines as part of the check?
…king reserved words
|
Yep, Thank you for catching that |
| static { | ||
| ENTIRE_COMMENT_CODE_PATTERNS.put(Pattern.compile("[\\s\\S]*(?:if|for|while)\\s*\\([\\s\\S]+\\)\\s*\\{[\\s\\S]*}[\\s\\S]*"), 0.9); //simple if/for/while | ||
| ENTIRE_COMMENT_CODE_PATTERNS.put(Pattern.compile("[\\s\\S]*if\\s*\\([\\s\\S]+\\)\\s*\\{[\\s\\S]*}\\s*else(?:\\sif\\s*\\([\\s\\S]+\\))?\\s*\\{[\\s\\S]*}[\\s\\S]*"), 0.95); //if/else or if/elseif | ||
| ENTIRE_COMMENT_CODE_PATTERNS.put(Pattern.compile("^(?:[\\s\\S]*\\n\\s*)?(?:private|public|protected)?\\s+\\w+\\s+\\w+\\s*\\([\\s\\S]*\\)\\s*\\{[\\s\\S]*}[\\s\\S]*$"), 0.9); //method declaration | ||
|
|
||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^(?:if|while)\\s*\\([\\s\\S]+\\)\\s*\\{?\\s$"), 0.85); //simple if/while | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^for\\s*\\([\\s\\S]*;[\\s\\S]*;[\\s\\S]*\\)\\s*\\{?\\s$"), 0.9); //start of for loop | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^\\w+\\s+\\w+\\s*=[\\s\\S]*$"), 0.9); //variable declaration | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^\\w+\\s*(?:\\+|-|\\*|/|%|&|\\^|\\||<<|>>|>>>)?=[\\s\\S]*$"), 0.85); //variable assignment | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^(?:\\w+(?:\\(.*\\))?\\.)*\\w+\\(.*\\);$"), 0.85); //method call | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^return\\s+.*;$"), 0.9); // return statements | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^\\w+(?:--|\\+\\+);?$"), 0.9); //increment/decrement | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^.*;$"), 0.8); //end semicolon | ||
| SINGLE_LINE_CODE_PATTERNS.put(Pattern.compile("^}$"), 0.6); //only end curly | ||
|
|
||
|
|
||
| ENTIRE_COMMENT_NONCODE_PATTERNS.put(Pattern.compile("^(?:\\*.*\n)*\\*.*?$"), 0.85); //Javadoc comment | ||
| ENTIRE_COMMENT_NONCODE_PATTERNS.put(Pattern.compile("^\\{(?:\\s*\"\\w+\":\\s*(?:\"?\\w+\"?|\"\"|\\[[\\s\\S]*]|\\{[\\s\\S]*}),?\\s*)+}$"), 0.75); //JSON object | ||
|
|
||
| SINGLE_LINE_NONCODE_PATTERNS.put(Pattern.compile("^\\*.*$"), 0.65); //start of javadoc comment line | ||
| SINGLE_LINE_NONCODE_PATTERNS.put(Pattern.compile("^\\*\\s+@(?:param|return|throws)\\s+.*$"), 0.8); //start of javadoc comment line | ||
| SINGLE_LINE_NONCODE_PATTERNS.put(Pattern.compile("^\"\\w+\":\\s*(?:\"?[\\w: ]+\"?|\"\"),?$"), 0.65); //individual line of json field | ||
| SINGLE_LINE_NONCODE_PATTERNS.put(Pattern.compile("^(?i)(?:TODO|FIXME):.*$"), 0.8); | ||
| SINGLE_LINE_NONCODE_PATTERNS.put(Pattern.compile("^NOTE:.*$"), 0.7); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Maillman Are there any other code patterns or common non-code patterns you think we should be checking for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is robust enough, and already significantly better than the previous commented code check. I don't think there's anything else I would add.
Maillman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I'll get this merged in and make a PR for the autograder to the updated checkstyle jar.
Second improvement to the commented code check. Does more in-depth regex scanning (including regex for negative examples that shouldn't be counted as code, most importantly javadoc comments/documentation) rather than the simple approach of checking the end character of each line.
I'd recommend playing with this, maybe changing some of the default variables, and running it on real student code before merging (I no longer have access to the code of students taking the course)