From 65afa572eb721a507e1339d33a521bbd3332a938 Mon Sep 17 00:00:00 2001
From: lemon-gith <77743782+lemon-gith@users.noreply.github.com>
Date: Mon, 26 Jan 2026 13:58:50 +0000
Subject: [PATCH 1/4] update escape and single-line comment rules

also inadvertently trimmed a trailing space on L27, and added changelog comments?
---
 1-regexes/readme.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/1-regexes/readme.md b/1-regexes/readme.md
index 7cc6ac0..c971396 100755
--- a/1-regexes/readme.md
+++ b/1-regexes/readme.md
@@ -5,6 +5,8 @@ This lab is about lexers and regular expressions. It is intended to give you eno
 Changelog
 ---------
+- **26-Jan-2026#3:** For 267 cohort - Reworded `\` rule to only ignore the following character, instead of up to the next whitespace
+- **26-Jan-2026#2:** For 267 cohort - Reworded `//` rule to not remove final newline of a line (as this can lead to awkward behaviours)
 - **26-Jan-2026:** Added clarification that the "Number of comments" message does not start a new line if the preceding `\n` was deleted when a comment was removed.
 - **23-Jan-2025:** Added clarification that comments inside attributes should not count towards the number of comments removed.
@@ -14,15 +16,15 @@ Specification
 Write a tool using [Flex](https://www.cs.virginia.edu/~cr4bd/flex-manual/index.html) that reads a stream of ASCII characters, and processes it, character by character, by applying the rules below. In what follows, a _line_ is defined as any maximal sub-sequence of the stream whose last character is a `newline` and which does not contain any other `newline` characters. You may assume that the final character in the whole stream is a `newline`.
-- The `//` sequence indicates the beginning of a _comment_. If `//` is encountered, remove it and the rest of the line.
+- The `//` sequence indicates the beginning of a _comment_. If `//` is encountered, remove it and all non-newline characters contained in the rest of the line.
-- The `\` character indicates the beginning of an _escaped identifier_. If `\` is encountered, ignore it and move to the next space or `newline` in the stream.
+- The `\` character indicates the beginning of an _escaped identifier_. If `\` is encountered, ignore the following character and move to the next next character in the stream, e.g. `\abc` -> ignore `a` and continue lexing from `b`
 - The `(*` sequence indicates the beginning of an _attribute_. If `(*` is encountered, remove it, and then remove all characters up to and including the end of the attribute. The end of the attribute is the next occurrence of `*)` that is _not_ part of an escaped identifier and is _not_ part of a comment. The end of the attribute may either be on the same line or on a subsequent line. If there is no end of the attribute, then this rule does not apply.
 - If any other character is encountered, ignore it and move to the next character in the stream.
 Finally, the tool should add a line to the end of its output that says `Number of comments and attributes removed: n.` where `n` is the number of comments and attributes that have been removed. _[**Edit 23-Jan-2025:** If a comment is nested inside an attribute, then it is automatically removed when the attribute is removed, and does not need removing explicitly; therefore, it doesn't count towards the number of comments removed.
 The [result](test/ref/09.stdout.txt) of [test 9](test/in/09.txt) clarifies this behaviour.]_

From e9a664c9f8af0b6c46b60bcbd575a17621aea201 Mon Sep 17 00:00:00 2001
From: lemon-gith <77743782+lemon-gith@users.noreply.github.com>
Date: Mon, 26 Jan 2026 14:00:14 +0000
Subject: [PATCH 2/4] add 2 newlines to ref

// no longer removes the newlines
---
 1-regexes/test/ref/02.stdout.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/1-regexes/test/ref/02.stdout.txt b/1-regexes/test/ref/02.stdout.txt
index ed95626..e2664cd 100755
--- a/1-regexes/test/ref/02.stdout.txt
+++ b/1-regexes/test/ref/02.stdout.txt
@@ -1,3 +1,5 @@
 Hello
+
+
 world
 Number of comments and attributes removed: 2.

From b0e2ebe3adc1ed6f6f74c362be89939239809ba8 Mon Sep 17 00:00:00 2001
From: lemon-gith <77743782+lemon-gith@users.noreply.github.com>
Date: Mon, 26 Jan 2026 14:00:53 +0000
Subject: [PATCH 3/4] add 1 newline to ref

// no longer removes that newline
---
 1-regexes/test/ref/05.stdout.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/1-regexes/test/ref/05.stdout.txt b/1-regexes/test/ref/05.stdout.txt
index aa18710..0c8a831 100755
--- a/1-regexes/test/ref/05.stdout.txt
+++ b/1-regexes/test/ref/05.stdout.txt
@@ -1,2 +1,3 @@
-Hello world
+Hello
+world
 Number of comments and attributes removed: 1.

From f30c6850a0420fd60090f76393ba0f21125b6f6f Mon Sep 17 00:00:00 2001
From: lemon-gith <77743782+lemon-gith@users.noreply.github.com>
Date: Mon, 26 Jan 2026 14:01:14 +0000
Subject: [PATCH 4/4] add 1 newline to ref 10

// no longer removes that newline
---
 1-regexes/test/ref/10.stdout.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/1-regexes/test/ref/10.stdout.txt b/1-regexes/test/ref/10.stdout.txt
index e151450..abdb20f 100755
--- a/1-regexes/test/ref/10.stdout.txt
+++ b/1-regexes/test/ref/10.stdout.txt
@@ -1,6 +1,7 @@
 module foo ()
 wire \hello;
 wire \//go away;
- wire there;
+
+ wire there;
 endmodule
 Number of comments and attributes removed: 2.
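
Review note: the reworded `//`, `\` and `(*` rules from PATCH 1/4 can be sanity-checked against a small model written outside Flex. The sketch below is in Python, not Flex, and is only one possible reading of the spec, not the reference implementation. The names `strip_comments_and_attributes`, `_skip_comment` and `_find_attribute_end` are hypothetical, and the sample inputs are made up rather than taken from `test/in/`. It also assumes that the closing `*)` is searched for only after the opening `(*`, and that the summary line ends with a newline.

```python
def _skip_comment(text: str, i: int) -> int:
    """Return the index of the newline ending the comment that starts at i."""
    j = text.find("\n", i)
    return len(text) if j == -1 else j


def _find_attribute_end(text: str, i: int):
    """Return the index just past the closing `*)`, or None if there is none.

    Escaped pairs and comments are skipped, so a `*)` inside them does not count.
    """
    n = len(text)
    while i < n:
        if text[i] == "\\":
            i += 2                       # backslash plus the escaped character
        elif text.startswith("//", i):
            i = _skip_comment(text, i)   # comment body cannot close the attribute
        elif text.startswith("*)", i):
            return i + 2
        else:
            i += 1
    return None


def strip_comments_and_attributes(text: str) -> str:
    out = []
    removed = 0
    i = 0
    while i < len(text):
        if text[i] == "\\":
            # New `\` rule: pass through the backslash and exactly one more character.
            out.append(text[i:i + 2])
            i += 2
        elif text.startswith("//", i):
            # New `//` rule: drop up to, but not including, the newline.
            removed += 1
            i = _skip_comment(text, i)
        elif text.startswith("(*", i):
            end = _find_attribute_end(text, i + 2)
            if end is None:
                out.append(text[i])      # unterminated: the rule does not apply
                i += 1
            else:
                removed += 1             # nested comments are not counted separately
                i = end
        else:
            out.append(text[i])
            i += 1
    out.append(f"Number of comments and attributes removed: {removed}.\n")
    return "".join(out)


if __name__ == "__main__":
    print(strip_comments_and_attributes("Hello// bye\nworld\n"), end="")
```

With this reading, a line such as `wire \//go away;` passes through unchanged, because `\/` is consumed as an escaped pair and the single `/` that follows no longer starts a comment. That matches the third line of `ref/10.stdout.txt` in PATCH 4/4.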