Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds pattern-matched Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
bartblast
left a comment
There was a problem hiding this comment.
Thanks for the PR! The approach looks reasonable. Could you expand the test coverage? I think we should test both single and multiple hyphens for each tag name parsing case, and also cover some additional edge case scenarios:
- Start tag (single + multiple hyphens)
- Self-closing tag (single + multiple hyphens)
- End tag (single + multiple hyphens)
- Tag name starting with a hyphen, e.g.
<-foo> - Tag name ending with a hyphen, e.g.
<foo-> - Consecutive hyphens, e.g.
<foo--bar>
8f49a90 to
9dd1fc8
Compare
|
Good catch on the allowed names. Though, as far as I can tell they are unable to start with a |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/elixir/hologram/template/parser_test.exs (1)
114-161: Good test coverage for the new parsing behavior.The tests methodically cover single-hyphen, multi-hyphen, trailing-hyphen, and consecutive-hyphen cases for start tags, self-closing tags, and end tags.
Consider adding a test for a hyphenated element with attributes to verify the
prev_statusguard transition works end-to-end:test "hyphenated element start tag with attributes" do assert parse_markup(~s'<foo-bar class="x">') == [start_tag: {"foo-bar", [{"class", [text: "x"]}]}] endAnd a round-trip start+end pair:
test "hyphenated element start and end tags" do assert parse_markup("<foo-bar></foo-bar>") == [start_tag: {"foo-bar", []}, end_tag: "foo-bar"] end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 114 - 161, Add two tests to parser_test.exs to cover hyphenated tags with attributes and round-trip start/end behavior: create a test named "hyphenated element start tag with attributes" that calls parse_markup(~s'<foo-bar class="x">') and asserts it returns [start_tag: {"foo-bar", [{"class", [text: "x"]}]}], and a test named "hyphenated element start and end tags" that calls parse_markup("<foo-bar></foo-bar>") and asserts it returns [start_tag: {"foo-bar", []}, end_tag: "foo-bar"]; these ensure parse_markup's prev_status guard transitions and end-tag matching are exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/elixir/hologram/template/parser_test.exs`:
- Around line 114-161: Add two tests to parser_test.exs to cover hyphenated tags
with attributes and round-trip start/end behavior: create a test named
"hyphenated element start tag with attributes" that calls
parse_markup(~s'<foo-bar class="x">') and asserts it returns [start_tag:
{"foo-bar", [{"class", [text: "x"]}]}], and a test named "hyphenated element
start and end tags" that calls parse_markup("<foo-bar></foo-bar>") and asserts
it returns [start_tag: {"foo-bar", []}, end_tag: "foo-bar"]; these ensure
parse_markup's prev_status guard transitions and end-tag matching are exercised.
|
Let me know if you want me to implement the test suggested by coderabbit. I'm not sure if it would technically be a duplicate or not based on how the parser works. |
|
The first one is beneficial IMO, since we're parsing two tokens at once here and that's a separate code path. The second one not so much, kind of redundant. |
9dd1fc8 to
5f45c32
Compare
|
Implemented the first test. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/elixir/hologram/template/parser_test.exs (2)
127-129: Nit: Rename "multiple simultaneous hyphenated" to "consecutive-hyphen" (or "double-hyphen")."Multiple simultaneous" is ambiguous — it reads as if multiple hyphens appear in separate positions rather than consecutively adjacent. "Consecutive-hyphen" or "double-hyphen" conveys the intent at a glance.
♻️ Suggested renames
- test "multiple simultaneous hyphenated element start tag" do + test "consecutive-hyphen element start tag" do - test "multiple simultaneous hyphenated element self-closed start tag" do + test "consecutive-hyphen element self-closed start tag" do - test "multiple simultaneous hyphenated element end tag" do + test "consecutive-hyphen element end tag" doAlso applies to: 149-151, 165-167
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 127 - 129, Rename ambiguous test names that use "multiple simultaneous hyphenated" to clearer terms like "consecutive-hyphenated" (or "double-hyphenated") so intent is obvious; specifically update the test description "multiple simultaneous hyphenated element start tag" and the other two tests with the same phrasing to e.g. "consecutive-hyphenated element start tag" (and similar for the end-tag and combined cases) while keeping the test bodies and calls to parse_markup unchanged.
115-167: Consider adding a negative test for the leading-hyphen constraint.The PR discussion explicitly notes that
<-foo>is invalid. Without a corresponding error-case test, a future refactor of the hyphen-handlingparse_tokensclauses could accidentally start accepting leading-hyphen tags without any failing test catching it.💡 Suggested addition in
describe "template syntax errors"+ test "element tag starting with a hyphen" do + expected_msg = + normalize_newlines(""" + Reason: + ... + """) + + test_syntax_error_msg("<-foo>", expected_msg) + end(Fill in the actual error message produced by the parser for this input.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 115 - 167, Add a negative test that ensures tags with a leading hyphen are rejected (e.g. calling parse_markup("<-foo>") should raise/return the parser's error for invalid tag names); place it in the "describe \"template syntax errors\"" group and assert the exact error message/output produced by the parser for this input so future changes to parse_tokens or parse_markup that accidentally accept leading-hyphen tags will fail the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/elixir/hologram/template/parser_test.exs`:
- Around line 127-129: Rename ambiguous test names that use "multiple
simultaneous hyphenated" to clearer terms like "consecutive-hyphenated" (or
"double-hyphenated") so intent is obvious; specifically update the test
description "multiple simultaneous hyphenated element start tag" and the other
two tests with the same phrasing to e.g. "consecutive-hyphenated element start
tag" (and similar for the end-tag and combined cases) while keeping the test
bodies and calls to parse_markup unchanged.
- Around line 115-167: Add a negative test that ensures tags with a leading
hyphen are rejected (e.g. calling parse_markup("<-foo>") should raise/return the
parser's error for invalid tag names); place it in the "describe \"template
syntax errors\"" group and assert the exact error message/output produced by the
parser for this input so future changes to parse_tokens or parse_markup that
accidentally accept leading-hyphen tags will fail the test.
5f45c32 to
d55332f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/elixir/hologram/template/parser_test.exs (2)
133-135: Inconsistent test name: "simultaneous" vs "consecutive" for the samefoo--barpattern.Lines 133 and 171 use "multiple simultaneous hyphenated element", while line 155 uses "multiple consecutive hyphenated element" — all three describe the identical
foo--barinput. Pick one term and apply it consistently.🔧 Proposed fix (using "consecutive" throughout)
- test "multiple simultaneous hyphenated element start tag" do + test "multiple consecutive hyphenated element start tag" do assert parse_markup("<foo--bar>") == [start_tag: {"foo--bar", []}] end- test "multiple simultaneous hyphenated element end tag" do + test "multiple consecutive hyphenated element end tag" do assert parse_markup("</foo--bar>") == [end_tag: "foo--bar"] endAlso applies to: 155-157, 171-173
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 133 - 135, Update the three test names that describe the same case to use a single consistent term (either "consecutive" or "simultaneous"); locate the tests that call parse_markup("<foo--bar>") (the test blocks with descriptions "multiple simultaneous hyphenated element start tag" at lines around 133 and 171 and "multiple consecutive hyphenated element" around 155) and rename the test descriptions so they all match (e.g., change both "multiple simultaneous hyphenated element..." occurrences to "multiple consecutive hyphenated element..." or vice versa) while leaving the assertions and the use of parse_markup("<foo--bar>") unchanged.
114-173: Consider adding whitespace-after-name tests for hyphenated tags.The existing suite covers whitespace after tag names for plain tags (lines 103–113:
<div \n\r\t>,</div \n\r\t>). No equivalent tests exist for hyphenated tags, leaving coverage for parser state transitions like<foo-bar \n\r\t>and</foo-bar \n\r\t>implicit.test "whitespaces after hyphenated element start tag name" do assert parse_markup("<foo-bar \n\r\t>") == [start_tag: {"foo-bar", []}] end test "whitespaces after hyphenated element end tag name" do assert parse_markup("</foo-bar \n\r\t>") == [end_tag: "foo-bar"] end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 114 - 173, Add two tests to cover whitespace-after-name for hyphenated tags: create "whitespaces after hyphenated element start tag name" asserting parse_markup("<foo-bar \n\r\t>") == [start_tag: {"foo-bar", []}] and "whitespaces after hyphenated element end tag name" asserting parse_markup("</foo-bar \n\r\t>") == [end_tag: "foo-bar"]; place them alongside the other hyphenated tag tests so they exercise the same parser paths (refer to existing tests that call parse_markup like "hyphenated element start tag" and "hyphenated element end tag").
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/elixir/hologram/template/parser_test.exs`:
- Around line 127-131: Replace the fully qualified exception name
Hologram.TemplateSyntaxError with the in-scope alias TemplateSyntaxError in the
test that calls parse_markup; update the assert_raise invocation to use
TemplateSyntaxError so it matches the other tests in this file (look for the
test "element name cannot start with a hyphen" and the parse_markup call).
---
Nitpick comments:
In `@test/elixir/hologram/template/parser_test.exs`:
- Around line 133-135: Update the three test names that describe the same case
to use a single consistent term (either "consecutive" or "simultaneous"); locate
the tests that call parse_markup("<foo--bar>") (the test blocks with
descriptions "multiple simultaneous hyphenated element start tag" at lines
around 133 and 171 and "multiple consecutive hyphenated element" around 155) and
rename the test descriptions so they all match (e.g., change both "multiple
simultaneous hyphenated element..." occurrences to "multiple consecutive
hyphenated element..." or vice versa) while leaving the assertions and the use
of parse_markup("<foo--bar>") unchanged.
- Around line 114-173: Add two tests to cover whitespace-after-name for
hyphenated tags: create "whitespaces after hyphenated element start tag name"
asserting parse_markup("<foo-bar \n\r\t>") == [start_tag: {"foo-bar", []}] and
"whitespaces after hyphenated element end tag name" asserting
parse_markup("</foo-bar \n\r\t>") == [end_tag: "foo-bar"]; place them alongside
the other hyphenated tag tests so they exercise the same parser paths (refer to
existing tests that call parse_markup like "hyphenated element start tag" and
"hyphenated element end tag").
d55332f to
7d80597
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
test/elixir/hologram/template/parser_test.exs (1)
115-173: LGTM — consider adding a<script-*>round-trip testThe new tests comprehensively cover all the hyphenated-tag cases. One scenario worth adding is a custom element whose name starts with
"script":test "hyphenated element starting with 'script' does not activate script mode" do assert parse_markup("<script-element>{`@var`}</script-element>") == [ start_tag: {"script-element", []}, expression: "{`@var`}"}, end_tag: "script-element" ] endThis would confirm the parser-level bug described in the comment below is fixed before closing the PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/elixir/hologram/template/parser_test.exs` around lines 115 - 173, Add a test in test/elixir/hologram/template/parser_test.exs that verifies tags whose names start with "script" do not activate script mode: create a test named like "hyphenated element starting with 'script' does not activate script mode" that calls parse_markup on "<script-element>{`@var`}</script-element>" and asserts the parsed result is a sequence of start_tag: {"script-element", []}, expression: "{`@var`}", and end_tag: "script-element" so the parser correctly treats "script-element" as a normal hyphenated element rather than entering script parsing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/hologram/template/parser.ex`:
- Around line 503-522: The two parse_tokens clauses that handle hyphen
continuations for :start_tag_name must call
maybe_disable_script_mode(context.tag_name) before updating the tag name so
script mode is turned off for names that were only partially "script"; insert a
call to maybe_disable_script_mode(context.tag_name) at the top of both clauses
(the ones that match [{:string, str} = token2 | rest] and [{:symbol, "-"} =
token | rest]) immediately before calling set_tag_name, leaving the subsequent
add_processed_token, set_prev_status(:start_tag_name) and recursive
parse_tokens(:start_tag, rest) calls unchanged.
- Around line 269-284: The end-tag hyphen continuation clauses in parse_tokens
(the two def parse_tokens(..., :end_tag, ...) clauses) currently run after
maybe_disable_script_mode("script") was called in :end_tag_name and thus can
leave script mode disabled incorrectly; update both of these :end_tag handlers
to call maybe_enable_script_mode(context.tag_name) after you extend
context.tag_name (i.e., after set_tag_name(...) completes) so that if the
accumulated prefix is "script" script mode is re-enabled before continuing
parsing, then proceed with add_processed_token(...) and recursive
parse_tokens(:end_tag, rest).
---
Nitpick comments:
In `@test/elixir/hologram/template/parser_test.exs`:
- Around line 115-173: Add a test in
test/elixir/hologram/template/parser_test.exs that verifies tags whose names
start with "script" do not activate script mode: create a test named like
"hyphenated element starting with 'script' does not activate script mode" that
calls parse_markup on "<script-element>{`@var`}</script-element>" and asserts the
parsed result is a sequence of start_tag: {"script-element", []}, expression:
"{`@var`}", and end_tag: "script-element" so the parser correctly treats
"script-element" as a normal hyphenated element rather than entering script
parsing.
| def parse_tokens(context, :end_tag, [ | ||
| {:symbol, "-"} = token1 | [{:string, str} = token2 | rest] | ||
| ]) do | ||
| context | ||
| |> set_tag_name(context.tag_name <> "-" <> str) | ||
| |> add_processed_token(token1) | ||
| |> add_processed_token(token2) | ||
| |> parse_tokens(:end_tag, rest) | ||
| end | ||
|
|
||
| def parse_tokens(context, :end_tag, [{:symbol, "-"} = token | rest]) do | ||
| context | ||
| |> set_tag_name(context.tag_name <> "-") | ||
| |> add_processed_token(token) | ||
| |> parse_tokens(:end_tag, rest) | ||
| end |
There was a problem hiding this comment.
Symmetric script-mode issue in the end-tag hyphen handlers
maybe_disable_script_mode("script") is invoked in :end_tag_name (line 256) before the hyphen continuation runs. This means </script-element> prematurely exits script mode — which would produce incorrect parsing if encountered inside a real <script> block (unlikely in practice, but the same root cause applies).
The mirror fix is to call maybe_enable_script_mode(context.tag_name) in both new :end_tag clauses to undo the premature deactivation when the prefix was "script".
🐛 Proposed fix
def parse_tokens(context, :end_tag, [
{:symbol, "-"} = token1 | [{:string, str} = token2 | rest]
]) do
context
+ |> maybe_enable_script_mode(context.tag_name)
|> set_tag_name(context.tag_name <> "-" <> str)
|> add_processed_token(token1)
|> add_processed_token(token2)
|> parse_tokens(:end_tag, rest)
end
def parse_tokens(context, :end_tag, [{:symbol, "-"} = token | rest]) do
context
+ |> maybe_enable_script_mode(context.tag_name)
|> set_tag_name(context.tag_name <> "-")
|> add_processed_token(token)
|> parse_tokens(:end_tag, rest)
end📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def parse_tokens(context, :end_tag, [ | |
| {:symbol, "-"} = token1 | [{:string, str} = token2 | rest] | |
| ]) do | |
| context | |
| |> set_tag_name(context.tag_name <> "-" <> str) | |
| |> add_processed_token(token1) | |
| |> add_processed_token(token2) | |
| |> parse_tokens(:end_tag, rest) | |
| end | |
| def parse_tokens(context, :end_tag, [{:symbol, "-"} = token | rest]) do | |
| context | |
| |> set_tag_name(context.tag_name <> "-") | |
| |> add_processed_token(token) | |
| |> parse_tokens(:end_tag, rest) | |
| end | |
| def parse_tokens(context, :end_tag, [ | |
| {:symbol, "-"} = token1 | [{:string, str} = token2 | rest] | |
| ]) do | |
| context | |
| |> maybe_enable_script_mode(context.tag_name) | |
| |> set_tag_name(context.tag_name <> "-" <> str) | |
| |> add_processed_token(token1) | |
| |> add_processed_token(token2) | |
| |> parse_tokens(:end_tag, rest) | |
| end | |
| def parse_tokens(context, :end_tag, [{:symbol, "-"} = token | rest]) do | |
| context | |
| |> maybe_enable_script_mode(context.tag_name) | |
| |> set_tag_name(context.tag_name <> "-") | |
| |> add_processed_token(token) | |
| |> parse_tokens(:end_tag, rest) | |
| end |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/hologram/template/parser.ex` around lines 269 - 284, The end-tag hyphen
continuation clauses in parse_tokens (the two def parse_tokens(..., :end_tag,
...) clauses) currently run after maybe_disable_script_mode("script") was called
in :end_tag_name and thus can leave script mode disabled incorrectly; update
both of these :end_tag handlers to call
maybe_enable_script_mode(context.tag_name) after you extend context.tag_name
(i.e., after set_tag_name(...) completes) so that if the accumulated prefix is
"script" script mode is re-enabled before continuing parsing, then proceed with
add_processed_token(...) and recursive parse_tokens(:end_tag, rest).
Standard HTML elements do not use hyphens in their names, but custom elements (Web Components) require them, e.g., <my-widget></my-widget>. Web Components names MUST start with a lower case letter of the alphabet (a-z). So the case of having a name such as <-foo> would be invalid.
7d80597 to
aa31dd2
Compare
| def parse_tokens(context, :start_tag, [{:symbol, "/>"} = token | rest]) do | ||
| parse_start_tag_end(context, token, rest, true) | ||
| context | ||
| |> maybe_enable_script_mode(context.tag_name) |
There was a problem hiding this comment.
Isn't maybe_enable_script_mode/2 call redundant for self-closed cases?
There was a problem hiding this comment.
You are absolutely right.
| |> parse_start_tag_end(token, rest, false) | ||
| end | ||
|
|
||
| def parse_tokens(%{prev_status: :start_tag_name} = context, :start_tag, [ |
There was a problem hiding this comment.
Could you move this 2 to # --- START TAG NAME --- section?
| end | ||
|
|
||
| test "element name cannot start with a hyphen" do | ||
| assert_raise TemplateSyntaxError, fn -> |
There was a problem hiding this comment.
Could you test the error message?
| context | ||
| |> reset_attributes() | ||
| |> set_tag_name(tag_name) | ||
| |> maybe_enable_script_mode(tag_name) |
There was a problem hiding this comment.
Moving maybe_enable_script_mode from :start_tag_name to the :start_tag ">" / "/>" handlers misses another path that closes start tags:
def parse_tokens(context, :attribute_name, [{:symbol, ">"} = token | rest]) do
...
<script defer> triggers this path (boolean attribute immediately before >), so script mode is never enabled. <script defer > (with a space) still works because whitespace returns to :start_tag first.
How would a solution look that keeps setting script mode in :start_tag_name (the original placement)? That way there's a single place that handles it, instead of needing to patch every clause that can close a start tag - which is fragile if more constructs are introduced in the future.
There was a problem hiding this comment.
I am thinking something like this.
def parse_tokens(context, :start_tag_name, [{:symbol, "-"} = token | rest]) do
context
|> set_tag_name(context.tag_name <> "-")
|> add_processed_token(token)
|> parse_tokens(:start_tag_name, rest)
end
def parse_tokens(context, :start_tag_name, [{:string, tag_name} = token | rest]) do
context
|> set_tag_name(context.tag_name <> tag_name)
|> add_processed_token(token)
|> parse_tokens(:start_tag_name, rest)
end
def parse_tokens(context, :start_tag_name, tokens) do
context
|> reset_attributes()
|> maybe_enable_script_mode(context.tag_name)
|> set_prev_status(:start_tag_name)
|> parse_tokens(:start_tag, tokens)
endThis will require a slight change to the case when matching on < in order to start parsing the :start_tag_name. We would need to set_tag_name to "". There may be a better place to do this as well.
The reason for this is that the original implementation (current dev) expected the context.tag_name to have been overwritten by the single :string that makes up the name. With this change, multiple :strings can create the tag name, meaning we potentially need to concatenate multiple values to create the tag name. The fact that context.tag_name is never cleared/removed meant that something like
Hologram.Template.Parser.parse_markup("<script></script><script></script>")would end up being parsed as
[
start_tag: {"script", []},
end_tag: "script",
start_tag: {"scriptscript", []},
end_tag: "scriptscript"
]Notice how the second set of tags is scriptscript instead of the expected script.
The above implementation should allow parsing only the :strings and symbol: "-" for a tag name. After it reaches another token, such as :whitespace or a different :symbol, it will check if it should enable script mode and continue parsing as expected.
This does have the downside that there is an extra function call for each :start_tag_name because of the third function head that acts as a catch-all because I thought it would be worse to explicitly check for either :whitespace to start (potentially) checking for attributes or check for a symbol: ">" in order to close the tag. With this catch-all in place parsing will continue as it currently does.
Let me know if this approach is in line with what you were thinking or if there is another approach that you would like me to explore as well.
| test "multiple simultaneous hyphenated element end tag" do | ||
| assert parse_markup("</foo--bar>") == [end_tag: "foo--bar"] | ||
| end | ||
| end |
There was a problem hiding this comment.
Since there are quite a few hyphenated tag tests now, it might be worth extracting them into a separate describe "custom element tags" block (and renaming the current one to "standard element tags"). This would keep things organized as more cases are added.
Also a few tests I think are missing:
- Leading hyphen in end tags (
</-foo>) should raise - Round-trip:
<foo-bar>text</foo-bar> - Script mode is NOT enabled for
<script-widget>...</script-widget> - Script mode IS still enabled for
<script>with attributes, e.g.<script defer>var x = "<div>";</script> - etc.
Standard HTML elements do not use hyphens in their names, but custom elements (Web Components) require them, e.g., .
Resolves #691
Summary by CodeRabbit
New Features
Tests