This repository was archived by the owner on Jan 23, 2020. It is now read-only.
11 changes: 9 additions & 2 deletions src/context-parser.js
@@ -180,6 +180,14 @@ FastParser.prototype.walk = function(i, input, endsWithEOF) {
if(this.tags[0].toLowerCase() === this.tags[1].toLowerCase()) {
reconsume = 0; /* see 12.2.4.13 - switch state for the following case, otherwise, reconsume. */
this.matchEndTagWithStartTag(symbol);
+ /*
+ After matchEndTagWithStartTag (with start tag name == end tag name),
+ the state will transition back to DATA.
Contributor:

This is true only in the case at https://github.com/yahoo/context-parser/pull/38/files#diff-453c94db9d68bf214cb0505c0e0791b5R245

My two cents: either move this.tags[0] = ''; there, or update the comment here.

Contributor Author:

STATE_BEFORE_ATTRIBUTE_NAME and STATE_SELF_CLOSING_START_TAG would only lead to STATE_DATA when an end tag is created.

Member:

This approach seems simpler and easier to follow. Not doing a reset at https://github.com/yukinying/context-parser/blob/reset-end-tag/src/context-parser.js#L239 and https://github.com/yukinying/context-parser/blob/reset-end-tag/src/context-parser.js#L242 would mean we need another place to reset tags[0] for those cases, which seems complicated. Thoughts?

Contributor:

Whenever one of these 3 symbols (StateMachine.Symbol.SPACE = 0, StateMachine.Symbol.SOLIDUS = 6, StateMachine.Symbol.GREATER = 9) is consumed in state 13, 16, 19, or 27 (i.e. the RCDATA, RAWTEXT, and SCRIPT end tag name states), this extra logic is triggered and the parser eventually jumps back to the DATA state, so setting tags[0] = '' makes sense.

However, I am wondering whether this is the right time to reset tags[0] = ''. For example, if the input is <textarea></textarea[X]> and X is a space, then tags[0] is reset while the parser jumps to the before attribute name state. Should we instead reset on the transition to the DATA state, when StateMachine.Symbol.GREATER is encountered?

And do we cover the transition from state 9 to the DATA state?

Thoughts?

Contributor Author:

@maditya, since you were the requester for having tags[0] = '' to be done earlier for html-purifier, could you comment on @neraliu's question?


+ Thus we need to reset the start tag variable (tags[0]) back to the empty string.
+ */
+
+ this.tags[0] = "";
}
break;
case 8: this.matchEscapedScriptTag(ch); break;
@@ -200,6 +208,7 @@ FastParser.prototype.walk = function(i, input, endsWithEOF) {
FastParser.prototype.createStartTag = function (ch) {
this.tagIdx = 0;
this.tags[0] = ch;
+ this.tags[1] = '';
};

FastParser.prototype.createEndTag = function (ch) {
@@ -225,8 +234,6 @@ FastParser.prototype.matchEndTagWithStartTag = function (symbol) {
GREATER-THAN SIGN (>): If the current end tag token is an appropriate end tag token, then switch to the data state and emit the current tag token.
Otherwise, treat it as per the 'anything else' entry below.
*/
- this.tags[0] = '';
- this.tags[1] = '';

switch (symbol) {
case stateMachine.Symbol.SPACE: /** Whitespaces */
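The reset-timing question raised in the review thread can be illustrated with a small standalone sketch. It is not the library's code: the symbol values are the ones quoted in the thread, while the `Sym` and `State` objects, the string state labels, and the `consume` helper are hypothetical names introduced only for this illustration. The sketch assumes the reset is reached only for the three symbols listed in the thread, and compares resetting tags[0] on any matching end tag with resetting it only when the next state is DATA.

```javascript
// Symbol codes as quoted in the review thread.
var Sym = { SPACE: 0, SOLIDUS: 6, GREATER: 9 };

// Hypothetical string labels; the real parser uses numeric states.
var State = {
  DATA: 'data',
  BEFORE_ATTRIBUTE_NAME: 'before-attribute-name',
  SELF_CLOSING_START_TAG: 'self-closing-start-tag'
};

// Next state once an appropriate end tag name has been read.
// Returns null for the "anything else" case.
function nextStateAfterEndTagName(symbol) {
  switch (symbol) {
    case Sym.SPACE: return State.BEFORE_ATTRIBUTE_NAME;
    case Sym.SOLIDUS: return State.SELF_CLOSING_START_TAG;
    case Sym.GREATER: return State.DATA;
    default: return null;
  }
}

// Toy version of the walk step (assumption, not the actual FastParser.walk).
// resetOnlyOnData = false: reset tags[0] on any of the three symbols.
// resetOnlyOnData = true: reset tags[0] only when the next state is DATA.
function consume(tags, symbol, resetOnlyOnData) {
  var next = null;
  if (tags[0].toLowerCase() === tags[1].toLowerCase()) {
    next = nextStateAfterEndTagName(symbol);
    if (next !== null && (!resetOnlyOnData || next === State.DATA)) {
      tags[0] = '';
    }
  }
  return next;
}

// Input "<textarea></textarea " (end tag name followed by a space).
var eager = ['textarea', 'textarea'];
console.log(consume(eager, Sym.SPACE, false), JSON.stringify(eager[0]));
// prints: before-attribute-name ""

var lazy = ['textarea', 'textarea'];
console.log(consume(lazy, Sym.SPACE, true), JSON.stringify(lazy[0]));
// prints: before-attribute-name "textarea"
```

In this model both variants end with tags[0] cleared once `>` is consumed; they differ only in what tags[0] holds while the parser is still inside the end tag.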
4 changes: 2 additions & 2 deletions tests/unit/run-functions-spec.js
@@ -222,8 +222,8 @@ Authors: Nera Liu <neraliu@yahoo-inc.com>

[ { html: "<div class='classname' style='color:red'></div>", tag0: 'div', tag1: 'div', index: 1},
{ html: "<div class='classname' style='color:red'></div> ", tag0: 'div', tag1: 'div', index: 1},
- { html: "<div class='classname' style='color:red'></div><img>", tag0: 'img', tag1: 'div', index: 0},
- { html: "<div class='classname' style='color:red'></div><img ", tag0: 'img', tag1: 'div', index: 0},
+ { html: "<div class='classname' style='color:red'></div><img>", tag0: 'img', tag1: '', index: 0},
+ { html: "<div class='classname' style='color:red'></div><img ", tag0: 'img', tag1: '', index: 0},
{ html: "<div class='classname' style='color:red'></div><img></im", tag0: 'img', tag1: 'im', index: 1},
{ html: "<div class='classname' style='color:red'></div><img></img ",tag0: 'img', tag1: 'img', index: 1},
].forEach(function(testObj) {
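The changed expectations in the test table (tag1 going from 'div' to '' after a new start tag) appear to follow from createStartTag now clearing tags[1]. Below is a minimal toy model of that bookkeeping, not the library's FastParser. The `TagTracker` constructor, its `clearEndTagOnStart` flag, the `appendToTag` helper, and the `run` driver are hypothetical, and the body of createEndTag is an assumption because the diff does not show it.

```javascript
// Toy model of the tag bookkeeping; not the library's FastParser.
function TagTracker(clearEndTagOnStart) {
  this.tags = ['', ''];  // [0] = latest start tag name, [1] = latest end tag name
  this.tagIdx = 0;       // slot currently being written
  this.clearEndTagOnStart = clearEndTagOnStart;
}

// Mirrors createStartTag from the diff; the flag toggles the added line.
TagTracker.prototype.createStartTag = function (ch) {
  this.tagIdx = 0;
  this.tags[0] = ch;
  if (this.clearEndTagOnStart) {
    this.tags[1] = '';
  }
};

// Assumed shape: the diff does not show createEndTag's body.
TagTracker.prototype.createEndTag = function (ch) {
  this.tagIdx = 1;
  this.tags[1] = ch;
};

// Hypothetical helper: append one more tag-name character to the active slot.
TagTracker.prototype.appendToTag = function (ch) {
  this.tags[this.tagIdx] += ch;
};

// Feed pre-tokenized tag names, e.g. ['div', '/div', 'img'];
// a leading '/' marks an end tag.
function run(tracker, names) {
  names.forEach(function (name) {
    var isEnd = name.charAt(0) === '/';
    var chars = (isEnd ? name.slice(1) : name).split('');
    if (isEnd) {
      tracker.createEndTag(chars[0]);
    } else {
      tracker.createStartTag(chars[0]);
    }
    chars.slice(1).forEach(function (ch) { tracker.appendToTag(ch); });
  });
  return { tag0: tracker.tags[0], tag1: tracker.tags[1], index: tracker.tagIdx };
}

var before = run(new TagTracker(false), ['div', '/div', 'img']);
var after = run(new TagTracker(true), ['div', '/div', 'img']);
console.log(before); // { tag0: 'img', tag1: 'div', index: 0 }
console.log(after);  // { tag0: 'img', tag1: '', index: 0 }
```

With the flag off, the stale end tag name 'div' lingers after `<img>` is opened, which matches the removed expectations; with it on, tag1 is empty, which matches the added ones.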