Skip to content

Comments

Provide name, public and system id in Doctype event#57

Merged
kornelski merged 1 commit intokornelski:mainfrom
drott:doctypeParts
Nov 15, 2025
Merged

Provide name, public and system id in Doctype event#57
kornelski merged 1 commit intokornelski:mainfrom
drott:doctypeParts

Conversation

@drott
Copy link

@drott drott commented Nov 6, 2025

Improve Doctype parsing to distinguish internal subset state and process public and system id, processing them into fields passed to the Doctype event.

Introducing extra states for system and public parts of ExternalId.

Moving parts of what used to be process in Doctype substate "Outside" to substate "InternalSubset" where it fits better.

Passes three additional tests from xmltest.xml that deal with Doctype validity.

Fixes #55.

Improve Doctype parsing to distinguish internal subset state and process
public and system id, processing them into fields passed to the Doctype
event.

Introducing extra states for system and public parts of ExternalId.

Moving parts of what used to be process in Doctype substate "Outside" to
substate "InternalSubset" where it fits better.

Passes three additional tests from xmltest.xml that deal with Doctype
validity.

Fixes kornelski#55.
@drott
Copy link
Author

drott commented Nov 11, 2025

Friendly ping @kornelski , would you be ready to consider an API change as proposed here? And combine this with #56?

@kornelski
Copy link
Owner

Thanks for the PR. The code looks good. Unfortunately, I've made a mistake of exposing the DOCTYPE as a String in the public API, so this change will require a semver-major bump. That's unfortunate after I've just released v1. I'll have to think whether I can retrofit it backwards-compatibly.

@drott
Copy link
Author

drott commented Nov 15, 2025

IMO one acceptable option is to leave the syntax event property intact, and just add the new ones in parallel. This would require a major bump, but is practically then relatively easy to upgrade to for clients. They could just ignore the new separately parsed fields.

@kornelski kornelski merged commit 7da7012 into kornelski:main Nov 15, 2025
2 checks passed
@kornelski
Copy link
Owner

I've exposed it via doctype_ids() method.

The project is in the middle of switching from xml-rs crate to the xml crate, so I don't want to cause any more churn on top of that until most users migrate over.

aarongable pushed a commit to chromium/chromium that referenced this pull request Nov 20, 2025
Improved DOCTYPE parsing, string conversion for XMLVersion
and allowing documents with version 1.x.

Covers Googler pull requests 54, 56, 57:
kornelski/xml-rs#54
kornelski/xml-rs#56
kornelski/xml-rs#57

Changes merged in upstream but no version released yet.

Covers $ git log 1.1.0..
bd8fe07 Allow parsing of XML Version 1.x to improve conformance
18c4d62 Move assert_match macro
30227e8 Nicer comparison for OwnedName
f0e34b3 Thinner string type
aba9a85 Backwards-compatible DOCTYPE properties
7da7012 Provide name, public and system id in Doctype event
34458a2 Add `as_str()` method to `XmlVersion` `enum`.

Bug: 441911594
Change-Id: Icc58864e4393fe2725f9137f52cf92c23a5f3ba7
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7172953
Auto-Submit: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org>
Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1547919}
aarongable pushed a commit to chromium/chromium that referenced this pull request Nov 21, 2025
Use Doctype Parsing functionality from upstream [1] to replace
local workaround based on regular expressions.

[1] kornelski/xml-rs#57

Bug: 441911594
Change-Id: If49a5629bb57e17e6bfd3985e01b198e8c437bb4
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7172025
Reviewed-by: David Baron <dbaron@chromium.org>
Commit-Queue: Dominik Röttsches <drott@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1548359}
mohd-akram pushed a commit to gsource-mirror/chromium-src-third_party-rust that referenced this pull request Jan 9, 2026
Improved DOCTYPE parsing, string conversion for XMLVersion
and allowing documents with version 1.x.

Covers Googler pull requests 54, 56, 57:
kornelski/xml-rs#54
kornelski/xml-rs#56
kornelski/xml-rs#57

Changes merged in upstream but no version released yet.

Covers $ git log 1.1.0..
bd8fe07 Allow parsing of XML Version 1.x to improve conformance
18c4d62 Move assert_match macro
30227e8 Nicer comparison for OwnedName
f0e34b3 Thinner string type
aba9a85 Backwards-compatible DOCTYPE properties
7da7012 Provide name, public and system id in Doctype event
34458a2 Add `as_str()` method to `XmlVersion` `enum`.

Bug: 441911594
Change-Id: Icc58864e4393fe2725f9137f52cf92c23a5f3ba7
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7172953
Auto-Submit: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org>
Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1547919}
NOKEYCHECK=True
GitOrigin-RevId: e35fe768bc00fbe9e2c49e6f5334140eb8446b50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Include name, public id and system id in Doctype event

2 participants